DAML+OIL+DT: A Datatype Extension to DAML+OIL

Feedback to www-rdf-logic, please.

DAML+OIL+DT version (revision 4.1): Ian Horrocks, Frank van Harmelen and Peter Patel-Schneider, editors.

The idea behind DAML+OIL+DT is to extend DAML+OIL with arbitrary datatypes from the XML Schema type system (http://www.w3.org/TR/xmlschema-2/#typesystem), while still retaining the desirable properties of the ontology language, in particular its (relative) simplicity and its well-defined semantics. This is achieved by maintaining a clear separation between instances of "abstract" classes (those defined using our ontology language) and instances of datatypes (defined using the XML Schema type system). In particular, it is assumed that the domain of interpretation of abstract classes is disjoint from the domain of interpretation of datatypes, so that an instance of an abstract class (e.g., Leo the Lion) can never have the same interpretation as a value of a datatype (e.g., the integer 5), and that the set of abstract properties (which map individuals to individuals) is disjoint from the set of datatype properties (which map individuals to datatype values).

The disjointness of abstract and datatype domains is motivated by both philosophical and pragmatic considerations:

Datatypes are considered to be already sufficiently structured by the built-in predicates; therefore, it is not appropriate to form new classes of datatype values using the ontology language.
The simplicity and compactness of the ontology language are not compromised; even enumerating all the XML Schema datatypes would add greatly to its complexity, while adding a theory for each datatype, even if it were possible, would lead to a language of monumental proportions.
The semantic integrity of the language is not compromised; defining theories for all the XML Schema datatypes would be difficult or impossible without extending the language in directions whose semantics may be difficult to capture in the existing framework.
The "implementability" of the language is not compromised; applications (including reasoners) can simply exploit the services of an XML Schema type checker/validator (assuming that such a component exists, or soon will exist).

As already stated, the disjointness of the datatype and abstract domains means that no object can ever be an instance of both an abstract class and a datatype. However, instances of an abstract class can be associated with instances of datatypes (data values) via datatype properties. For example, Leo the Lion may be associated with the integer value 5 via the age property. Moreover, we have extended the language so that we can constrain the cardinality and range of datatype properties (both globally and locally) with datatypes. This allows us, for example, to assert that all Animals are associated with a nonNegativeInteger via the age property.
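
The Leo example can be sketched in a few lines of Python. This is only an illustration, not part of the proposal: the Individual class, the RESTRICTIONS table and the inconsistencies function are hypothetical names, and the check models a local range restriction on a datatype property as a simple consistency test over asserted values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Individual:
    """An object in the abstract domain (never a data value)."""
    name: str
    classes: FrozenSet[str]


def is_non_negative_integer(value: object) -> bool:
    """Membership test standing in for the nonNegativeInteger datatype."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


# Hypothetical local range restrictions: (abstract class, datatype property) -> datatype test.
RESTRICTIONS: Dict[Tuple[str, str], Callable[[object], bool]] = {
    ("Animal", "age"): is_non_negative_integer,
}

Assertion = Tuple[Individual, str, object]


def inconsistencies(assertions: List[Assertion]) -> List[Tuple[str, str, object]]:
    """Return the asserted data values that fall outside a restricted range."""
    found = []
    for subject, prop, value in assertions:
        for (cls, restricted_prop), in_range in RESTRICTIONS.items():
            if prop == restricted_prop and cls in subject.classes and not in_range(value):
                found.append((subject.name, prop, value))
    return found


leo = Individual("Leo", frozenset({"Animal", "Lion"}))
```

Under these assumptions, asserting (leo, "age", 5) is consistent, whereas (leo, "age", -3) is reported, because -3 is not in the value space of nonNegativeInteger.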

The proposal is to add new classes "daml:Class" and "daml:Datatype"; instances of the former are abstract classes while instances of the latter are datatypes. Similarly, we have the classes "daml:AbstractProperty" and "daml:DatatypeProperty", instances of which are abstract and datatype properties respectively.

From a theoretical point of view, this means that the ontology language can specify the existence of a value that is an instance of one or more datatypes. However, as data values can never be instances of abstract classes, they cannot be used to apply additional constraints to objects in the abstract domain: we cannot, for example, assert that all objects associated with the integer value 5 via the hasAge property (the inverse of the age property) must be instances of the class Giraffe. This facilitates extensibility, allowing the type system to be extended without having any impact on the abstract class (ontology) language, and vice versa. Similarly, reasoning components can be independently developed and trivially combined to give a hybrid reasoner whose properties are determined by those of the two components; in particular, the combined reasoner will be sound and complete if both components are sound and complete.

From a practical point of view, DAML+OIL implementations can choose to support some or all of the XML Schema datatypes. For supported datatypes, they can either implement their own type checker/validator or rely on some external component. Unsupported datatypes can either be trapped as an error or ignored.
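
One way an implementation might handle the two options for unsupported datatypes is sketched below. The SUPPORTED registry, the check_value function and the policy parameter are all hypothetical; the text does not prescribe any particular interface, and the two datatypes listed are merely an example subset.

```python
from typing import Callable, Dict, Optional


class UnsupportedDatatypeError(Exception):
    """Raised when a datatype is not supported and the policy is 'error'."""


# Hypothetical registry: this implementation supports only two datatypes.
SUPPORTED: Dict[str, Callable[[object], bool]] = {
    "xsd:string": lambda v: isinstance(v, str),
    "xsd:nonNegativeInteger": lambda v: (
        isinstance(v, int) and not isinstance(v, bool) and v >= 0
    ),
}


def check_value(value: object, datatype: str, policy: str = "error") -> Optional[bool]:
    """Check a value against a datatype.

    Returns True or False for supported datatypes. For unsupported ones,
    either raises (policy 'error') or returns None (policy 'ignore').
    """
    checker = SUPPORTED.get(datatype)
    if checker is None:
        if policy == "error":
            raise UnsupportedDatatypeError(datatype)
        if policy == "ignore":
            return None
        raise ValueError(f"unknown policy: {policy!r}")
    return checker(value)
```

Returning None for ignored datatypes keeps "not checked" distinct from "checked and rejected", which a caller would otherwise be unable to tell apart.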

The job of a type checker/validator is to take zero or more data values and one or more datatypes, and determine if there exists any data value that is equal to every one of the specified data values and is an instance of every one of the specified datatypes.
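
The following is a minimal sketch of such a check, under strong simplifying assumptions: datatypes are modelled only as strings or as integer ranges with optional inclusive bounds, and the Datatype class and satisfiable function are hypothetical names rather than part of XML Schema or DAML+OIL. It shows the two cases the description implies: with at least one value given, all values must be equal and lie in every datatype; with no values given, the value spaces of the datatypes must have a non-empty intersection.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Union

Value = Union[int, str]


@dataclass(frozen=True)
class Datatype:
    """Simplified datatype: 'string', or 'integer' with optional inclusive bounds."""
    kind: str
    lo: Optional[int] = None
    hi: Optional[int] = None

    def contains(self, v: Value) -> bool:
        if self.kind == "string":
            return isinstance(v, str)
        if isinstance(v, bool) or not isinstance(v, int):
            return False
        return (self.lo is None or v >= self.lo) and (self.hi is None or v <= self.hi)


INTEGER = Datatype("integer")
NON_NEGATIVE_INTEGER = Datatype("integer", lo=0)
NEGATIVE_INTEGER = Datatype("integer", hi=-1)
STRING = Datatype("string")


def satisfiable(values: Sequence[Value], datatypes: Sequence[Datatype]) -> bool:
    """Is there a data value equal to every given value and in every given datatype?"""
    if not datatypes:
        raise ValueError("at least one datatype is required")
    if values:
        first = values[0]
        # All supplied values must denote the same data value.
        if any(type(v) is not type(first) or v != first for v in values[1:]):
            return False
        return all(d.contains(first) for d in datatypes)
    # No values supplied: test whether the value spaces intersect.
    kinds = {d.kind for d in datatypes}
    if len(kinds) > 1:
        return False  # strings and integers share no values in this model
    if kinds == {"string"}:
        return True
    lows = [d.lo for d in datatypes if d.lo is not None]
    highs = [d.hi for d in datatypes if d.hi is not None]
    return not lows or not highs or max(lows) <= min(highs)
```

For instance, satisfiable([5], [INTEGER, NON_NEGATIVE_INTEGER]) holds, while satisfiable([], [NON_NEGATIVE_INTEGER, NEGATIVE_INTEGER]) does not, since no integer is both non-negative and negative. A real validator would need the full XML Schema value spaces and lexical mappings, which this sketch does not attempt.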

$Revision: 4.1$ of $Date: 2001/02/26$.