An Outline Proposal for a Concrete Data Type Extension to DAML+OIL

Ian Horrocks and Peter F. Patel-Schneider

This is only a brief outline, and is not intended as a rigorous specification.

The idea is to extend DAML+OIL with arbitrary concrete data types from the XML Schema type system (http://www.w3.org/TR/xmlschema-2/#typesystem), while still retaining the desirable properties of the ontology language, in particular its (relative) simplicity and its well-defined semantics. This is achieved by maintaining a clear separation between instances of "abstract" classes (those defined using our ontology language) and instances of concrete data types (defined using the XML Schema type system). In particular, it is assumed that the domain of interpretation of abstract classes is disjoint from the domain of interpretation of concrete data types, so that an instance of an abstract class (e.g., leo the Lion) can never have the same interpretation as a value of a concrete data type (e.g., the integer 5). It is likewise assumed that the set of abstract properties (which map individuals to individuals) is disjoint from the set of concrete properties (which map individuals to concrete values).

The disjointness of abstract and concrete domains is motivated by both philosophical and pragmatic considerations:

  1. Concrete data types are considered to be already sufficiently structured by the built-in predicates; therefore, it is not appropriate to form new classes of concrete values using the ontology language.
  2. The simplicity and compactness of the ontology language are not compromised; even enumerating all the XML Schema data types would add greatly to its complexity, while adding a theory for each data type, even if it were possible, would lead to a language of monumental proportions.
  3. The semantic integrity of the language is not compromised; defining theories for all the XML Schema data types would be difficult or impossible without extending the language in directions whose semantics may be difficult to capture in the existing framework.
  4. The "implementability" of the language is not compromised; applications (including reasoners) can simply exploit the services of an XML Schema type checker/validator (assuming that such a component exists, or soon will exist).

As already stated, the disjointness of the concrete and abstract domains means that no object can ever be an instance of both an abstract class and a concrete type. However, instances of an abstract class can be associated with instances of concrete data types (concrete data values) via concrete properties. For example, leo the Lion may be associated with the integer value 5 via the age property. Moreover, we can extend the language so that the cardinality and range of concrete properties can be constrained (both globally and locally) with concrete data types. This would allow us, for example, to assert that all Animals are associated with a nonNegativeInteger via the age property.
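
The separation described above can be illustrated with a small sketch. It is not DAML+OIL syntax and not a reasoner: it is a closed-world data check written in Python, and the names Individual, is_non_negative_integer and range_violations are hypothetical, introduced only for illustration. Abstract individuals carry class names, concrete properties map them to plain values, and the data type test is a separate function standing in for an XML Schema nonNegativeInteger check.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Individual:
    """An object in the abstract domain (hypothetical model, not DAML+OIL syntax)."""
    name: str
    classes: Set[str]
    # Concrete properties map the individual to values in the concrete domain.
    concrete: Dict[str, List[object]] = field(default_factory=dict)


def is_non_negative_integer(value: object) -> bool:
    """Stand-in for an XML Schema nonNegativeInteger membership test."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 0


def range_violations(
    individuals: List[Individual],
    cls: str,
    prop: str,
    in_datatype: Callable[[object], bool],
) -> List[str]:
    """Names of instances of cls that have a prop value outside the data type."""
    return [
        ind.name
        for ind in individuals
        if cls in ind.classes
        and any(not in_datatype(v) for v in ind.concrete.get(prop, []))
    ]


leo = Individual("leo", {"Lion", "Animal"}, {"age": [5]})
rex = Individual("rex", {"Animal"}, {"age": [-2]})  # assumed bad data for the demo

print(range_violations([leo, rex], "Animal", "age", is_non_negative_integer))
# prints ['rex']
```

Note that the data type test never inspects class membership and the class test never inspects values, which mirrors the disjointness of the two domains.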

The proposal is to add a new class "ConcreteProperty", instances of which would be concrete properties, and to extend restrictions with the addition of toDataType, hasDataType, hasDataTypeQ, hasDataValue and onConcreteProperty properties. The range of the first three properties is rdfs:literal, which is assumed to be the union of all possible types (i.e., the "top" of the type hierarchy). The range of onConcreteProperty is ConcreteProperty. It would also be possible to specify a data type as the range of a concrete property.

From a theoretical point of view, this means that the ontology language can specify the existence of a value that is an instance of one or more concrete data types. However, as concrete values can never be instances of abstract classes, they cannot be used to impose additional constraints on objects in the abstract domain: we cannot, for example, assert that all objects associated with the integer value 5 via the hasAge property (the inverse of the age property) must be instances of the class Giraffe. This facilitates extensibility, allowing the concrete type system to be extended without having any impact on the abstract class (ontology) language, and vice versa. Similarly, reasoning components can be independently developed and trivially combined to give a hybrid reasoner whose properties are determined by those of the two components; in particular, the combined reasoner will be sound and complete if both components are sound and complete.

From a practical point of view, DAML+OIL implementations can choose to support some or all of the XML Schema data types. For supported data types, they can either implement their own type checker/validator or rely on some external component. Non-supported data types can either be trapped as an error or ignored.

The job of a type checker/validator is to take zero or more data values and one or more data types (all values are at least instances of rdfs:literal), and determine whether there exists any data value that is equal to every one of the specified data values and is an instance of every one of the specified data types.
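
The checker's task can be sketched in Python under strong simplifying assumptions. Only a handful of integer-derived XML Schema types are modelled, each as a closed interval, plus rdfs:literal as the top type; the table INTEGER_RANGES and the function satisfiable are hypothetical names, and unsupported types are trapped as an error (one of the two options mentioned above). A real checker would have to cover the full XML Schema type system.

```python
import math
from typing import List, Sequence

# Assumed subset of integer-derived XML Schema types, as closed intervals.
INTEGER_RANGES = {
    "xsd:integer": (-math.inf, math.inf),
    "xsd:nonNegativeInteger": (0, math.inf),
    "xsd:positiveInteger": (1, math.inf),
    "xsd:negativeInteger": (-math.inf, -1),
    "xsd:byte": (-128, 127),
}
TOP = "rdfs:literal"  # every value is an instance of the top type


def satisfiable(values: Sequence[object], datatypes: List[str]) -> bool:
    """True if some value equals every given value and is in every given type."""
    if not datatypes:
        raise ValueError("at least one data type is required")
    unknown = [t for t in datatypes if t != TOP and t not in INTEGER_RANGES]
    if unknown:
        raise ValueError(f"unsupported data types: {unknown}")

    # Intersect the intervals of all integer-derived types that were given.
    ranges = [INTEGER_RANGES[t] for t in datatypes if t != TOP]
    lo = max((r[0] for r in ranges), default=-math.inf)
    hi = min((r[1] for r in ranges), default=math.inf)

    if not values:
        # No values given: ask only whether the types share any value at all.
        return lo <= hi

    first = values[0]
    # A single witness must equal every given value, so they must all agree.
    if any(type(v) is not type(first) or v != first for v in values[1:]):
        return False
    if not ranges:
        return True  # only the top type was requested
    return type(first) is int and lo <= first <= hi


print(satisfiable([5], ["xsd:nonNegativeInteger"]))                    # True
print(satisfiable([], ["xsd:nonNegativeInteger", "xsd:negativeInteger"]))  # False
```

The second call shows the case with zero values: the question reduces to whether the intersection of the requested types is empty.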

Resources:

  • http://www.cs.man.ac.uk/~horrocks/DAML+OIL/Datatypes/daml+oil+concrete.html
  • http://www.cs.man.ac.uk/~horrocks/DAML+OIL/Datatypes/daml+oil+concrete-ex.html
  • http://www.cs.man.ac.uk/~horrocks/DAML+OIL/Datatypes/semantics-concrete.html
