Simple Bio Upper Ontology
Alan Rector, Robert Stevens, Jeremy Rogers
and the CO-ODE and BioHealth Informatics Teams
School of Computer Science
University of Manchester
Manchester M13 9PL, England
January 2006
I An experimental simple upper ontology for biomedicine
The notes below describe the purpose of
this OWL implementation of a simple upper ontology for
biomedicine. All this is work in progress and intended to stimulate
discussion. Many users will already be in discussion groups; alternatively, mail me
at rector@cs.man.ac.uk.
Please put "BIO-ONT:" in the subject line.
II General Considerations
We factor ontology implementations into
two parts:
- The upper ontology or top ontology - which consists of
the basic abstract categories and the major relations that link
them (properties in OWL).
- The domain ontology
that contains all the domain specific concepts. It is often
useful to separate the domain ontology into the:
- Top domain ontology -
the major domain categories that hook directly to the upper ontology -
e.g. in biology "Cell", "Organism", "Body Part", "Organ", "Tissue"
etc. The Top domain ontology
usually includes further basic relations/properties and constraints on
their use, and may further constrain the use of the relations from the upper ontology.
- The domain ontology proper -
the bulk of the entities to be represented - e.g. "Blood cell",
"Mouse", "Limb", "Liver", "Liver parenchyma", etc.
We recommend implementing
domain ontologies around a skeleton
of disjoint trees of primitives using the methods for normalisation
described in
Modularisation
of domain ontologies implemented in description logics and related
formalisms including OWL.
We recommend implementing upper ontologies as a series of
dichotomies following a twenty questions model as shown below.
The purpose of these demonstrations is to provide a simple upper
ontology suitable for biomedicine and show how the
Top domain ontology for biomedicine
links to it.
III Background - an engineering approach to upper ontologies
The
simple-bio-upper-ontology
is described below. This section gives a bit of motivation.
If you just want to download the ontology, you may skip this part and
go straight to the
implementation and download.
Upper ontologies are
different from domain ontologies. A great deal of time and effort
can be expended on
upper ontologies, and some even doubt their effectiveness. We
strongly advise that an upper ontology is NOT the place to start
developing an ontology. Work out what you need in your domain
first. However, the distinctions in upper ontologies are
important to all but the simplest ontologies. Using a suitable
upper ontology can cut the time and effort to build an ontology and
avoid simple mistakes. If you plan to share an ontology, basing it on a
sound upper ontology will help avoid simple mistakes and make the
ontology more likely to be re-usable.
This upper ontology is intended to serve three purposes:
- It provides a set of basic relations and the classes to express
the constraints on their use
- It provides a starting point and helps with basic distinctions
- It provides a common vocabulary and point of attachment for the top domain ontology.
This upper ontology is meant to be light weight and easy to use.
It is tuned to biology: the biology module adds specific biological
concepts, and the advanced relations model likewise contains properties
motivated by biological examples. Five basic principles have
governed its construction:
- No distinction without a
difference - all classes in the upper ontology should be
motivated by constraints and inferences that can be made from them, in
particular most are either the domain or range of some property
(relation)
- The twenty questions approach:
Membership in classes should be determined by a series of simple,
intuitive questions
- Deferred commitment -
decisions (ontological commitments) should be deferred whenever
possible until they are actually required. Otherwise known
as "Do only what you have to".
- Copy rather than invent
- where possible, we have used names and
notions from others, particularly DOLCE.
- Implementable in OWL-DL
- the entire ontology is to be implemented in OWL-DL and subject to
inference using standard reasoners. Inevitably this means
that some constraints cannot be expressed and some notions are
under-specified.
Because the upper ontology is implemented as a series of dichotomies, a
series of yes-no questions as to which branch of the dichotomy to
follow should suffice to place each item in the top domain ontology.
Once these are placed, the rest should follow.
In principle, it ought to be possible to determine the location of any
item in the top domain ontology by a game of 'twenty questions'.
Often determining the top ontology categories for classes is sufficient
to identify which properties can hold between them. When there
are several, as in the relations between processes and things or for
describing the different ways in which one thing can be part of
another, then a second game of 'twenty questions' should be sufficient
to determine which to use.
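The 'twenty questions' idea can be sketched in a few lines of Python. This is only an illustration of the mechanism: the three questions and the category names (Process, PhysicalObject, Substance, RefiningEntity) are invented for the example and are not the dichotomies or class names used in the OWL files.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Question:
    """One dichotomy: a yes-no question with a branch for each answer."""
    text: str
    yes: Union["Question", str]  # a further question, or a category name
    no: Union["Question", str]


def place(node, answer):
    """Walk the dichotomies from the top; return (category, answers given).

    `answer` is any callable that maps a question text to True or False.
    """
    path = []
    while isinstance(node, Question):
        reply = answer(node.text)
        path.append((node.text, reply))
        node = node.yes if reply else node.no
    return node, path


# Hypothetical questions and categories, for illustration only.
Q_OCCURS = "Does it unfold over time rather than exist as a whole at any moment?"
Q_SELF = "Can it exist on its own, without depending on another entity?"
Q_DISCRETE = "Is it a discrete, countable object rather than a stuff or mixture?"

TREE = Question(
    Q_OCCURS,
    yes="Process",
    no=Question(
        Q_SELF,
        yes=Question(Q_DISCRETE, yes="PhysicalObject", no="Substance"),
        no="RefiningEntity",
    ),
)

# Placing a top domain entity such as "Liver" under the assumed answers.
liver_answers = {Q_OCCURS: False, Q_SELF: True, Q_DISCRETE: True}
category, path = place(TREE, liver_answers.__getitem__)
```

Because every branch point is a dichotomy, the recorded path doubles as documentation of why an entity was placed where it was.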
The methodology is described loosely in the papers in
V below and in
the annotations of the Simple Bio Top Ontology which follows. (A
set of teaching slides relating to earlier versions of the ontology can
also be found
here
and there are a series of
papers describing the methodology). More detailed
discussions are in preparation.
Comments and suggestions are welcome to
rector@cs.man.ac.uk.
Please put "BIO-ONT:" in the subject line. Formulating good
questions is particularly difficult, so suggestions on the questions
themselves are especially welcome.
IV Simple Bio Upper Ontology - OWL implementation and downloads
The OWL models are intended to be self documenting, with extensive
comments on most classes including definitions and the key
questions.
Unfortunately, it is not possible to control the order of presentation
of the classes, so the dichotomies that form the background have to be
recognised from the names. Otherwise, we hope that the intentions
are clear from the comments. If not, it is a good topic for
discussion.
The factored ontology is somewhat harder to handle but easier to extend
smoothly.
The factored ontology can be downloaded as a zip file from
http://www.cs.man.ac.uk/~rector/ontologies/simple-top-bio/simple-top-bio-factored.zip
The modules all load with Protege-OWL3.2 beta build 304 (for
downloading and installation instructions see
below.) In
order to make sense of them, they must be classified. We commonly
use FaCT++ or Racer 1.7, but Pellet should also work.
There is a 'boot' ontology which simply includes all other
ontologies. Alternatively, any of the ontologies should load
individually. In general it is better to load from the OWL
file. Once loaded and the imports are resolved for your machine,
save the file. Subsequent loads can be from the .pprj file.
Unresolved imports - This is not an
error! All these ontologies require an
ontology repository to be established for the folder in which they have been
unzipped. When the rather nasty pop-up appears saying
"unresolved import", click "Add Repository" and then choose
"Local folder" and navigate to the folder where you unzipped the OWL
files, normally the same folder from which you opened the main OWL
file. (Yes, we'll make this smoother Real Soon Now.)
The ontology is factored into a series of modules starting from
'very-top'.
very-top - A few very
general categories
top-self-standing
- The heart of the real top ontology
additiona-self-standing
- Some additional things that may be more controversial
refining-entities-and-properties -
the key modifiers
quantities - a very basic ontology
of quantities sufficient for demo only
basic-substances - the basic
notions of substances including water for demonstrations
vertebrate-gross-anatomy - a very
top bit of gross anatomy, almost compliant with the FMA
cells - a very basic notion
of cells to provide Red Blood Cell for demonstration with collectives
normality - the GALEN model of
normal and nonNormal, pathological and non-pathological
sequences - a demonstration of
using lists in OWL for sequences.
collectives - a demonstration of
collectives and mixtures
basic-body-substances - a
demonstration of the use of mixtures and collectives for blood in OWL
samples - samples and experiments
as shown in the PSB poster. (missing in this first release)
BEWARE. The imports mechanism is still a bit fragile. If
you edit anything, be sure you have made a copy of the entire folder
first. DO NOT change the names in any file that is imported by
another file. OWL uses only names (URIs actually) as its
references between modules. Change one, and you may break a
reference, after which the importing file will not load in
Protege. (More robust behaviour when things
aren't found and an implementation that transparently uses anonymous
IDs and names as labels are under development.)
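Since the modules refer to each other only by URI, a quick check before and after editing is to list every owl:imports target that no file in the folder declares. The sketch below is an assumption-laden helper, not part of the distribution: it assumes the modules are plain RDF/XML with a single owl:Ontology element directly under rdf:RDF, and that each ontology's URI is given either by rdf:about or, when rdf:about is empty, by xml:base. The function names are made up for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

OWL_NS = "http://www.w3.org/2002/07/owl#"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def ontology_uri_and_imports(owl_text):
    """Return (ontology URI, list of imported URIs) for one RDF/XML document."""
    root = ET.fromstring(owl_text)
    onto = root.find(f"{{{OWL_NS}}}Ontology")
    if onto is None:
        return None, []
    # An empty rdf:about means "this document"; fall back to xml:base.
    uri = onto.get(f"{{{RDF_NS}}}about") or root.get(f"{{{XML_NS}}}base")
    targets = (imp.get(f"{{{RDF_NS}}}resource")
               for imp in onto.findall(f"{{{OWL_NS}}}imports"))
    return uri, [t for t in targets if t]


def find_unresolved(documents):
    """Map file name -> imported URIs that no supplied document declares.

    `documents` maps a file name to the RDF/XML text of that file.
    """
    parsed = {name: ontology_uri_and_imports(text)
              for name, text in documents.items()}
    declared = {uri.rstrip("#") for uri, _ in parsed.values() if uri}
    report = {}
    for name, (_, imports) in parsed.items():
        missing = [i for i in imports if i.rstrip("#") not in declared]
        if missing:
            report[name] = missing
    return report


def check_folder(folder):
    """Run find_unresolved over every .owl file in a folder."""
    docs = {p.name: p.read_text(encoding="utf-8")
            for p in sorted(Path(folder).glob("*.owl"))}
    return find_unresolved(docs)
```

Calling `check_folder` on the unzipped folder should return an empty dictionary when every import can be satisfied locally; any entry names a file and the URIs it imports that no local file declares.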
An older less well commented version of the ontology can be found at
Unfactored
Mini-top-bio-with-demo-entities.
V Background and Papers describing the ontology
This ontology began as a development of the reconstruction of the GALEN
upper ontology and an attempt to reconcile it with Guarino and Welty's
DOLCE
and the Digital Anatomist Foundational Model of Anatomy (
FMA) with
considerations of Smith et al's
BFO. It
also includes the standard GALEN scheme for "Normal/NonNormal" and
"Pathological/NonPathological" and an implementation of the notion of
"Collectives" described in the Rome ontologies meeting (PowerPoint
here, draft paper here).
The journal version is about to appear and will be linked as soon as
it does.
The paper “Patterns, Properties and Minimizing Commitment:
Reconstruction of the GALEN Upper Ontology in OWL",
Alan L Rector and Jeremy Rogers, Core
Ontologies
Workshop (CORONT) in conjunction with the European Knowledge
Acquisition
Workshop (EKAW-2004), Northampton, UK is located at galen-top-reconstructed-rector-rogers.pdf.
The slides from the presentation, with a good example of the "twenty
questions" approach to placing top domain entities, are at
EKAW-GALEN-Upper-Ont-Reconstructed.ppt.
The OWL model of the ontology is intended to be self documenting, with
extensive comments on the critical sections including critical
questions.
The notion of using a Twenty Questions approach was suggested by Robert
Stevens. It resulted in the short poster and has proved popular with
students. A brief summary is given in the PSB poster available
here.
VI Downloading tools to view the ontology
Protege can be obtained from
http://protege.stanford.edu.
Be sure to download the latest beta, currently 3.2 build 304.
Imports do not work in older
versions! You are also advised to download a copy of the
Manchester Syntax Editor, Unit Testing Framework, Debugger, List
Wizard, and OwlDoc from
http://www.co-ode.org.
You will also need a copy of a classifier. You must set the port
in OWL --> Preferences --> Reasoner --> URL to the port corresponding to
the classifier installed. Racer listens on 8080, the
default. FaCT++ listens on 3490. (These can be changed as
you wish - see documentation for each reasoner.)
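If classification fails, it is worth confirming that a reasoner is actually listening on the port entered in the preferences. The following is a minimal sketch using only the Python standard library. The port numbers are the defaults stated above; the `http://localhost:<port>` URL form and the function names are assumptions made for the example.

```python
import socket

# Default ports as stated in this section; both can be reconfigured.
DEFAULT_PORTS = {"Racer": 8080, "FaCT++": 3490}


def reasoner_url(name, host="localhost"):
    """Build the URL to enter in the reasoner preferences (assumed form)."""
    return f"http://{host}:{DEFAULT_PORTS[name]}"


def is_listening(port, host="localhost", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `is_listening(DEFAULT_PORTS["FaCT++"])` returning False suggests that the reasoner has not been started or is listening on a different port. A True result only shows that some process accepts connections there, not that it is the expected reasoner.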
FaCT++ is available from
http://owl.man.ac.uk/factplusplus/.
Racer is available from
http://www.fh-wedel.de/%7Emo/racer/
Pellet is available from
http://www.mindswap.org/2003/pellet/
For OwlViz you will need a recent copy of the GraphViz tools available
from
http://www.research.att.com/sw/tools/graphviz/
This file
This file should be located at http://www.cs.man.ac.uk/~rector/ontologies/sample-top-bio/
and/or at http://www.co-ode.org/ontologies/simple-top-bio/
Acknowledgements
Many ideas in this ontology come from Guarino and Welty's DOLCE
and from Smith et
al's BFO. However,
they should not be held responsible for anything here. The work
is based on experience in developing the GALEN ontology which was largely
constructed by Jeremy
Rogers but involved contributions from many members of a large
team. The tools and teaching experience come from the CO-ODE team where many further
resources can be found. The Protege-OWL tools have been developed
in collaboration between Stanford
Medical Informatics and CO-ODE.
Protege itself was developed at Stanford and has a long history and
extensive user group - see protege.stanford.edu.