The Development Lifecycle
Methodologies broadly divide into those that are stage-based (e.g. TOVE [21]) and those that rely on
iterative evolving prototypes (e.g. Methontology [25]). These are in fact complementary techniques.
Most distinguish between an informal stage, where the ontology is sketched out using either natural language descriptions or some diagram technique, and a formal stage,
where the ontology is encoded in a formal knowledge representation language that is
machine computable. As an ontology should ideally be both communicated to people and unambiguously
interpreted by software, the informal representation serves the former and the formal the latter.
Figures 1 and 2 represent a skeletal
methodology and life-cycle for building ontologies, inspired by the software
engineering V-process model [26]. The left side of the V charts the
processes in building an ontology and the right side charts the guidelines,
principles and evaluation used to 'quality assure' the ontology.
The overall process, however, moves through a life-cycle, as depicted in
Figure 2.
Figure 1: The V-model inspired methodology for building ontologies.
Figure 2: The ontology building life-cycle.
The stages in the V-process model and life-cycle are:
- Identify purpose and scope:
- developing a requirements specification
for the ontology by identifying the intended scope and purpose of the ontology.
A well-characterised requirements specification is important
to the design, evaluation and re-use of an ontology. It can be seen from
Section 4 that the use to which an ontology is put has a great effect on
the content and style of that ontology.
- Knowledge Acquisition:
- the process of acquiring domain knowledge from
which the ontology will be built. Sources span the complete range of knowledge
holders: specialist biologists, database metadata, standard textbooks, research papers and other
ontologies. Motivating scenarios are collected and informal competency questions formed
[21]; these are informal questions that the ontology must be
able to answer, and they will be used to check that the ontology is fit for purpose.
The EcoCyc and RiboWeb ontologies had the bulk of their knowledge
gathered from the research literature on E. coli metabolism and ribosomal
structure respectively. In the former case this was a huge volume of
material, which took many years to process. The TaO, being built to query
databases, extracted a large part of its knowledge from database documentation.
Standard texts also contributed to the knowledge of core molecular biology.
- Conceptualisation:
- identifying the key concepts
that exist in the domain, their properties and the relationships that hold
between them; identifying natural language terms to refer to such concepts,
relations and attributes; and structuring domain knowledge into explicit conceptual
models. This is the process touched upon in Section 2, where the
concepts and relationships describing the domain are captured. The ontology is
usually described using some informal terminology.
Gruber [6] suggests writing lists of the concepts to be contained
within the ontology and exploring other ontologies to re-use all or part of
their conceptualisations and terminologies. At this stage it is important to
bear the results of the first step, that of requirements gathering, in mind.
- Integrating:
- using or specialising an existing
ontology, a task frequently hindered by the inadequate documentation of existing
ontologies, notably their implicit assumptions. Using a generic
ontology, such as MBO or those in [27,28], gives a deeper definition
of the concepts in the chosen domain.
- Encoding:
- representing the conceptualisation in
some formal language, e.g. frames, object models or logic. This includes the
creation of formal competency questions in terms of the terminological specification
language chosen (usually first-order logic). The representation of
ontologies is explored further below.
- Documentation:
- complete informal and formal definitions, assumptions and
examples are essential to promote the appropriate use and re-use of an ontology.
Documentation is important for defining, more expansively than is possible
within the ontology itself, the exact meaning of its terms.
- Evaluation:
- determining the appropriateness of an ontology for its
intended application. Evaluation is done pragmatically, by assessing the
competency of the ontology to satisfy the requirements of its application,
including determining the consistency, completeness and conciseness of an
ontology [25]. Conciseness implies an absence of redundancy in
the definitions of an ontology and an appropriate granularity. For example, an
ontology that modelled protein molecules at atomic resolution when the amino
acid level would suffice would not be considered concise.
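
The encoding and evaluation stages above can be illustrated with a small sketch. It is not taken from any of the ontologies or methodologies cited here: the frame-like `Frame` and `Ontology` classes, the toy concepts (`Protein`, `Enzyme` and so on) and the two competency questions are all hypothetical, chosen only for illustration. Plain Python predicates stand in for formal competency questions, which would more usually be written in first-order logic, and the conciseness check covers only redundant definitions, not granularity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """A frame-style concept: a name, an optional is-a parent, and slots."""
    name: str
    parent: Optional[str] = None
    slots: dict = field(default_factory=dict)  # relation name -> filler concept


class Ontology:
    """A toy container for frames (hypothetical, for illustration only)."""

    def __init__(self):
        self.frames = {}

    def add(self, name, parent=None, **slots):
        self.frames[name] = Frame(name, parent, dict(slots))

    def ancestors(self, name):
        """Follow the is-a chain upwards, nearest ancestor first."""
        found = []
        current = self.frames[name].parent
        # Stop at the root, at an undeclared parent, or if a cycle is met.
        while current in self.frames and current not in found:
            found.append(current)
            current = self.frames[current].parent
        return found

    def is_a(self, name, other):
        return name == other or other in self.ancestors(name)

    def slot(self, name, relation):
        """Return a slot filler, inheriting from ancestors when not set locally."""
        for concept in [name, *self.ancestors(name)]:
            if relation in self.frames[concept].slots:
                return self.frames[concept].slots[relation]
        return None

    def undeclared(self):
        """Names used as a parent or slot filler but never declared as frames."""
        used = set()
        for frame in self.frames.values():
            if frame.parent is not None:
                used.add(frame.parent)
            used.update(frame.slots.values())
        return sorted(used - set(self.frames))

    def redundant_slots(self):
        """(concept, relation) pairs that restate the value already inherited."""
        redundant = []
        for frame in self.frames.values():
            for relation, filler in frame.slots.items():
                inherited = next(
                    (self.frames[a].slots[relation]
                     for a in self.ancestors(frame.name)
                     if relation in self.frames[a].slots),
                    None,
                )
                if inherited == filler:
                    redundant.append((frame.name, relation))
        return redundant


# Encoding: a hypothetical conceptualisation written down as frames.
onto = Ontology()
onto.add("AminoAcid")
onto.add("Reaction")
onto.add("Macromolecule")
onto.add("Protein", parent="Macromolecule", composed_of="AminoAcid")
onto.add("Enzyme", parent="Protein", catalyses="Reaction")

# Evaluation: competency questions expressed as predicates over the ontology.
competency_questions = {
    "Is every enzyme a macromolecule?":
        lambda o: o.is_a("Enzyme", "Macromolecule"),
    "Is an enzyme composed of amino acids?":
        lambda o: o.slot("Enzyme", "composed_of") == "AminoAcid",
}

failed = [q for q, check in competency_questions.items() if not check(onto)]
print("unanswered competency questions:", failed)
print("undeclared names:", onto.undeclared())
print("redundant slots:", onto.redundant_slots())
```

In this sketch a failed competency question points to missing content, an undeclared name to an inconsistency between definitions, and a redundant slot to a lapse in conciseness. A real evaluation would depend on the requirements specification and on the reasoning services of the representation language chosen.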
Robert Stevens
2001-07-19