The Development Lifecycle

Methodologies broadly divide into those that are stage-based (e.g. TOVE [21]) and those that rely on iteratively evolving prototypes (e.g. Methontology [25]). These are in fact complementary techniques. Most distinguish between an informal stage, where the ontology is sketched out using either natural language descriptions or some diagramming technique, and a formal stage, where the ontology is encoded in a formal, machine-computable knowledge representation language. As an ontology should ideally be both communicated to people and unambiguously interpreted by software, the informal representation serves the former and the formal representation the latter.

Figures 1 and 2 present a skeletal methodology and life-cycle for building ontologies, inspired by the software engineering V-process model [26]. In Figure 1, the left side of the V charts the processes in building an ontology and the right side charts the guidelines, principles and evaluation used to 'quality assure' the ontology. The overall process, however, moves through a life-cycle, as depicted in Figure 2.

**Figure 1:** The V-model-inspired methodology for building ontologies.

**Figure 2:** The ontology building life-cycle.

The stages in the V-process model and life-cycle are:

Identify purpose and scope:: developing a requirements specification for the ontology by identifying its intended scope and purpose. A well-characterised requirements specification is important to the design, evaluation and re-use of an ontology. As Section 4 shows, the use to which an ontology is put has a great effect on its content and style.
Knowledge Acquisition:: the process of acquiring the domain knowledge from which the ontology will be built. Sources span the complete range of knowledge holders: specialist biologists, database metadata, standard textbooks, research papers and other ontologies. Motivating scenarios are collected and informal competency questions formed [21]; these are informal questions that the ontology must be able to answer, and they will be used to check that the ontology is fit for purpose. The EcoCyc and RiboWeb ontologies had the bulk of their knowledge gathered from the research literature on E. coli metabolism and ribosomal structure respectively. In the former case this was a huge volume of material, which took many years to process. The TaO, being built to query databases, extracted a large part of its knowledge from database documentation. Standard texts also contributed to the knowledge of core molecular biology.
Conceptualisation:: identifying the key concepts that exist in the domain, their properties and the relationships that hold between them; identifying natural language terms to refer to such concepts, relations and attributes; and structuring domain knowledge into explicit conceptual models. This is the process touched upon in Section 2, where the concepts and relationships describing the domain are captured. The ontology is usually described using some informal terminology. Gruber [6] suggests writing lists of the concepts to be contained within the ontology and exploring other ontologies to re-use all or part of their conceptualisations and terminologies. At this stage it is important to keep in mind the results of the first step, requirements gathering.
Integrating:: using or specialising an existing ontology, a task frequently hindered by the inadequate documentation of existing ontologies, notably of their implicit assumptions. Using a generic ontology, such as MBO or those described in [27,28], gives a deeper definition of the concepts in the chosen domain.
Encoding:: representing the conceptualisation in some formal language, e.g. frames, object models or logic. This includes the creation of formal competency questions in terms of the terminological specification language chosen (usually first-order logic). The representation of ontologies is explored further below.
Documentation:: complete informal and formal definitions, assumptions and examples are essential to promote the appropriate use and re-use of an ontology. Documentation is important for defining, more expansively than is possible within the ontology itself, the exact meaning of its terms.
Evaluation:: determining the appropriateness of an ontology for its intended application. Evaluation is done pragmatically, by assessing the competency of the ontology to satisfy the requirements of its application, including determining the consistency, completeness and conciseness of an ontology [25]. Conciseness implies an absence of redundancy in the definitions of an ontology and an appropriate granularity. For example, an ontology that modelled protein molecules at atomic resolution when the amino acid level would suffice would not be considered concise.
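
The encoding and evaluation stages can be illustrated with a minimal sketch, given here in Python purely for illustration. It assumes a frame-style encoding and expresses each competency question as a simple predicate rather than in first-order logic. The `Frame` and `Ontology` classes, the concept names and the questions are all hypothetical; they are not taken from TaO, EcoCyc, RiboWeb or any of the cited methodologies.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One concept: its name, its is-a parents and its slots (properties)."""
    name: str
    parents: tuple = ()
    slots: dict = field(default_factory=dict)


class Ontology:
    """A hypothetical frame store with is-a reasoning and slot inheritance."""

    def __init__(self, frames):
        self.frames = {frame.name: frame for frame in frames}

    def ancestors(self, name):
        """Return every concept reachable by following is-a links upward."""
        seen = set()
        stack = list(self.frames[name].parents)
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if current in self.frames:  # tolerate undefined parents here
                stack.extend(self.frames[current].parents)
        return seen

    def is_a(self, child, parent):
        return child == parent or parent in self.ancestors(child)

    def slot(self, name, slot_name):
        """Look up a slot locally, then depth-first through the parents.

        Assumes consistency_problems() has already returned no problems.
        """
        frame = self.frames[name]
        if slot_name in frame.slots:
            return frame.slots[slot_name]
        for parent in frame.parents:
            value = self.slot(parent, slot_name)
            if value is not None:
                return value
        return None

    def consistency_problems(self):
        """Report undefined parents and concepts that are their own ancestor."""
        problems = []
        for frame in self.frames.values():
            for parent in frame.parents:
                if parent not in self.frames:
                    problems.append(f"{frame.name}: undefined parent {parent}")
            if frame.name in self.ancestors(frame.name):
                problems.append(f"{frame.name}: cyclic is-a hierarchy")
        return problems


# Hypothetical toy conceptualisation, invented for this example.
onto = Ontology([
    Frame("Macromolecule"),
    Frame("AminoAcid"),
    Frame("Reaction"),
    Frame("Protein", parents=("Macromolecule",),
          slots={"built_from": "AminoAcid"}),
    Frame("Enzyme", parents=("Protein",), slots={"catalyses": "Reaction"}),
])

# Competency questions: each pairs the informal wording with a formal check.
competency_questions = [
    ("Is every enzyme a macromolecule?",
     lambda o: o.is_a("Enzyme", "Macromolecule")),
    ("What are enzymes built from?",
     lambda o: o.slot("Enzyme", "built_from") == "AminoAcid"),
    ("Are amino acids kept distinct from proteins?",
     lambda o: not o.is_a("AminoAcid", "Protein")),
]

problems = onto.consistency_problems()
failed = [question for question, check in competency_questions
          if not check(onto)]
print("consistency problems:", problems)
print("unanswered competency questions:", failed)
```

Keeping the informal wording beside each check mirrors the split between the informal and formal stages: the text is what is communicated to people, the predicate is what the software evaluates. A real ontology would normally be encoded in a dedicated knowledge representation language rather than in ad hoc classes such as these.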

Robert Stevens 2001-07-19