

Vocabulary, Coding, and Concept Representation

Robert R. Hausam, MD / robert.hausam@m.cc.utah.edu
University of Utah

Florida Family Physician
January 1996 / Volume 46 / Number 1
Extended On-line Article

Imagine your office in the year 2000. You have just left the exam room after seeing your patient. You begin documentation of the visit by dictating directly into the voice recognition processor of your computerized patient record (CPR) system. Your words are immediately transcribed and presented to you for review. At the same time, your words are analyzed for their medical meaning, and a coded summary of the visit data is prepared. The system is unable to understand the meaning of a portion of the dictation, so it prompts you for additional clarifying information. You then quickly search for and select the appropriate diagnoses and assessments. Automatically, the problem list is updated and the billing diagnosis codes are generated.

You are concerned about your patient's new problems and decide to do a literature search for the latest treatment recommendations. With a button click, the data that you have already entered is analyzed and a Medline query is generated. In a few seconds you are presented with several pertinent articles for review. You then determine the indicated tests and order them through the CPR system, specifying that you should be automatically paged when the reports are available.

You review your patient's medications and decide that a new medication is indicated. The expert system uses the data already entered to recommend several possibilities. Any significant drug-drug or drug-disease interactions are automatically taken into account. After making some minor adjustments to the suggested dosage, you make the appropriate selection. The new prescription is automatically transmitted to your patient's preferred pharmacy.

The visit is now complete. The coded visit summary is checked against established practice guidelines to ensure that accepted standards are met. Your patient is now ready to check out. The CPR system recommends an appropriate level of service procedure code for billing, which you accept. All of the billing information is immediately available at the front desk. Your patient pays the copay and then goes to the pharmacy to pick up the new prescription.

Does this sound futuristic? I believe it does for most of us. In reality, though, nearly all of the technology described in this scenario is available for use today. So, why aren't we using it? The answer is complex, but one piece that generally has been missing is something to "glue" all of the different components of the system together. That "glue" is a standardized clinical vocabulary (also known as a controlled medical terminology, or several other variations of the name). A standardized clinical vocabulary provides a means of accurately and reliably communicating medical information between clinicians and also between the clinician and the computer. This capability is essential to achieving the vision of integrated information exchange both within and external to the CPR system, as in the scenario described above.1

A standardized vocabulary is a collection of terms (or clinical phrases) which are used to represent specific medical meanings, known as concepts. These vocabularies usually include synonyms, which allow different terms to be used to represent the same concept, in order to allow individual clinicians to express their meanings in the way that is most natural to them. However, the lists of terms are fixed, and may not include every possible way that a clinician might wish to describe a particular concept. Also, the vocabularies are generally designed to be comprehensive, but cannot include every possible medical meaning that exists. Acceptance of these limitations, and development of means of working around them when necessary, are part of the price that must be paid to gain the benefits of standardization.
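The relationship between terms, synonyms, and concepts described above can be illustrated with a minimal Python sketch. The vocabulary contents and the names `VOCABULARY`, `build_term_index`, and `find_concept` are hypothetical and chosen only for illustration; they are not taken from any published vocabulary.

```python
from typing import Dict, List, Optional

# Hypothetical mini-vocabulary: each concept is keyed by its preferred term
# and lists the synonyms that clinicians may use for the same meaning.
VOCABULARY: Dict[str, List[str]] = {
    "myocardial infarction": ["heart attack", "MI"],
    "hypertension": ["high blood pressure", "HTN"],
}


def build_term_index(vocabulary: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every term (preferred or synonym, lower-cased) to its concept."""
    index: Dict[str, str] = {}
    for preferred, synonyms in vocabulary.items():
        for term in [preferred, *synonyms]:
            index[term.lower()] = preferred
    return index


TERM_INDEX = build_term_index(VOCABULARY)


def find_concept(term: str) -> Optional[str]:
    """Return the concept for a term, or None if the term is not in the fixed list."""
    return TERM_INDEX.get(term.strip().lower())


print(find_concept("Heart Attack"))  # a synonym resolves to its concept
print(find_concept("chest cold"))    # a term outside the fixed list is not found
```

The second lookup shows the limitation noted above: because the term list is fixed, a phrase the vocabulary does not contain simply fails to resolve, and the system needs some way of working around that.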

Closely related to standardized vocabularies are coding systems and medical concept representation systems. A coding system represents a particular concept (and in some systems, a particular term) by a specific identifier, which is usually a combination of characters (letters and/or numbers) which have no meaning in themselves. This coded representation is then used in place of the natural language description of the concept for further computer or human processing. Standardized clinical vocabularies generally include a coding system, as well.
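As a rough sketch of what a coding system does, the following Python example assigns each concept an identifier that carries no meaning in itself and then uses that identifier in place of the natural language description. The class `CodingSystem`, the `X` prefix, and the sequential numbering scheme are assumptions made for this illustration; real coding systems define their own identifier formats.

```python
import itertools
from typing import Dict


class CodingSystem:
    """Assigns opaque identifiers to concepts (hypothetical scheme)."""

    def __init__(self, prefix: str = "X") -> None:
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._code_by_concept: Dict[str, str] = {}
        self._concept_by_code: Dict[str, str] = {}

    def register(self, concept: str) -> str:
        """Return the code for a concept, creating one on first use."""
        if concept in self._code_by_concept:
            return self._code_by_concept[concept]
        code = f"{self._prefix}{next(self._counter):05d}"
        self._code_by_concept[concept] = code
        self._concept_by_code[code] = concept
        return code

    def decode(self, code: str) -> str:
        """Translate a code back to its concept; raises KeyError if unknown."""
        return self._concept_by_code[code]


codes = CodingSystem()
problem_list = [codes.register("hypertension"), codes.register("asthma")]
print(problem_list)                       # the coded form stored by the system
print([codes.decode(c) for c in problem_list])  # the readable form shown to the clinician
```

The identifier is deliberately arbitrary: nothing about the code itself describes the concept, so the meaning lives entirely in the lookup table.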

Medical concept representation systems can be viewed as extensions of standardized vocabularies which provide a structure that allows a more complex representation of a concept.2 Structures are provided for dealing with modifiers (or attributes), which usually cover such aspects as location, uncertainty, time, and numerical values, among many others. In addition, structures are provided for relating one concept to another through semantic links, such as "caused by" and "located in."

A number of standardized clinical vocabularies currently exist. Two in particular that are receiving considerable attention as possible standards for use in CPR systems are SNOMED (Systematized Nomenclature of Human and Veterinary Medicine) International, from the College of American Pathologists and the American Veterinary Medical Association, and the Read Codes, from the National Health Service Center for Coding and Classification (NHS CCC) in the United Kingdom.3 Both are quite comprehensive, with the latest versions of each containing over 100,000 concepts. They both attempt to cover essentially the entire range of medical practice. SNOMED is richer in its coverage of pathology and chemicals and, of course, veterinary terms. It is beginning to see some significant use in pathology, and a lesser amount so far in other fields of medicine. The Read Codes were originally developed by Dr. James Read, a general practitioner in the UK. The earlier versions are widely used in UK general practice, with an estimated 90% of practices currently using computers and Read codes for clinical functions, primarily medical records and reporting to the NHS. The new version 3 is still under development, and includes many new terms from hospital-based specialists and allied health personnel, along with a major revision and enhancement of the underlying vocabulary structure.

Both SNOMED and Read incorporate a coding system for concepts, and Read also uses one for individual terms. SNOMED includes a rich set of modifiers, but no explicit rules for how they are to be used. Read includes some simple concept representation capabilities in its "information model" for qualifiers (modifiers).
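To make the idea of modifiers and semantic links more concrete, here is a minimal Python sketch of a concept representation structure. It is not the SNOMED or Read model; the class `ConceptInstance`, its fields, and the example clinical content are hypothetical and serve only to show how a base concept can carry attributes and be related to other concepts.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptInstance:
    """A concept plus its modifiers and its links to other concepts (illustrative only)."""

    concept: str
    modifiers: dict = field(default_factory=dict)  # e.g. {"uncertainty": "probable"}
    links: list = field(default_factory=list)      # list of (relation, ConceptInstance)

    def add_link(self, relation: str, target: "ConceptInstance") -> None:
        """Attach a semantic link such as 'caused by' or 'located in'."""
        self.links.append((relation, target))

    def describe(self) -> str:
        """Render the structure as a single readable phrase."""
        text = self.concept
        if self.modifiers:
            details = ", ".join(f"{k}={v}" for k, v in self.modifiers.items())
            text += f" [{details}]"
        for relation, target in self.links:
            text += f" {relation} ({target.describe()})"
        return text


fracture = ConceptInstance("fracture", {"uncertainty": "probable"})
fracture.add_link("located in", ConceptInstance("femur", {"laterality": "left"}))
fracture.add_link("caused by", ConceptInstance("fall"))
print(fracture.describe())
# fracture [uncertainty=probable] located in (femur [laterality=left]) caused by (fall)
```

A structure like this allows a modest set of base concepts to express many composite clinical statements, but it also shows why explicit rules for combining modifiers matter: nothing in this sketch prevents a nonsensical combination.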

The Unified Medical Language System (UMLS) project from the National Library of Medicine is closely related to the standardized vocabularies, but is designed for a somewhat different purpose.4 The system is based on a Metathesaurus, which provides a mapping between the terms and concepts of its various source vocabularies in order to facilitate information exchange between different systems. A major focus is on support for bibliographic retrieval via Medline and similar databases, but it is also being considered for other more clinically related applications.
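The mapping role of a metathesaurus can be sketched in a few lines of Python. This is a deliberately simplified model and not the actual UMLS data format: the identifiers (`META-0001`), the source vocabulary names (`VOCAB_A`, `VOCAB_B`), the codes, and the function `translate` are all hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical metathesaurus: each shared concept lists the code that each
# source vocabulary uses for it. A vocabulary may lack a code for a concept.
METATHESAURUS: Dict[str, Dict[str, str]] = {
    "META-0001": {"VOCAB_A": "A-123", "VOCAB_B": "b.45"},
    "META-0002": {"VOCAB_A": "A-777"},
}


def build_reverse_index(meta: Dict[str, Dict[str, str]]) -> Dict[Tuple[str, str], str]:
    """Map each (source vocabulary, source code) pair to its shared concept id."""
    return {
        (source, code): meta_id
        for meta_id, sources in meta.items()
        for source, code in sources.items()
    }


REVERSE_INDEX = build_reverse_index(METATHESAURUS)


def translate(source: str, code: str, target: str) -> Optional[str]:
    """Translate a code between vocabularies via the shared concept, if possible."""
    meta_id = REVERSE_INDEX.get((source, code))
    if meta_id is None:
        return None  # the source code is unknown to the metathesaurus
    return METATHESAURUS[meta_id].get(target)  # None if the target has no equivalent


print(translate("VOCAB_A", "A-123", "VOCAB_B"))  # b.45
print(translate("VOCAB_A", "A-777", "VOCAB_B"))  # None
```

Routing every translation through a shared concept means each source vocabulary needs to be mapped only once, rather than once for every other vocabulary it might exchange data with.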

Undoubtedly the most widely known coding systems in the US are ICD-9-CM (International Classification of Diseases, 9th Revision, Clinical Modification), used for billing diagnoses, and CPT (Current Procedural Terminology), used for procedure coding. These systems actually serve quite well for their intended purposes. They were never intended to serve as a comprehensive vocabulary for representing concepts in a CPR, but may be very useful as adjuncts for linkage to billing systems. A similar system, ICPC (International Classification of Primary Care), from the World Health Organization, is quite good for statistical classification and reporting of diseases, but is generally not felt to be comprehensive enough to serve as a standard CPR vocabulary.

There are several projects underway to develop robust and comprehensive medical concept representation systems. A group at Stanford University, and now Kaiser Permanente, is working to extend SNOMED by using knowledge representation techniques such as conceptual graphs.5 A group at the University of Utah is developing the Event Definition Model.6 Both of these groups, as well as others in the US, are involved in the collaborative efforts of the CANON Group.2 A multi-center effort from the Advanced Informatics in Medicine program of the European Union is the GALEN (Generalised Architecture for Languages, Encyclopaedias and Nomenclatures in Medicine) project.7 All of these efforts, and others not mentioned, are currently in research and development. None is in significant clinical use at present, but all show potential for significant contributions in the future.

If we are to move forward and begin to develop and use systems that can support the types of functions described in the initial scenario, then one or more of these vocabulary, coding or concept representation systems will need to be adopted as a standard. This is a difficult, and often rather emotionally charged, process. One step in this direction was taken at the recent meeting Moving Toward International Standards in Primary Care Informatics: Clinical Vocabulary.8 This meeting discussed primarily the UMLS, SNOMED, Read and ICPC vocabularies. An effort was made to move toward consensus regarding their current status and recommendations for the future. The work begun at this meeting is planned to continue over the next several months, using electronic mail for communication.

Through the use of standardized clinical vocabularies and medical concept representation systems, as well as other necessary technology, we should be able to begin to build the type of future envisioned in the initial scenario. When that occurs, we should expect significant improvements in our ability to obtain and utilize the clinical information needed for patient care, and, ultimately, significant benefits for our patients in the quality, and likely also the cost, of their healthcare.

References

  1. Dick RS, Steen E. The computer-based patient record: an essential technology for health care. Washington (DC): National Academy Press, 1991.
  2. Evans DA, Cimino JJ, Hersh WR, Huff SM, Bell DS. Toward a medical-concept representation language. J Am Med Informatics Assoc. 1994;1:207-17.
  3. Vocabularies for computer-based patient records: identifying candidates for large scale testing (minutes of meeting 5-6 Dec 1994). Washington (DC): National Library of Medicine and AHCPR, 1994.
  4. Tuttle MS, Blois MS, Erlbaum MS, Nelson SJ, Sherertz DD. Toward a bio-medical thesaurus: building the foundation of the UMLS. In: Greenes RA, ed. Proceedings of the Twelfth Annual Symposium on Computer Applications in Medical Care, November 6-9, 1988, Washington, DC. Silver Spring, MD: IEEE Computer Society Press, 1988:191-5.
  5. Campbell KE, Das AK, Musen MA. A logical foundation for representation of clinical data. J Am Med Informatics Assoc. 1994;1:218-32.
  6. Huff SM, Rocha RA, Bray BE, Warner HR, Haug PJ. An event model of medical information representation. J Am Med Informatics Assoc. 1995;2:116-34.
  7. Rector AL, Nowlan WA, Glowinski A. Goals for concept representation in the GALEN project. In: Safran C, ed. Proceedings of the Seventeenth Annual Symposium on Computer Applications in Medical Care, November 1-3, 1993, Washington, DC. New York: McGraw-Hill, 1994:414-8.
  8. Proceedings of Moving Toward International Standards in Primary Care Informatics: Clinical Vocabulary, November 1-2, 1995 (in press).

Robert R. Hausam, M.D. VA Ambulatory Care Fellow (Medical Informatics) University of Utah

email: robert.hausam@m.cc.utah.edu

mail: Department of Medical Informatics

phone: (801)588-5042 (VA IRMFO) (voice mail accepted)


Edited on December 10, 1995 / Updated on December 10, 1995
Location: http://www.med.ufl.edu/medinfo/ffp/coding.html

Contact: Richard Rathe, MD / rrathe@ufl.edu
