Ontology Server (os-qis.med.yale.edu) Query Integrator System (QIS)

Introduction

This web site is dedicated to the Ontology Server (OS) component of the Query Integrator System (QIS), a prototype application framework for biomedical database federation. QIS was developed at the Center for Medical Informatics to test distributed database integration involving heterogeneous data sources in bioscience.

OS provides the following functionality:

  • Mapping of either data or metadata elements within a data source to concepts in a controlled vocabulary.
  • By tagging these elements with controlled vocabulary concept IDs, the task of integration between different data sources in distributed systems is facilitated. This use of controlled vocabularies is well known to system integrators in medical informatics who have to interchange data between systems; the benefits are fairly obvious and do not need elaboration.
  • QIS lets you query our databases by searching the UMLS for a particular concept, and then tells you whether this concept exists within our own database. The neuroscience-related details of the object/s that map to this concept (such as the higher-order anatomical structures that they are contained in, or

What is an Ontology?

While the terms controlled vocabulary, thesaurus and ontology originally meant different things (each succeeding term denotes a more robust and elaborate version of the previous one), today they tend to be used synonymously. Specifically, an ontology consists of three classes of entity: concepts, terms (or alternative names for these concepts), and relationships between these concepts. The most common kind of relationship is the "is-a" relationship, also called hyponymy (e.g., acetylcholine is-a neurotransmitter). Other relationships also exist, such as "part-of", or meronymy (e.g., the finger is part-of the hand).
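
As a minimal sketch, the three classes of entity can be modelled as shown here. The class names, field names, and the identifiers C1 to C4 are illustrative assumptions; they do not reflect the Ontology Server's actual schema or real UMLS concept IDs.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    concept_id: str                 # hypothetical ID, not a real UMLS identifier
    preferred_name: str
    terms: set[str] = field(default_factory=set)   # alternative names


@dataclass
class Ontology:
    concepts: dict[str, Concept] = field(default_factory=dict)
    # Relationships are stored as (source_id, relation, target_id) triples.
    relationships: set[tuple[str, str, str]] = field(default_factory=set)

    def add_concept(self, concept: Concept) -> None:
        self.concepts[concept.concept_id] = concept

    def add_relationship(self, source_id: str, relation: str, target_id: str) -> None:
        self.relationships.add((source_id, relation, target_id))

    def related(self, source_id: str, relation: str) -> list[str]:
        """Names of the concepts that source_id points to via relation."""
        return sorted(
            self.concepts[target].preferred_name
            for source, rel, target in self.relationships
            if source == source_id and rel == relation
        )


onto = Ontology()
onto.add_concept(Concept("C1", "acetylcholine", {"ACh"}))
onto.add_concept(Concept("C2", "neurotransmitter"))
onto.add_concept(Concept("C3", "finger"))
onto.add_concept(Concept("C4", "hand"))
onto.add_relationship("C1", "is-a", "C2")      # hyponymy
onto.add_relationship("C3", "part-of", "C4")   # meronymy

print(onto.related("C1", "is-a"))     # ['neurotransmitter']
print(onto.related("C3", "part-of"))  # ['hand']
```

Storing relationships as triples keeps the relation type open-ended, so further relation kinds can be added without changing the structure.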

Scope of our work

The reference ontology/controlled vocabulary that we use is the National Library of Medicine's Unified Medical Language System (UMLS). The UMLS is a meta-thesaurus - that is, a compendium of other biomedical vocabularies. Some vocabularies within the UMLS are much more useful than others. In fact, the whole is not greater than the sum of its parts: because some vocabularies are indifferently curated, you can actually get conflicting results if you try to make use of all the inter-concept relationships. We therefore use a subset of it: the Medical Subject Headings (MeSH), Gene Ontology, SNOMED and BrainInfo.

We should emphasize that only about a third of the total number of entries in SenseLab have been mapped to UMLS concepts. The UMLS is under-represented with respect to neuroscience, and only gross neuroanatomy is well represented. You will search in vain, for example, for "olfactory glomerular cells", which are involved in detection of odors (they lie within the cerebral olfactory bulb) and which happen to be an important focus of research in the Shepherd Lab at Yale.

We emphasize that development of a "local" ontology is currently minimal. We do record that certain classes of data SHOULD be concepts in the UMLS, even if all instances of that class are not currently concepts. Such instances, which are not yet mapped, are called CANDIDATE concepts. It is not enough to merely record candidate concepts in a local list: one must also specify how unmapped concepts relate to each other and to existing UMLS concepts, both to make the local set useful and to eventually volunteer these concepts to NLM for future inclusion in UMLS. Specifying such relationships enables a graphical browser to navigate them so that the user can visualize local concepts in their correct context.

Assigning candidate concepts to their "correct" position in a relationships tree/network is a human-intensive process, and in general, local ontology development takes financial resources to support the curators; we do not have such resources earmarked. The efforts need to be considerable, because certain areas of neuroscience (for example, neuronal compartments or the large vocabulary of neuronal modelers) are almost non-existent in the UMLS and here, one has to start from scratch.

Why use ontologies?

In SenseLab, access to the UMLS serves three purposes:

  • Synonymy: The vocabulary of biomedicine is rife with synonyms: "liver" and "hepatic", "kidney" and "renal", "vomiting" and "emesis" are Anglo-Saxon and Graeco-Latin equivalents for the same thing. Keyword-based search is likely to miss the records of interest unless you take the trouble to manually specify every alternative name for what you are looking for, because you don't know a priori how it might be recorded in the database. Neuroscience isn't so bad, but even here, alternative names have arisen: serotonin has the alternative name 5-hydroxy-tryptamine (and the abbreviation, 5-HT), while norepinephrine is also called noradrenaline, and the midbrain is also the mesencephalon.

    The nice thing about UMLS is that the curators of the vocabularies that contribute to it (as well as NLM personnel) have taken pains to specify synonyms for a concept, so if an object in SenseLab has already been mapped to a UMLS concept, then you can locate it as follows. You type in a phrase, or part of a phrase, and locate all UMLS terms (and thence concepts) that contain that phrase. After you pick the one you want, you can jump directly to that concept.

  • Location Transparency: If you've browsed SenseLab already, then you know that it is organized into virtual databases or portals. That is, everything is stored in one big physical database but, in order to cater to various types of user who are more interested in some classes of data than in others (e.g., neuronal modelers versus micro-anatomists versus olfactory-receptor-sequencers), the user interface segregates related classes so that a particular family of data is more directly accessible to a particular type of user. (Some classes of data - such as neurotransmitter molecules and receptors - show up in multiple portals because they are so fundamental to all of neuroscience.)

    Segregating data like this is all very well for regular users who know exactly where something of interest to them is likely to be, but casual users who are entering SenseLab for the first time often simply want to know about something without having to figure out which portal it lies in. Concept-based search works across all objects in SenseLab, as well as the remote databases that it links to (such as the University of Washington, Seattle's BrainInfo). If any concept of interest happens to have been mapped to an object, you are taken directly to it, where it is displayed, typically in the context of other related information in its portal.

    (Caveat: this short-cut navigation feature does not work for the University of Southern California's Brain Architecture Management System (BAMS). Unlike databases such as NCBI's Entrez (or, for that matter, SenseLab), BAMS currently does not provide a means of displaying all information about an object through a Web request that specifies that object's unique identifier.)

  • Inter-Operation: Consider the scenario where multiple databases maintained by separate curator teams are required to inter-operate. (At a crude level, inter-operation simply means locating an object of interest in one database, getting some information on it, and then jumping to the same object in another database to get some different information on it.) If no reference ontology existed, teams of curators would have to meet in pairwise fashion with lists of their objects and their accompanying descriptions, and spend days to weeks specifying correspondences between the internal object identifiers of the two databases. If there were N databases, the number of meetings required would be N * (N-1) / 2.

    If, however, a reference ontology were used, such meetings would not even be necessary. Each group could autonomously map objects in their own database to concepts in the reference ontology. The latter serves as a lingua franca, and one can use the reference-ontology concept IDs as a bridge between any two databases for the mapped concepts, irrespective of what they are called internally (the two databases may even be in different languages, e.g., English vs. German). Doug Bowden's BrainInfo group at Seattle and our own group use the UMLS this way to map concepts in gross neuroanatomy. (BrainInfo also happens to be part of UMLS.)
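
The meeting count and the bridging idea in the inter-operation scenario can be sketched roughly as follows. The local identifiers and concept IDs (CONCEPT_A and so on) are hypothetical placeholders, not real SenseLab, BrainInfo, or UMLS identifiers.

```python
def pairwise_meetings(n: int) -> int:
    """Number of curator-team pairings needed without a reference ontology."""
    return n * (n - 1) // 2


# Each team independently maps its own local object IDs to shared concept IDs.
# Every identifier below is made up for illustration.
db_one = {"obj-17": "CONCEPT_A", "obj-42": "CONCEPT_B"}
db_two = {"struct-3": "CONCEPT_A", "struct-9": "CONCEPT_C"}


def bridge(source: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Link objects in source to objects in target that share a concept ID."""
    by_concept: dict[str, list[str]] = {}
    for local_id, concept_id in target.items():
        by_concept.setdefault(concept_id, []).append(local_id)
    return {
        local_id: by_concept[concept_id]
        for local_id, concept_id in source.items()
        if concept_id in by_concept
    }


for n in (2, 5, 10):
    print(n, "databases:", pairwise_meetings(n), "pairwise meetings vs", n, "independent mappings")

print(bridge(db_one, db_two))  # {'obj-17': ['struct-3']}
```

With 10 databases, the pairwise approach needs 45 meetings, whereas the reference-ontology approach needs only 10 independent mapping efforts, and only objects that both teams have mapped to the same concept can be bridged.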

Caveats when searching the UMLS: Ambiguous Terms

Certain words can refer to multiple concepts: for example, the term "serotonin" can refer to the molecule, a neurotransmitter, but also to the receptor for the same molecule (where the word "receptor" is likely to be omitted or elided - e.g., in a table of receptors, the concept of receptor is implicit). Fortunately, in most cases, the "preferred name" of the concept in the UMLS is informative.

Tutorial

This mini-tutorial lets you search the UMLS to locate a concept of interest, and then jump to the object mapped to this concept within SenseLab.

  • Go to the Ontology Server, http://os-qis.med.yale.edu (Open a new browser window by shift-clicking on the link, so that this window is still available to you.)

  • Choose Tools -> UMLS Search

  • In the pull-down list for "Search For", specify the choice "Term Name"

  • In the pull-down below this list, specify "Starting with", and type "retinal gang"

  • You will be shown two rows: "retinal ganglion" and "retinal ganglion cells". Click the link (>>) against the second row.

  • A new screen opens, containing the definition of the matching concept in UMLS. (The prose description that you see is taken from the NLM's Medical Subject Headings definitions.) Below this definition, you will see two rows, indicating that this concept is mapped to objects in the BAMS and SenseLab databases. The SenseLab entry has the hyperlink indicated by "o270". Clicking this link will take you to the page within SenseLab that describes this object. (For example, you will see a diagram of some retinal neurons, as well as links that take you to other details, such as receptors associated with this type of neuron.)
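
The lookup that these steps perform can be sketched offline as a toy model. This is not the Ontology Server's API: the in-memory tables, the function names, the concept IDs, and the BAMS identifier are hypothetical, and case-insensitive matching is an assumption. Only the two term names and the SenseLab identifier "o270" come from the steps above.

```python
# Term name -> concept ID (IDs are placeholders, not real UMLS identifiers).
TERMS = {
    "retinal ganglion": "CONCEPT_RG",
    "retinal ganglion cells": "CONCEPT_RGC",
}

# Concept ID -> {database name: local object identifier}.
# "bams-placeholder" is invented; "o270" is the SenseLab link from the tutorial.
MAPPINGS = {
    "CONCEPT_RGC": {"SenseLab": "o270", "BAMS": "bams-placeholder"},
}


def search_terms(prefix: str) -> list[str]:
    """Mimic 'Term Name' + 'Starting with' (assumed case-insensitive)."""
    wanted = prefix.lower()
    return sorted(name for name in TERMS if name.lower().startswith(wanted))


def mapped_objects(term_name: str) -> dict[str, str]:
    """Return the local objects mapped to the concept behind a term."""
    return MAPPINGS.get(TERMS[term_name], {})


rows = search_terms("retinal gang")
print(rows)                                  # ['retinal ganglion', 'retinal ganglion cells']
print(mapped_objects(rows[1])["SenseLab"])   # o270
```

A term with no mapped objects simply yields an empty result, which corresponds to a concept that exists in the UMLS but has not yet been mapped locally.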

 

To start, select the "Domains" menu item in the top toolbar to get the list of the Ontological communities in this OS.

 

This site is Copyright 2008, Yale Center for Medical Informatics
Yale University