Geoinformatics 2007 Conference (17–18 May 2007)

Paper No. 16
Presentation Time: 2:30 PM-4:30 PM

COMMUNITY SCIENCE ONTOLOGY DEVELOPMENT


RASKIN, Robert, Jet Propulsion Laboratory, NASA, Jet Propulsion Laboratory, Pasadena, CA 91109, FOX, Peter, HAO, NCAR, HAO/NCAR, P.O. Box 3000, Boulder, CO 80307, MCGUINNESS, Deborah L., Knowledge Systems, McGuinness Associates and Stanford University, 20 Peter Coutts Circle, Stanford, CA 94305 and SINHA, Krishna, Virginia Polytechnic Inst & State Univ, 4044 Derring Hall, Blacksburg, VA 24061-0420, raskin@jpl.nasa.gov

Background of the need for formal semantic encodings

Scientists often spend most of their time locating and preparing a dataset before it can even be analyzed and put to use.  Given that 21st century science will be blessed with massive amounts of data, much of which can be used together synergistically, a great opportunity is being missed.  Knowing only the syntactic description of a dataset does not go far enough to remedy this situation.  Formal semantic encoding potentially enables automated data integration, smart search, and interdisciplinary data fusion.

Best Practices and Modular Ontologies

Our experience suggests that ontologies should be modular.  Most aspects of science deal with hierarchical specializations of concepts, such as classes of rock types and their subtypes.  Inheritance is therefore very important.  Ideally, concepts in an Earth science ontology should also inherit from more general ontologies in Physics, Chemistry, Math, Space, Time, etc.
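
The inheritance pattern described above can be illustrated with a minimal Python sketch.  The module names, class names, and the helpers `merge_modules` and `ancestors` are all hypothetical and are not taken from SWEET or any other ontology named in this abstract; the sketch only shows how a specialized concept could pick up ancestors defined in a more general module.

```python
# Hypothetical modules: each maps a class name to its direct superclasses.
# All names are illustrative only, not drawn from any real ontology.
GENERAL_MODULE = {
    "Substance": [],
    "SolidSubstance": ["Substance"],
}
EARTH_MODULE = {
    "Rock": ["SolidSubstance"],  # inherits from a class in the general module
    "IgneousRock": ["Rock"],
    "Basalt": ["IgneousRock"],
}


def merge_modules(*modules):
    """Combine modules into one superclass table (a stand-in for an import)."""
    merged = {}
    for module in modules:
        for cls, parents in module.items():
            merged.setdefault(cls, []).extend(parents)
    return merged


def ancestors(cls, table):
    """Return all superclasses of cls, nearest first, followed transitively."""
    found = []
    queue = list(table.get(cls, []))
    while queue:
        parent = queue.pop(0)
        if parent not in found:
            found.append(parent)
            queue.extend(table.get(parent, []))
    return found


table = merge_modules(GENERAL_MODULE, EARTH_MODULE)
print(ancestors("Basalt", table))
# ['IgneousRock', 'Rock', 'SolidSubstance', 'Substance']
```

Because the Earth module only names its parent class rather than redefining it, the general module can be revised without editing the specialized one.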

Project and Agency Interest

NASA is moving from an instrument-based to a measurement-based strategy in its archival systems.  This approach implies that the full lineage of a dataset must be preserved to enable cross-platform integration.  NSF has initiated calls for ontology development through its Office of Cyberinfrastructure.

Community needs are diverse but share many common elements, such as the desire to read standard data formats and to associate parameter names with meaningful scientific concepts.  At this early stage of ontology development, funded work in one community should build upon the work of others rather than reinvent it.

Current ontology efforts

Current ontology development work includes:

Semantic Web for Earth and Environmental Terminology (SWEET), an upper-level ontology developed at NASA/JPL with coverage of the entire Earth system (Raskin, 2006; Raskin and Pan, 2005).  See Figure 1 for the ontology structure.  <http://sweet.jpl.nasa.gov>

Virtual Solar Terrestrial Observatory (VSTO), developed at NCAR with McGuinness Associates, with coverage of solar atmospheric physics and terrestrial middle and upper atmospheric physics (McGuinness et al., 2007).  <http://vsto.org>

Marine Metadata Initiative (MMI), developed at MBARI with coverage of instrumentation and the marine world. See Figure 2 for an excerpt.  <http://marinemetadata.org>

Geosciences Network (GEON), developed at UCSD with coverage of the solid Earth.  See Figure 3 for the ontology structure.  <http://geongrid.org>

Ontology tools for collaboration of communities

There is a pressing need for tools that support ontology evolution, validation, reasoning, comparison, merging, etc.  This is an emerging area of work for a number of organizations.  MMI, for example, has created a suite of tools for creating, comparing, and harmonizing ontologies, with the goal of supporting ontologies for marine science and for science in general.  <MMI tools: http://marinemetadata.org/examples/mmihostedwork/ontologieswork>.  MMI also convenes workshops where teams generate new ontologies, such as for instrumentation.

www.planetont.org

The http://www.PlanetOnt.org web site is a collaborative community for sharing ontologies and drawing on the experience of others.  It provides services for ontology version registration, comparison of ontologies, tracking of imported class dependencies, RSS feeds that notify owners of dependent ontologies of potential changes, and discussion of an ontology or of specific elements within it.  It also provides a forum for identifying best practices and for working around specific limitations of OWL in a consistent manner.  It is open to community involvement and welcomes submissions.
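
As a rough illustration of two of the services listed above (version comparison and notification of dependent ontology owners), the following Python sketch may help.  It is not the PlanetOnt implementation: the registry layout, the ontology names, and the functions `compare_versions` and `owners_to_notify` are assumptions made only for this example.

```python
def compare_versions(old_classes, new_classes):
    """Return the class names added and removed between two versions."""
    old, new = set(old_classes), set(new_classes)
    return {"added": sorted(new - old), "removed": sorted(old - new)}


def owners_to_notify(changed, imports):
    """Return ontologies that import `changed`, directly or indirectly."""
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for ontology, imported in imports.items():
            if current in imported and ontology not in affected:
                affected.add(ontology)
                frontier.append(ontology)
    return sorted(affected)


# Hypothetical registry: ontology name -> names of the ontologies it imports.
imports = {
    "general": [],
    "earth": ["general"],
    "marine": ["earth"],
    "solar": ["general"],
}

diff = compare_versions(["Rock", "Mineral"], ["Rock", "Sediment"])
print(diff)                                  # {'added': ['Sediment'], 'removed': ['Mineral']}
print(owners_to_notify("earth", imports))    # ['marine']
print(owners_to_notify("general", imports))  # ['earth', 'marine', 'solar']
```

A change to a widely imported ontology reaches every ontology downstream of it, which is why notification follows the import chain rather than stopping at direct importers.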

Discussion and Conclusion

Ontologies should be developed collaboratively and incrementally from existing work.  Ontology normalization has much in common with database normalization.  However, there is a further need to distinguish the general concept from the more specific.  Specialized ontologies should import from the more general ones rather than repeat the general concepts.  The www.PlanetOnt.org site supports this directional structure by identifying dependencies between any pair of ontologies.
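
The normalization principle above can be sketched in Python under some simplifying assumptions.  The dictionary layout, the example ontologies, and the functions `redeclared_concepts` and `dependency_direction` are hypothetical; the sketch considers direct imports only and does not describe how the PlanetOnt site itself computes dependencies.

```python
def redeclared_concepts(specialized, general):
    """Concepts defined in both ontologies: candidates to replace with an import."""
    return sorted(set(specialized["defines"]) & set(general["defines"]))


def dependency_direction(a, b):
    """Describe which of two ontologies directly imports the other, if either."""
    a_imports_b = b["name"] in a["imports"]
    b_imports_a = a["name"] in b["imports"]
    if a_imports_b and b_imports_a:
        return "circular"
    if a_imports_b:
        return f'{a["name"]} -> {b["name"]}'
    if b_imports_a:
        return f'{b["name"]} -> {a["name"]}'
    return "independent"


# Hypothetical ontologies; "Time" is repeated in the specialized one.
general = {"name": "general", "imports": [], "defines": ["Time", "Unit", "Substance"]}
rocks = {"name": "rocks", "imports": ["general"], "defines": ["Rock", "Basalt", "Time"]}

print(redeclared_concepts(rocks, general))   # ['Time']
print(dependency_direction(general, rocks))  # rocks -> general
```

Here the arrow points from the importing ontology to the imported one, so a well-normalized collection would show arrows running from specialized to general and an empty list of redeclared concepts.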

References Cited

McGuinness, D., P. Fox, L. Cinquini, P. West, J. Garcia, J. L. Benedict, and D. Middleton, The Virtual Solar-Terrestrial Observatory: A Deployed Semantic Web Application Case Study for Scientific Research, in Proceedings of the Nineteenth Conference on Innovative Applications of Artificial Intelligence (IAAI-07), Vancouver, British Columbia, Canada, July 22-26, 2007.

Raskin, R. G., Ontologies for Earth system science, in Geoinformatics: From Data to Knowledge, Geological Society of America, 2006.

Raskin, R. G. and M. J. Pan, Knowledge representation in the Semantic Web for Earth and Environmental Terminology (SWEET), Computers and Geosciences, 31, 1119-1125, 2005.

Figure 1. SWEET conceptual decomposition

Figure 2. VSTO ontology schematic

Figure 3. GEON ontology framework