2004 Denver Annual Meeting (November 7–10, 2004)

Paper No. 2
Presentation Time: 1:45 PM

AN ONTOLOGY FOR INTEGRATING STRATIGRAPHIC DATABASES


GREER, Douglas S., San Diego Supercomputer Center, Univ of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0505, LUDAESCHER, Bertram, San Diego Supercomputer Center, Univ of California, San Diego, 9500 Gilman Drive, Mail Code 0505, La Jolla, CA 92093-0505, FILS, Douglas, Geological & Atmospheric Sciences, Iowa State Univ, 253 Science I, Ames, IA 50011 and BARU, Chaitan, San Diego Supercomputer Center, Univ of California, San Diego, La Jolla, CA 92093-0505, dsg@sdsc.edu

One way to establish connections between different database schemata is through an ontology. Related data items from different source schemata can be co-located by registering them with an ontology that describes typed relationships between the stored data objects. For relational databases, one method of semantic registration is to associate a database attribute (represented as a column) with additional semantic information, such as a natural language description or a scientific article defining that attribute. We employ W3C standards, in particular the Resource Description Framework (RDF) and the Web Ontology Language (OWL), for representing ontologies.

This work is part of the Chronos project, whose main objective is to develop a network of databases, tools and analytical methodologies that broadly deal with chronostratigraphy. Part of this work involves federating several distinct, independently developed databases.

The representation consists of two parts, both of which are written using RDF/OWL. The first part is a high-level description that records "knowledge" about the database types. This knowledge is recorded in classes and their typed relations to other classes. The second part, which is automatically generated, contains information about the databases that is cross-referenced to the defined types. The tables and columns represent "instances" of the classes defined in the first part.
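The two-part structure can be sketched as follows, with plain Python tuples standing in for RDF triples. The abstract does not show the actual Chronos ontology, so the class names (`Taxon`, `StratigraphicUnit`), the relation `foundIn`, and the database, table and column names are all hypothetical and serve only as illustration.

```python
# Part 1: hand-written, high-level knowledge (hypothetical classes and relation).
PART_ONE = {
    ("Taxon", "rdf:type", "owl:Class"),
    ("StratigraphicUnit", "rdf:type", "owl:Class"),
    ("foundIn", "rdf:type", "owl:ObjectProperty"),
    ("foundIn", "rdfs:domain", "Taxon"),
    ("foundIn", "rdfs:range", "StratigraphicUnit"),
}

# Hypothetical schema catalogs: database -> table -> {column: registered class}.
SCHEMAS = {
    "db_a": {"fossils": {"species_name": "Taxon", "formation": "StratigraphicUnit"}},
    "db_b": {"occurrences": {"taxon": "Taxon"}},
}


def generate_part_two(schemas):
    """Emit one rdf:type triple per column, typing it by a Part 1 class."""
    triples = set()
    for db, tables in schemas.items():
        for table, columns in tables.items():
            for column, cls in columns.items():
                triples.add((f"{db}.{table}.{column}", "rdf:type", cls))
    return triples


def columns_of(cls, triples):
    """Co-locate every registered column that is an instance of cls."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)


PART_TWO = generate_part_two(SCHEMAS)
print(columns_of("Taxon", PART_TWO))
```

In this sketch, a query for one class returns the matching columns from both databases, which is the sense in which registration co-locates related data items.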

The RDF Schema specification contains constructs such as "rdfs:isDefinedBy", "rdfs:seeAlso", "rdfs:label" and "rdfs:comment" that can be used as a standard mechanism for binding database columns to references or English-language definitions. These constructs thus act as a bridge between the machine-readable data contained in a database and the human-readable knowledge contained in the scientific literature.

If the database metadata has been precisely specified by the ontology references, then the necessary conversions and other operations are defined implicitly. Thus a program that can read and parse both parts of the ontology can automatically perform the necessary manipulations of the data.
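One way to picture an implicitly defined conversion is a unit change derived entirely from metadata. The sketch below assumes that the high-level part of the ontology records a conversion factor for each unit and that the generated part records the unit of each registered column. The abstract does not say which conversions are supported, so the unit table and column names are hypothetical.

```python
# Hypothetical Part 1 knowledge: conversion factor from each unit to meters.
TO_METERS = {"meter": 1.0, "centimeter": 0.01, "foot": 0.3048}

# Hypothetical Part 2 metadata: the unit each registered column is stored in.
COLUMN_UNIT = {
    "db_a.cores.depth": "meter",
    "db_b.samples.depth_ft": "foot",
}


def convert(value, source_column, target_column):
    """Convert a value between two columns using only the ontology metadata."""
    source_factor = TO_METERS[COLUMN_UNIT[source_column]]
    target_factor = TO_METERS[COLUMN_UNIT[target_column]]
    return value * source_factor / target_factor


# Express a depth stored in db_b (feet) in the units used by db_a (meters).
print(convert(100.0, "db_b.samples.depth_ft", "db_a.cores.depth"))
```

No conversion code is written per database pair here: the function looks up both columns' units and derives the factor, so registering a new column only requires adding its metadata.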