2007 GSA Denver Annual Meeting (28–31 October 2007)

Paper No. 6
Presentation Time: 3:00 PM

TOWARDS A GENERIC FRAMEWORK FOR SEMANTICALLY-ENABLED GEOSCIENCES DATA INTEGRATION


LIN, Kai, San Diego Supercomputer Center, San Diego, CA 92037 and BARU, Chaitanya, San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0505, baru@sdsc.edu

Many science questions in the geosciences require the integration of multiple heterogeneous, distributed data sets. The field is characterized by a large number of data collections, many of them regional in nature (assembled by individual PI projects or by state, local, and federal agencies) and described with a wide variety of metadata conventions. Although data sets are increasingly available online, for example via Web-based protocols such as the Web Map Service (WMS) and Web Feature Service (WFS), integrating and using them together remains a significant challenge. This is primarily due to the lack of standards for the schemas and, especially, the semantics associated with data sets. Given the widely dispersed nature of the community and the rapid pace of data acquisition, in many cases it may not be practical to develop and enforce a uniform standard for data semantics. Transformation approaches can adequately handle syntactic and structural heterogeneities, but they cannot resolve semantic differences. Ontologies are considered a promising solution to the semantic heterogeneity problem. In the GEON project (www.geongrid.org) we have developed a generic system that allows the “registration” of data to so-called “mediation ontologies” to assist in resolving semantic heterogeneities. Users are subsequently able to query the different data sets “seamlessly”, in a uniform fashion, through these ontologies. To answer queries against semantically registered data sets, distributed query plans are constructed by reasoning within the ontologies. One of our test cases is the integration of gravity data with geologic data sets. The gravity database is registered to the concept Gravitational Force by specifying latitudes, longitudes, values, and a unit. Several state-level geologic maps are semantically registered to the concept Geological Unit, which is related to the concept Geologic Age. Users are then able to retrieve all gravity points in geologic units satisfying certain conditions. This semantic data integration system has been implemented as part of the GEON Portal (portal.geongrid.org).
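
The registration-and-query idea in the gravity test case might be sketched roughly as follows. This is a toy illustration under stated assumptions, not the GEON implementation: the ontology dictionary, the `Registration` class, the functions `register`, `rows_for`, and `gravity_in_units`, all data set and column names, and the sample values are hypothetical, and geologic units are simplified to bounding boxes rather than map polygons.

```python
from dataclasses import dataclass

# Hypothetical mediation ontology: concept -> concepts it is related to.
ONTOLOGY = {
    "GravitationalForce": set(),
    "GeologicalUnit": {"GeologicAge"},
    "GeologicAge": set(),
}

@dataclass
class Registration:
    """Maps one data set's local column names onto the roles of a concept."""
    dataset: str
    concept: str
    columns: dict  # role name -> local column name
    unit: str = ""

def register(registry, dataset, concept, columns, unit=""):
    """Record that `dataset` carries instances of `concept`."""
    if concept not in ONTOLOGY:
        raise KeyError(f"unknown concept: {concept}")
    registry.append(Registration(dataset, concept, columns, unit))

def rows_for(registry, data, concept):
    """Yield rows from every data set registered to `concept`, keyed by role."""
    for reg in registry:
        if reg.concept == concept:
            for row in data[reg.dataset]:
                yield {role: row[col] for role, col in reg.columns.items()}

def gravity_in_units(registry, data, age):
    """Return gravity points lying inside geologic units of the given age."""
    # Filtering units by age relies on the GeologicalUnit -> GeologicAge link.
    if "GeologicAge" not in ONTOLOGY["GeologicalUnit"]:
        raise ValueError("ontology does not relate units to ages")
    units = [u for u in rows_for(registry, data, "GeologicalUnit")
             if u["age"] == age]
    hits = []
    for point in rows_for(registry, data, "GravitationalForce"):
        for unit in units:
            west, south, east, north = unit["bbox"]
            if west <= point["lon"] <= east and south <= point["lat"] <= north:
                hits.append(point)
                break  # count each point once
    return hits

# Invented sample data; note that the two maps use different column names.
data = {
    "gravity_db": [
        {"LAT": 39.7, "LON": -105.0, "GRAV": -210.5},
        {"LAT": 36.1, "LON": -112.1, "GRAV": -180.2},
    ],
    "state_map_a": [{"unit_age": "Cretaceous",
                     "extent": (-106.0, 39.0, -104.0, 40.0)}],
    "state_map_b": [{"AGE": "Permian",
                     "BOX": (-113.0, 35.5, -111.5, 36.5)}],
}

registry = []
register(registry, "gravity_db", "GravitationalForce",
         {"lat": "LAT", "lon": "LON", "value": "GRAV"}, unit="mGal")
register(registry, "state_map_a", "GeologicalUnit",
         {"age": "unit_age", "bbox": "extent"})
register(registry, "state_map_b", "GeologicalUnit",
         {"age": "AGE", "bbox": "BOX"})

cretaceous_points = gravity_in_units(registry, data, "Cretaceous")
```

In this sketch the query is phrased only in terms of ontology concepts and roles, so the differing local column names of the two maps never appear in it; a real system would additionally plan and dispatch subqueries to the distributed sources.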