Geoinformatics 2007 Conference (17–18 May 2007)

Paper No. 7
Presentation Time: 1:15 PM

SEMANTIC MEDIATOR ARCHITECTURE FOR ENVIRONMENTAL SCIENCE


BERMUDEZ, Luis E., Research and Development, Monterey Bay Aquarium Research Institute, 7700 Sandholdt Road, Moss Landing, CA 95039 and GRAYBEAL, John, Research and Development, Monterey Bay Aquarium Research Institute, 7700 Sandholdt Road, Moss Landing, CA 95039-9644, bermudez@mbari.org

Adopting metadata specifications is often insufficient to achieve interoperability among geospatial information communities because the values used in metadata annotations are heterogeneous [2]. In geological sample databases, semantic heterogeneities can occur in rock types, sampling techniques, sampling platforms, and analysis procedures, to name just a few. For example, when a sample is collected by hand by a diver, the PetDB [12] database calls it "dive", while the SamplesDB database at MBARI [10] uses the term "hand collected".

The Marine Metadata Interoperability (MMI) project is working to address semantic conflicts. The work is guided by community collaborations and supported via the MMI site (http://marinemetadata.org). MMI focuses on several activities to achieve semantic interoperability: 1) encouraging reuse of existing vocabularies; 2) providing best practices for publishing controlled vocabularies so that they are interoperable in the Semantic Web; 3) hosting workshops to create and map controlled vocabularies; and 4) providing tools and guidance to resolve semantic heterogeneities.

Along these lines, MMI has developed an architectural concept and a prototype implementation of a semantic mediation service. In this paper, we present this architecture and implementation, and discuss its potential application to the geosciences.

The two basic approaches to resolving semantic heterogeneities are the mediator/wrapper approach [7] and the enforced-standards approach. A wrapper is a piece of software built on top of a data source that serves its metadata and data via a common model. Traditionally, queries in the central system are translated to queries in local systems. The enforced-standards approach requires that every data source provide its data and metadata according to a single standard, including using exactly the same terms to specify semantic meaning. Clearinghouses that harvest metadata in a particular format are examples of this approach [5].
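The query translation performed by a wrapper can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any system cited here: the Wrapper class, the common-model term collected_by_diver, and the sample records are hypothetical; only the local terms "dive" and "hand collected" come from the example above.

```python
from dataclasses import dataclass

# Hypothetical common-model term (an assumption for illustration only).
COMMON_TERM = "collected_by_diver"


@dataclass
class Wrapper:
    """Sits on top of one data source and exposes it via the common model."""
    source_name: str
    common_to_local: dict  # common-model term -> term used by the local source
    records: list          # made-up local records, each with a local "technique"

    def query(self, common_term):
        # Translate the central query term into the local vocabulary.
        local_term = self.common_to_local.get(common_term)
        if local_term is None:
            return []
        # Run the local query and return results expressed in the common model.
        return [
            {"source": self.source_name, "id": r["id"], "technique": common_term}
            for r in self.records
            if r["technique"] == local_term
        ]


# Made-up records; only the terms "dive" and "hand collected" are from the text.
petdb = Wrapper(
    "PetDB",
    {COMMON_TERM: "dive"},
    [{"id": "p1", "technique": "dive"}, {"id": "p2", "technique": "other"}],
)
samplesdb = Wrapper(
    "SamplesDB",
    {COMMON_TERM: "hand collected"},
    [{"id": "s1", "technique": "hand collected"}],
)


def central_query(common_term, wrappers):
    """Fan a common-model query out to every wrapped source."""
    results = []
    for wrapper in wrappers:
        results.extend(wrapper.query(common_term))
    return results


results = central_query(COMMON_TERM, [petdb, samplesdb])
```

In this sketch each source keeps its own vocabulary, and the translation table lives inside the wrapper; under the enforced-standards approach the tables would be unnecessary because every source would already use the same term.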

The MMI proposal for resolving semantic heterogeneities is a mixture of the two previous approaches. It requires that metadata be made available in a standard format, yet allows the controlled vocabularies in the system to be heterogeneous. We propose that the semantic mediator be a reusable, sharable component of a service-oriented architecture (see Figure 1). A centralized mediator provides lookup services and a registry of vocabularies, mappings, and queries that other components of the system can use.
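As a rough sketch of what such a centralized component offers, the Python below models a registry of vocabularies, a store of mappings, and a lookup service. The class and method names are hypothetical and do not reflect the interface of the MMI service; the two terms are taken from the PetDB and SamplesDB example earlier in this paper.

```python
class SemanticMediator:
    """Toy registry of vocabularies and term mappings with a lookup service."""

    def __init__(self):
        self.vocabularies = {}  # vocabulary name -> set of terms
        self.mappings = {}      # (vocab, term) -> set of mapped (vocab, term)

    def register_vocabulary(self, name, terms):
        """Register (or replace) a controlled vocabulary."""
        self.vocabularies[name] = set(terms)

    def _check(self, vocab, term):
        # Reject terms that are not part of a registered vocabulary.
        if term not in self.vocabularies.get(vocab, set()):
            raise KeyError(f"{term!r} is not registered in vocabulary {vocab!r}")

    def add_mapping(self, source, target):
        """Record that two (vocabulary, term) pairs denote the same concept."""
        for vocab, term in (source, target):
            self._check(vocab, term)
        # Store the mapping in both directions so lookups work either way.
        self.mappings.setdefault(source, set()).add(target)
        self.mappings.setdefault(target, set()).add(source)

    def lookup(self, vocab, term, target_vocab):
        """Return the terms in target_vocab directly mapped to (vocab, term)."""
        self._check(vocab, term)
        mapped = self.mappings.get((vocab, term), set())
        return sorted(t for v, t in mapped if v == target_vocab)


mediator = SemanticMediator()
mediator.register_vocabulary("PetDB", ["dive"])
mediator.register_vocabulary("SamplesDB", ["hand collected"])
mediator.add_mapping(("PetDB", "dive"), ("SamplesDB", "hand collected"))

translated = mediator.lookup("PetDB", "dive", "SamplesDB")
```

Because the vocabularies stay heterogeneous, a client holding a PetDB term can ask the mediator for the equivalent SamplesDB term instead of every data source having to adopt one shared term list.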

This mediator, which is available at http://marinemetadata.org/semor, is being used at OOSTethys [9], which is the SURA/MMI interoperability demonstration. The semantic mediation service is based on Semantic Web [3] technologies, such as OWL [1] and RDF [4], to store and retrieve controlled vocabularies represented in ontologies. It has a web user interface and a SOAP web service for programmatic access. It is currently based on a combination of SESAME [11] and JENA [6], and support for SPARQL/SOAP [13] is planned in the near future. Mapping of vocabularies is performed via the VINE [8] tool, a stand-alone application specialized in creating custom mappings. Currently, the mappings and rules are loaded into the semantic mediator service manually, and the semantic mediator regenerates all the relationships (including inferred relationships) in the knowledge base every time a new ontology is added.
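The regeneration step can be illustrated with a small Python sketch that recomputes all inferred relationships from scratch whenever new triples are added. It is an assumption-laden toy, not the SESAME/JENA-based implementation: the KnowledgeBase class, the sameAs predicate, the prefixed term names, and the single rule (sameAs is symmetric and transitive) are all hypothetical choices made for illustration.

```python
SAME_AS = "sameAs"  # hypothetical equivalence predicate


class KnowledgeBase:
    """Stores asserted triples and rebuilds inferred ones on every addition."""

    def __init__(self):
        self.asserted = set()  # (subject, predicate, object) triples as loaded
        self.inferred = set()  # triples derived by the rule below

    def add_ontology(self, triples):
        """Add triples, then regenerate every inferred relationship."""
        self.asserted.update(triples)
        self.inferred = self._infer_all()

    def _infer_all(self):
        # Apply symmetry and transitivity of SAME_AS until nothing new appears.
        closure = set(self.asserted)
        changed = True
        while changed:
            new = set()
            for s, p, o in closure:
                if p != SAME_AS:
                    continue
                new.add((o, p, s))  # symmetry
                for s2, p2, o2 in closure:
                    if p2 == SAME_AS and s2 == o and o2 != s:
                        new.add((s, p, o2))  # transitivity
            changed = not new <= closure
            closure |= new
        return closure - self.asserted


kb = KnowledgeBase()
# Hypothetical identifiers; the shared concept name is invented for this sketch.
kb.add_ontology({("petdb:dive", SAME_AS, "mmi:diverCollected")})
kb.add_ontology({("samplesdb:hand_collected", SAME_AS, "mmi:diverCollected")})

linked = ("petdb:dive", SAME_AS, "samplesdb:hand_collected") in kb.inferred
```

Recomputing the full closure on every addition is simple and keeps the inferred set consistent with the asserted set, at the cost of repeating work as the knowledge base grows.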

These semantic capabilities will prove increasingly useful in a wide range of environmental science applications, as ever more data systems are directly and indirectly linked to provide interoperable services. The MMI-developed semantic architectures and tools, along with many others presented on the MMI web site, are targeted to developing interoperable environmental data systems on a national and international scale.

References

[1] Bechhofer, S., Harmelen, F. v., Hendler, J., Horrocks, I., McGuinness, D. L., Patel-Schneider, P. and Stein, L. A., OWL Web Ontology Language Reference, W3C Recommendation, 2004, http://www.w3.org/TR/2004/REC-owl-ref-20040210/.

[2] Bermudez, L. E., ONTOMET: Ontology Metadata Framework, Drexel University, Philadelphia, Pennsylvania, 2004.

[3] Berners-Lee, T., Hendler, J. and Lassila, O., The Semantic Web, Scientific American, 184 (2001), 34-43.

[4] Brickley, D. and Guha, R. V., RDF Vocabulary Description Language 1.0: RDF Schema, W3C, 2004, http://www.w3.org/TR/2004/REC-rdf-schema-20040210/.

[5] Geospatial One-Stop, http://gos2.geodata.gov/wps/portal/gos.

[6] HP Labs Semantic Web Research, JENA, 2006, http://jena.sourceforge.net/.

[7] Kashyap, V. and Sheth, A., Information Brokering Across Heterogeneous Digital Data, Kluwer Academic Publishers, Norwell, MA, 2000.

[8] MMI, VINE, http://sf.net/projects/vine.

[9] MMI/SURA, OOSTethys, http://oostethys.org.

[10] Monterey Bay Aquarium Research Institute, Samples Database, http://www.mbari.org/samples/docs/database.htm.

[11] OpenRDF.org, Sesame, http://www.openrdf.org/.

[12] Petrological Database of the Ocean Floor, http://www.petdb.org/.

[13] World Wide Web Consortium, SPARQL Protocol for RDF, in K. Grant, ed., 2006, http://www.w3.org/TR/rdf-sparql-protocol/#query-bindings-soap.