2003 Seattle Annual Meeting (November 2–5, 2003)

Paper No. 6
Presentation Time: 9:30 AM

THE ROLE OF XML IN MEDIATED DATA INTEGRATION SYSTEMS WITH EXAMPLES FROM GEOLOGICAL (MAP) DATA INTEROPERABILITY


BRODARIC, Boyan (1), LUDAESCHER, Bertram (2) and LIN, Kai (2), (1) Geol Survey of Canada, Ottawa, ON K1A 0E9, (2) San Diego Supercomputer Center, Univ of California, San Diego, 9500 Gilman Drive, Mail Code 0505, La Jolla, CA 92093-0505, brodaric@NRCan.gc.ca

Growth in the volume and availability of geoscientific data means that analytical tasks must increasingly be preceded by a data integration procedure that reconciles the heterogeneous structure and content of multiple input data sources. Such reconciliation is classically described in four dimensions: systems (e.g. operating systems), syntax (e.g. data format), schematics (e.g. data structure) and semantics (e.g. data content). Though widely considered an important aid to data integration, XML mainly addresses the syntax dimension, by providing a common encoding for data structure and content. In mediated systems, XML therefore typically serves as the vehicle for encoding and transporting data from local data structures and content to a standard 'global' structure and content (i.e. a global schema). Queries and other database operations can then be posed against the global schema through a central mediator, with source-specific wrappers translating local data to the global schema. We illustrate the role of XML in mediated systems via a geological map interoperability prototype developed by the GEON project. The prototype enables querying for rock types and geologic ages across several heterogeneous state-level geologic maps in the Rockies and western US. It uses a global schema consisting of a central data structure and classification schemes for rock types and geologic ages, and XML encoding to carry data between the local databases, wrappers and the mediator. Central to this work is the notion that schematic and semantic standards, or ontologies, must precede syntactic encoding and should take priority.
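
The mediator and wrapper arrangement described above can be illustrated with a minimal Python sketch. It is not the GEON prototype: the two sources, their field names, the element names (`MapUnit`, `rockType`, `geologicAge`) and the small vocabularies are all hypothetical, chosen only to show wrappers mapping local structure and terms onto a global schema while XML carries the result to the mediator.

```python
import xml.etree.ElementTree as ET

# Hypothetical local sources with different field names and vocabularies.
SOURCE_A = [
    {"unit_id": "A1", "LITH": "granitic rock", "AGE": "K"},
    {"unit_id": "A2", "LITH": "sandstone", "AGE": "J"},
]
SOURCE_B = [
    {"id": "B7", "rock": "Granite", "period": "Cretaceous"},
    {"id": "B9", "rock": "Basalt", "period": "Tertiary"},
]

# Hypothetical global classification schemes (the semantic part of the
# global schema): local terms on the left, shared terms on the right.
GLOBAL_ROCK = {
    "granitic rock": "granite",
    "granite": "granite",
    "sandstone": "sandstone",
    "basalt": "basalt",
}
GLOBAL_AGE = {
    "k": "Cretaceous",
    "cretaceous": "Cretaceous",
    "j": "Jurassic",
    "tertiary": "Tertiary",
}


def to_global_xml(source, unit_id, rock, age):
    """Encode one map unit in the assumed global structure as XML."""
    unit = ET.Element("MapUnit", {"source": source, "id": unit_id})
    ET.SubElement(unit, "rockType").text = GLOBAL_ROCK[rock.lower()]
    ET.SubElement(unit, "geologicAge").text = GLOBAL_AGE[age.lower()]
    return unit


def wrapper_a():
    """Wrapper for source A: maps its local fields to the global schema."""
    return [to_global_xml("A", r["unit_id"], r["LITH"], r["AGE"]) for r in SOURCE_A]


def wrapper_b():
    """Wrapper for source B: maps its local fields to the global schema."""
    return [to_global_xml("B", r["id"], r["rock"], r["period"]) for r in SOURCE_B]


def mediator_query(rock_type=None, age=None):
    """Pose a query against the global schema across all wrapped sources."""
    result = ET.Element("MapUnits")
    for wrapper in (wrapper_a, wrapper_b):
        for unit in wrapper():
            if rock_type and unit.findtext("rockType") != rock_type:
                continue
            if age and unit.findtext("geologicAge") != age:
                continue
            result.append(unit)
    return result


hits = mediator_query(rock_type="granite", age="Cretaceous")
hit_ids = [(u.get("source"), u.get("id")) for u in hits]
print(ET.tostring(hits, encoding="unicode"))
```

In this sketch the XML layer does nothing beyond carrying already-reconciled records. The query succeeds across both sources only because the term mappings and the shared structure were agreed first, which is the point the abstract makes about schematic and semantic standards preceding syntactic encoding.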