Geoinformatics 2007 Conference (17–18 May 2007)

Paper No. 17
Presentation Time: 2:30 PM-4:30 PM

ONTOLOGIC INTEGRATION OF GEOSCIENCE DATA ON THE SEMANTIC WEB


MALIK, Zaki, Computer Sciences, Virginia Tech, Blacksburg, VA 24061, REZGUI, Abdelmounaam, Computer Science, Virginia Tech, 205 Knowledge Works II-CRC, Blacksburg, VA 24060 and SINHA, A. Krishna, Geosciences, Virginia Tech, Blacksburg, VA 24061, zaki@vt.edu

Introduction

The World Wide Web was originally created for data sharing among scientists. Over the years, the Web has evolved from a mere repository of data into a vibrant environment in which users make their data and applications Web accessible. As a result, a wealth of information and applications is now available, and managing their quantity and quality has become a primary issue. For instance, research initiatives by geoscientists have produced large amounts of data over the years, yet the ability to find, access, and properly interpret these large data repositories has been very limited. Two main reasons for this lack of data sharing are the adoption of personal acronyms, notations, conventions, units, etc. by each research group when producing data, and current Web search methods whose results can be understood only by humans or custom-developed applications (Medjahed et al 2003). Moreover, machines currently merely display the data; they cannot process it any further because the semantics of the data and applications are missing. This makes it difficult for other scientists to correctly understand the data, and makes automatic interpretation and integration of data infeasible. We suggest that enabling the sharing, understanding, and integration of geoscience data and applications on a global scale requires ontology-based registration and discovery.

  Ontologies and the Semantic Web

The emerging Semantic Web is defined as an extension of the existing Web in which information is given a well-defined meaning (Berners-Lee et al 2001). The ultimate goal of the envisioned Semantic Web is to transform the Web into a medium through which data and applications can be automatically understood and processed. The concept of Web services (and other related technologies) is seen as a key enabler of the Semantic Web (Alonso et al 2003, McIlraith et al 2001). A Web service is a set of related functionalities that can be programmatically accessed through the Web. The convergence of business and government activities in developing Web service related technologies (such as SOAP, UDDI, and WSDL) is a sign that Web services will be widely adopted in the near future (Curbera et al 2002). Another key player in the envisioned Semantic Web is the concept of ontology. An ontology may be defined as a set of knowledge terms, including the vocabulary, the semantic interconnections, and some simple rules of inference and logic, for some particular topic. The Semantic Web is expected to offer data, organized through ontologies, and applications, exposed as Web services, enabling their understanding, sharing, and invocation by automated tools.
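The definition of an ontology given above (a vocabulary, semantic interconnections, and simple inference rules) can be illustrated with a minimal sketch. The terms and the subclass links below are hypothetical and are not taken from the planetary ontology; the only inference rule shown is that "subclass of" is reflexive and transitive.

```python
# Illustrative only: these terms and links are hypothetical and are not
# drawn from any published ontology.
SUBCLASS_OF = {
    "A-Type Granite": "Granite",
    "Granite": "Igneous Rock",
    "Igneous Rock": "Rock",
}


def ancestors(term, subclass_of):
    """Return every broader concept reachable from `term`, nearest first."""
    found = []
    while term in subclass_of:
        term = subclass_of[term]
        found.append(term)
    return found


def is_a(term, concept, subclass_of):
    """Inference rule: subclass-of is reflexive and transitive."""
    return term == concept or concept in ancestors(term, subclass_of)


# A tool can now infer a fact that was never stated directly.
print(is_a("A-Type Granite", "Igneous Rock", SUBCLASS_OF))
```

Only direct links are stored, so a program has to derive the broader relationships itself. That derivation is what lets automated tools interpret data registered against a specific term when a query uses a more general one.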

  Data Ontologies

Recognizing the potential of the Semantic Web, we have defined a “planetary ontology” (through many workshops and scientific meetings) to provide the ontologic framework for Earth science data at many levels of semantic granularity. The planetary ontology includes concepts, concept taxonomies, relationships between concepts, and properties that describe concepts, as an initial step towards the development of ontologies for Earth science. Figure 1 shows the high-level representation of the planetary ontology. High-level packages such as Planetary Material can be used to represent the nature (physical, chemical) and state of substances and their properties. The figure also emphasizes the use of imported and inherited properties from additional packages, e.g. Physical Properties, Location, and Planetary Structure, to fully define the concept of Planetary Material. Ontologies will support the Semantic Web through (1) ease of registration to facilitate discovery, and (2) the ability to query across multiple and diverse databases through interconnected disciplinary ontologies.

Figure 1.  The planetary ontology

We have developed a prototype system as proof of concept, which uses the planetary ontology in discovering solutions to complex geoscience questions. Some of the geoscience tools that are required for integration within the Web environment are registered as Web services. For example, “data filtering tools” that distinguish between geologic bodies based on Magma Class (e.g. A-Type, S-Type, M-Type, or I-Type) or metamorphic facies assemblages have been wrapped and registered as Web services. This enables users to use the tools without detailed knowledge of the operating system, the development language environment, or the component model used to create them. Because only the input and output parameters need to be defined for Web service-based applications, this approach encourages reuse and reduces development time, effort, and cost.
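As a rough illustration of wrapping a tool behind declared input and output parameters, the sketch below uses plain Python in place of an actual Web service stack (no SOAP or WSDL is involved). The `ServiceWrapper` class, the `filter_by_magma_class` function, and the sample records are all hypothetical and do not describe the prototype's implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ServiceWrapper:
    """Hypothetical wrapper: callers see only declared inputs and output."""
    name: str
    inputs: dict      # parameter name -> expected Python type
    output: type
    tool: Callable

    def invoke(self, **params):
        # Reject calls that do not match the declared interface.
        if set(params) != set(self.inputs):
            raise TypeError(f"{self.name} expects {sorted(self.inputs)}")
        for key, expected in self.inputs.items():
            if not isinstance(params[key], expected):
                raise TypeError(f"{key} must be {expected.__name__}")
        result = self.tool(**params)
        if not isinstance(result, self.output):
            raise TypeError(f"{self.name} returned an unexpected type")
        return result


def filter_by_magma_class(bodies, magma_class):
    """The wrapped tool: keep bodies whose Magma Class matches."""
    return [b for b in bodies if b["magma_class"] == magma_class]


service = ServiceWrapper(
    name="MagmaClassFilter",
    inputs={"bodies": list, "magma_class": str},
    output=list,
    tool=filter_by_magma_class,
)

# Made-up records, used only to exercise the wrapper.
bodies = [
    {"name": "Body 1", "magma_class": "A-Type"},
    {"name": "Body 2", "magma_class": "S-Type"},
]
selected = service.invoke(bodies=bodies, magma_class="A-Type")
print([b["name"] for b in selected])
```

The caller depends only on the parameter names and types, so the tool behind the wrapper could be replaced or reimplemented without changing client code.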

  Service Ontologies

As the Semantic Web matures and more geoscientists adopt this paradigm, it is expected that a number of geoscience tools and services will be made accessible as Web services. As with data management practices, this will require that Web services also be ontologically registered. Annotating Web services with semantics would ensure that appropriate tools (in the form of Web services) are selected efficiently and automatically for answering geoscience queries. Domain experts would provide formal specifications of geoscience concepts, enabling automated Web service usage. Moreover, since the Semantic Web is geared towards interactions involving minimal human intervention, a service ontology would enable direct service-to-service communication and facilitate information transfer.

Figure 2. Semantic Web enabled geoscience querying engine

To fully understand the need for a service ontology, consider the geoscience query: “Find the chemical composition of a liquid derived by 30% partial melting (PM) based on the average abundances of Rare Earth Elements (REE) of A-Type plutons in Virginia.” This query clearly requires access to a number of data sets and geoscience tools. Figure 2 provides a high-level overview of the four steps involved in answering the query: (1) finding the A-Type bodies in Virginia; (2) computing the averages; (3) using the REE definitions contained in the element ontology and exporting the data to a PM tool for computation; and (4) displaying the results. The prototype query engine that we have developed (DIA, the Discovery, Integration and Analysis engine) is able to address the query.
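The four steps can be sketched in a few lines of Python. This is not the DIA engine. The sample records, concentrations, and bulk partition coefficients are made-up placeholders. The choice of the standard batch (equilibrium) melting equation, C_L = C_0 / (D + F(1 - D)), is also an assumption, since the abstract does not say which model the PM tool applies. The `REE` set lists the lanthanides and stands in for the element ontology.

```python
# Stand-in for the element ontology: the lanthanide symbols.
REE = {"La", "Ce", "Pr", "Nd", "Pm", "Sm", "Eu", "Gd",
       "Tb", "Dy", "Ho", "Er", "Tm", "Yb", "Lu"}

# Hypothetical records; the concentrations (ppm) are placeholders.
SAMPLES = [
    {"body": "Pluton X", "state": "VA", "magma_class": "A-Type",
     "ppm": {"La": 40.0, "Ce": 80.0, "Rb": 150.0}},
    {"body": "Pluton Y", "state": "VA", "magma_class": "A-Type",
     "ppm": {"La": 60.0, "Ce": 120.0, "Rb": 170.0}},
    {"body": "Pluton Z", "state": "VA", "magma_class": "S-Type",
     "ppm": {"La": 20.0, "Ce": 50.0, "Rb": 200.0}},
]

# Hypothetical bulk partition coefficients (D), not measured values.
BULK_D = {"La": 0.5, "Ce": 0.5}


def find_bodies(samples, state, magma_class):
    """Step 1: select bodies by location and Magma Class."""
    return [s for s in samples
            if s["state"] == state and s["magma_class"] == magma_class]


def average_abundances(samples):
    """Step 2: average each element over the selected bodies."""
    totals, counts = {}, {}
    for sample in samples:
        for element, ppm in sample["ppm"].items():
            totals[element] = totals.get(element, 0.0) + ppm
            counts[element] = counts.get(element, 0) + 1
    return {el: totals[el] / counts[el] for el in totals}


def batch_melt(source, bulk_d, melt_fraction):
    """Step 3 (assumed model): C_L = C_0 / (D + F * (1 - D))."""
    return {el: c0 / (bulk_d[el] + melt_fraction * (1.0 - bulk_d[el]))
            for el, c0 in source.items()}


bodies = find_bodies(SAMPLES, "VA", "A-Type")
averages = average_abundances(bodies)
# Keep only the elements the element ontology classifies as REE.
ree_source = {el: c for el, c in averages.items() if el in REE}
melt = batch_melt(ree_source, BULK_D, melt_fraction=0.30)

# Step 4: display the results.
for element in sorted(melt):
    print(f"{element}: {melt[element]:.1f} ppm")
```

Each function stands in for a separate data source or tool, which is why the query depends on both data ontologies (to find and classify the records) and service ontologies (to find the filtering and melting tools).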

The discovery of data that pertain to A-Type bodies (a class of igneous rocks) and that contain elemental data classified as REE requires that the geoscience data be registered to ontologies. The data ontologies now available to geoscientists (Sinha et al 2006) allow access to multiple disciplines to fulfill this requirement (see Figure 2). Another major requirement for answering the query lies in the discovery of appropriate tools to carry out data filtering through both mathematical and domain-specific computations. It is expected that geoscientists will develop similar Web services (tools), using their own acronyms, and advertise them across multiple service registries (UDDI). Discovery of the required tools is only possible if the available tools and services have defined and precise semantics associated with them. Thus, similar to data ontologies, “service ontologies” will also be required. Service ontologies are used for two purposes: to register services and to discover services. An ontology-based service description provides metadata about the service and its provider, such as the service's categories and sub-categories, the service's address, its parameters and their types, the service's output, the service's cost, etc. Service registries expose ontology-based search interfaces that service clients use to discover services appropriate for a given task. Once the client selects a given service, the registry provides the service description that the client then uses to actually invoke the service. Therefore, a service ontology will do for Web services what a data ontology has done for geoscience data.
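The register and discover roles of a service ontology can be sketched as follows. This is a minimal in-memory illustration, not a UDDI client. The category taxonomy, the `ServiceDescription` fields, the service names, and the `example.org` addresses are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical service-category taxonomy: category -> parent category.
CATEGORY_PARENT = {
    "Magma Class Filter": "Data Filtering",
    "Metamorphic Facies Filter": "Data Filtering",
    "Partial Melting": "Geochemical Modeling",
}


@dataclass
class ServiceDescription:
    """Metadata a provider supplies when registering a service."""
    name: str
    provider: str
    category: str
    address: str
    parameters: dict   # parameter name -> type name
    output: str
    cost: float = 0.0


class Registry:
    def __init__(self, category_parent):
        self.category_parent = category_parent
        self.services = []

    def register(self, description):
        # Only categories defined in the ontology may be used.
        known = set(self.category_parent) | set(self.category_parent.values())
        if description.category not in known:
            raise ValueError(f"unknown category: {description.category}")
        self.services.append(description)

    def _falls_under(self, category, wanted):
        # Walk up the taxonomy until the wanted category or the root.
        while category is not None:
            if category == wanted:
                return True
            category = self.category_parent.get(category)
        return False

    def discover(self, wanted):
        """Return services registered under `wanted` or a sub-category."""
        return [s for s in self.services
                if self._falls_under(s.category, wanted)]


registry = Registry(CATEGORY_PARENT)
registry.register(ServiceDescription(
    "MagmaFilter", "Group A", "Magma Class Filter",
    "http://example.org/magma", {"magma_class": "string"}, "BodyList"))
registry.register(ServiceDescription(
    "FaciesFilter", "Group B", "Metamorphic Facies Filter",
    "http://example.org/facies", {"facies": "string"}, "BodyList"))
registry.register(ServiceDescription(
    "BatchMelt", "Group C", "Partial Melting",
    "http://example.org/melt", {"melt_fraction": "float"}, "Composition"))

for service in registry.discover("Data Filtering"):
    print(service.name, service.address)
```

Because discovery matches on ontology categories and not on service names, tools advertised under different local acronyms can still be found by a client that asks for the general task.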

  Acknowledgement

This research is supported by the National Science Foundation award EAR 0225588 to A. K. Sinha.

  References Cited

Medjahed, B., Bouguettaya, A., and Elmagarmid, A., 2003, Composing Web Services on the Semantic Web, VLDB Journal, 12(4): 333-351.

Berners-Lee, T., Hendler, J., and Lassila, O., 2001, The Semantic Web, Scientific American, 284(5): 34-43.

Alonso, G., Casati, F., Kuno, H., and Machiraju, V., 2003, Web Services: Concepts, Architecture, and Applications, Springer Verlag (ISBN: 3540440089).

McIlraith, S., Son, T. C., and Zeng, H., 2001, Semantic Web Services, IEEE Intelligent Systems, 16(2): 46-53.

Curbera, F., Duftler, M., Khalaf, R., and Nagy, W., 2002, Unraveling the Web Services Web, IEEE Internet Computing, 6(2): 86-93.

Sinha, A. K., Malik, Z., Rezgui, A., and Dalton, A., 2006, Developing the Ontologic Framework and Tools for the Discovery and Integration of Earth Science Data-Cyberinfrastructure Research at Virginia Tech, Annual Report.