2008 Geoinformatics Conference (11-13 June 2008)

Paper No. 4
Presentation Time: 3:40 PM-4:20 PM

DIF METADATA AND HANDLING AT THE GFZ ISDC


MENDE, Vivien, RITSCHEL, Bernd, FREIBERG, Sebastian, PALM, Hartmut and GERICKE, Lutz, Data Center, GFZ Potsdam, Telegrafenberg A3, Potsdam, 14473, Germany, vmende@gfz-potsdam.de

The Information System and Data Center (ISDC) manages more than 11 TB of geoscience data and information. At present, these data come from 11 missions[1] and comprise nearly 300 product types and approximately 16 million products, which are used by more than 1700 users. This paper gives a short overview of the metadata concept that was developed and is in use at the ISDC.

A product type of a geoscience mission or project consists of a set of products, each made up of one or more data files plus metadata. Figure 1 illustrates this relation for orbit products of the new satellite mission TerraSAR-X.
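The relation between product type and product can be written down as a small data model. The following Python sketch is only an illustration: the class names, field names, file name and metadata values are assumptions and are not taken from the ISDC implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Product:
    """One product: one or more data files plus its metadata record."""
    data_files: List[str]
    metadata: Dict[str, str]


@dataclass
class ProductType:
    """A product type of a mission, grouping all products of that kind."""
    mission: str
    name: str
    products: List[Product] = field(default_factory=list)

    def add(self, product: Product) -> None:
        self.products.append(product)


# Hypothetical example: an orbit product type of TerraSAR-X.
orbit_type = ProductType(mission="TerraSAR-X", name="orbit")
orbit_type.add(Product(
    data_files=["orbit_2008_160.dat"],                    # made-up file name
    metadata={"revision": "1", "satellite_id": "TSX-1"},  # made-up values
))
```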

Figure 1: ISDC Relations


To describe and manage the data products, we use an extension of NASA's Directory Interchange Format (DIF) standard (version 9.x). The Global Change Master Directory (GCMD) defines metadata as: “Descriptive information that characterizes a set of quantitative and/or qualitative measurements and distinguishes that set from other similar measurement sets.” (L. Olsen)

The DIF format is well suited to the management of product types. The ISDC DIF-conformant base schema for the parent DIF XML documents is defined in the file “base-dif.xsd”. It is based on the GCMD XML schema definition: http://gcmd.nasa.gov/Aboutus/xml/dif/dif_v9.7.1.xsd. To describe single products, it was necessary to extend the DIF standard and to modify the GCMD XML schema. Even though the structure of the ISDC DIF XML schema differs from that of the GCMD schema, the ISDC parent DIF XML documents remain valid against the GCMD schema.

In addition, the ISDC uses a combination of parent DIF and data child DIF documents. The metadata of a product type are described in an associated parent DIF XML file that conforms to the base-dif.xsd schema. The parent DIF XML files are validated and stored in an Oracle XML database. The metadata specific to a product (data file) are documented in data child DIF XML files, and each product type has its own schema for these files. The complex XML type <Data_Parameters> in the data child DIF XML document provides the product-specific extension of the parent DIF XML structures. It holds metadata such as the data filename, data file size, revision and satellite ID.

To implement this data model, we use the XML Schema redefine mechanism to define the complex types for <Data_Parameters>. The schemas of all data child DIF XML documents are derived by redefining the ISDC “base-dif.xsd” schema. This approach would not be possible with the GCMD XML schema because of the way it defines its XML reference structures.
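The redefine technique can be illustrated with a schema fragment. The sketch below embeds a hypothetical data child schema in a Python string and lists the elements that the redefinition adds. The type name "Data_ParametersType" and the four element names are assumptions made for this example; the actual names in base-dif.xsd are not given here. The code only inspects the fragment and does not perform schema validation.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical child schema for one product type. With xs:redefine, the
# redefined type extends itself (the base carries the same type name).
CHILD_XSD = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:redefine schemaLocation="base-dif.xsd">
    <xs:complexType name="Data_ParametersType">
      <xs:complexContent>
        <xs:extension base="Data_ParametersType">
          <xs:sequence>
            <xs:element name="Data_Filename" type="xs:string"/>
            <xs:element name="Data_Filesize" type="xs:nonNegativeInteger"/>
            <xs:element name="Revision" type="xs:string"/>
            <xs:element name="Satellite_ID" type="xs:string"/>
          </xs:sequence>
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:redefine>
</xs:schema>
"""


def added_elements(xsd_text: str) -> Dict[str, List[str]]:
    """Map each redefined complex type to the element names it adds."""
    root = ET.fromstring(xsd_text)
    result = {}
    for ctype in root.iterfind(f"{{{XS}}}redefine/{{{XS}}}complexType"):
        names = [el.get("name") for el in ctype.iter(f"{{{XS}}}element")]
        result[ctype.get("name")] = names
    return result


print(added_elements(CHILD_XSD))
```

Because every product type redefines the same base type in its own schema file, the parent structures stay in one place and only the product-specific elements differ between the child schemas.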

 

The extended metadata of the data child DIF XML documents are parsed by a Perl script. If the document structure is valid, the extended metadata are stored in product-type-specific tables in a relational database. A data child DIF XML file is linked to its parent DIF XML document through matching parts of the <Entry_ID> element in the product type document and in the related product metadata document. In addition, the content of the <Parent_DIF> element in the data child DIF XML document refers to the appropriate parent DIF document. Figure 2 shows the relation between the schemata, the XML metadata files and the storage structures.
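The parse-check-store step can be sketched as follows. This is not the Perl script used at the ISDC but a minimal Python stand-in with an in-memory SQLite database. The sample document, the element names below <Data_Parameters>, the table name "tsx_orbit" and the linking rule (the child <Entry_ID> starts with the parent identifier) are all assumptions for illustration; XML namespaces are omitted for brevity.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical data child DIF; all identifiers and values are made up.
CHILD_DIF = """\
<DIF>
  <Entry_ID>TSX-ORBIT:2008-160-0001</Entry_ID>
  <Parent_DIF>TSX-ORBIT</Parent_DIF>
  <Data_Parameters>
    <Data_Filename>orbit_2008_160.dat</Data_Filename>
    <Data_Filesize>20480</Data_Filesize>
    <Revision>1</Revision>
    <Satellite_ID>TSX-1</Satellite_ID>
  </Data_Parameters>
</DIF>
"""

REQUIRED = ("Data_Filename", "Data_Filesize", "Revision", "Satellite_ID")


def parse_child(xml_text):
    """Return the child's metadata as a dict, or raise ValueError."""
    root = ET.fromstring(xml_text)
    entry_id = root.findtext("Entry_ID")
    parent_id = root.findtext("Parent_DIF")
    params = root.find("Data_Parameters")
    if not entry_id or not parent_id or params is None:
        raise ValueError("missing Entry_ID, Parent_DIF or Data_Parameters")
    # Assumed linking rule: the child Entry_ID starts with the parent's ID.
    if not entry_id.startswith(parent_id):
        raise ValueError("Entry_ID does not match Parent_DIF")
    record = {"Entry_ID": entry_id, "Parent_DIF": parent_id}
    for name in REQUIRED:
        value = params.findtext(name)
        if value is None:
            raise ValueError(f"missing {name}")
        record[name] = value.strip()
    return record


def store(conn, record):
    """Insert one product record into a product-type-specific table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tsx_orbit ("
        "entry_id TEXT PRIMARY KEY, parent_dif TEXT, filename TEXT, "
        "filesize INTEGER, revision TEXT, satellite_id TEXT)"
    )
    conn.execute(
        "INSERT INTO tsx_orbit VALUES (?, ?, ?, ?, ?, ?)",
        (record["Entry_ID"], record["Parent_DIF"], record["Data_Filename"],
         int(record["Data_Filesize"]), record["Revision"],
         record["Satellite_ID"]),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
store(conn, parse_child(CHILD_DIF))
```

A document that lacks a required element or whose identifiers do not match is rejected before anything is written, which mirrors the rule that only structurally valid metadata reach the database.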

The parent DIF XML structures make it possible to search the content of the product type documents thematically and to provide interoperability with other catalogue systems. The XML DIF files can now be transformed into ISO 19115 standard documents so that OGC-compliant web services such as the “deegree CSW 2.0” can be used. Furthermore, the use of international standards allows efficient harmonisation with other catalogue systems. The XML structure makes it easy to extend the DIF standard in the future. With the parent-child DIF concept, only a small amount of mandatory metadata has to be stored in both the parent and the data child DIF XML documents.
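How a DIF document maps onto an ISO 19115 record can be sketched for a few fields. The abstract does not describe the actual transformation (the reference list points towards XSLT), so the Python code below is only a stand-in. It maps <Entry_ID>, <Entry_Title> and <Summary> of a made-up parent DIF onto the ISO 19139 XML encoding of ISO 19115. The choice of these three fields and their targets is an assumption, and the result is not a complete, schema-valid ISO record.

```python
import xml.etree.ElementTree as ET

GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)

# Hypothetical parent DIF with made-up content; namespaces omitted.
PARENT_DIF = """\
<DIF>
  <Entry_ID>TSX-ORBIT</Entry_ID>
  <Entry_Title>TerraSAR-X orbit products</Entry_Title>
  <Summary>Example summary text.</Summary>
</DIF>
"""


def char_string(parent, tag, text):
    """Append a gmd:<tag> element wrapping a gco:CharacterString."""
    wrapper = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(wrapper, f"{{{GCO}}}CharacterString").text = text
    return wrapper


def dif_to_iso(dif_text):
    """Build a partial gmd:MD_Metadata element from three DIF fields."""
    dif = ET.fromstring(dif_text)
    md = ET.Element(f"{{{GMD}}}MD_Metadata")
    char_string(md, "fileIdentifier", dif.findtext("Entry_ID"))
    ident = ET.SubElement(md, f"{{{GMD}}}identificationInfo")
    data_id = ET.SubElement(ident, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_id, f"{{{GMD}}}citation")
    ci = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    char_string(ci, "title", dif.findtext("Entry_Title"))
    char_string(data_id, "abstract", dif.findtext("Summary"))
    return md


iso = dif_to_iso(PARENT_DIF)
print(ET.tostring(iso, encoding="unicode"))
```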


Figure 2: Schema, DIF XML and Storage


References

[1] Braune, S., Czegka, W., Klump, J., Palm, H., Ritschel, B., Lochter, F. A. (2003): Anwendungen ISO-19115-konformer Metadaten in Katalogsystemen aus dem Bereich umwelt- und geowissenschaftlicher Geofachdaten. - Zeitschrift für Geologische Wissenschaften, 31,1, 37-44

[2] Ritschel, B., Bruhns, C., Kopischke, R., Mende, V., Palm, H., Freiberg, S., Gericke, L. (2007): The ISDC concept for long-term sustainability of geoscience data and information. - PV 2007 Conference, Symposium Proceedings, Oberpfaffenhofen

[3] Voges, U., Senkler, K. (2005): OpenGIS® Catalogue Services Specification 2.0 - ISO19115/ISO19119 Application Profile for CSW 2.0. - OpenGIS Consortium, Wayland, Massachusetts

[4] Burgess, Ph., Palm, H., Ritschel, B., Bruhns, Ch., Freiberg, S., Gericke, L., Kase, St., Kopischke, R., Loos, St., Lowisch, St. (2006): Implementing modern data dissemination concepts in the ISDC Portal. - Status Seminar “Erkundung des Systems Erde aus dem Weltraum”, Bonn

[5] Ritschel, B., Palm, H., Mende, V., Bruhns, Ch., Freiberg, S., Gericke, L., Kopischke, R., Pfeiffer, S. (2008): GFZ ISDC - Portal to geoscientific data, information and knowledge. - GeoInformatics, Potsdam

[6] Ritschel, B., Mende, V., Pfeiffer, S., Freiberg, S. (2008): Semantic web technologies for value added services at the GFZ ISDC. - GeoInformatics, Potsdam

[7] Deegree Project: http://www.deegree.org

[8] Global Change Master Directory: http://gcmd.nasa.gov/

[9] XML & XSLT: www.w3.org/TR/xslt, http://selfhtml.org/


[1] CHAMP, GRACE, TerraSAR-X, GASP, GNSS, GGP, GPS-PDR and others