2008 Geoinformatics Conference (11-13 June 2008)

Paper No. 5
Presentation Time: 11:40 AM

TOWARDS AN OPENEARTH FRAMEWORK (OEF)


BARU, Chaitan1, KELLER, G. Randy2, NADEAU, David R.1 and MORELAND, John L.1, (1)San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, (2)School of Geology and Geophysics, University of Oklahoma, 100 E. Boyd, Norman, OK 73019, baru@sdsc.edu

Integrating data and toolkits

In 2002, the Geosciences Network (GEON) brought together 16 institutions to develop an infrastructure for managing distributed collections of large, heterogeneous, multidisciplinary datasets. In the next generation of that work, this infrastructure is expanding to include open source software for integrating, analyzing, and visualizing these data sets. We call this software suite, built upon a community-driven set of open standards for data models and services, the OpenEarth Framework (OEF). A principal focus of this work is integration that spans:

  • Data types, from LiDAR to satellite imagery, DEMs, borehole samples, velocity models from seismic tomography, gravity measurements, and simulation results.
  • Data storage schemes, from file systems to databases and archival systems, such as the Storage Resource Broker (SRB).
  • Data delivery methods, from local files to database queries and web services, such as WMS, WFS, and services for new data types, like large tomographic volumes.
  • Data formats, from Shapefiles to NetCDF, GeoTIFF, and other formal and de facto standards.
  • Data models, from 2D and 3D geometry to semantically richer models of features and relationships between those features.
  • Data coordinate spaces and dimensionality, including 2D and 3D spatial representations and time scales that may span hundreds of millions of years.
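The kind of integration this list implies can be illustrated with a small sketch. The descriptor below is hypothetical (the names and fields are not from the OEF itself); it simply shows how a registered data set might carry enough metadata to bridge differences in type, format, storage, delivery, coordinates, and time scale.

```python
from dataclasses import dataclass

# Hypothetical sketch of a unified dataset descriptor: each registered
# data set records enough metadata to bridge the axes of heterogeneity
# listed above (type, format, storage, delivery, coordinates, time).
@dataclass
class DatasetDescriptor:
    name: str
    data_type: str         # e.g. "LiDAR", "DEM", "tomographic volume"
    data_format: str       # e.g. "GeoTIFF", "NetCDF", "Shapefile"
    storage: str           # e.g. "file", "database", "SRB"
    delivery: str          # e.g. "local", "WMS", "WFS"
    crs: str               # coordinate reference system, e.g. "EPSG:4326"
    time_span_ma: tuple    # (start, end) in millions of years before present

    def spans_deep_time(self, threshold_ma: float = 1.0) -> bool:
        """True if the temporal extent exceeds the given threshold (Ma)."""
        start, end = self.time_span_ma
        return abs(end - start) >= threshold_ma

dem = DatasetDescriptor("Regional DEM", "DEM", "GeoTIFF", "file", "WMS",
                        "EPSG:4326", (0.0, 0.0))
tomo = DatasetDescriptor("Regional tomography", "tomographic volume",
                         "NetCDF", "SRB", "volume-service",
                         "EPSG:4326", (0.0, 200.0))
print(dem.spans_deep_time())   # False: a present-day snapshot
print(tomo.spans_deep_time())  # True: spans hundreds of millions of years
```

A common descriptor of this kind lets higher layers treat a LiDAR point cloud in a file and a tomographic volume in an archive uniformly, deferring format- and storage-specific handling to lower layers.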

There are several good toolkits that address portions of this space, including GeoTools, the GeoTIFF libraries, Shapefile parsers, and many web services toolkits. The OEF seeks to integrate these within a common framework. By spanning multiple toolkits, the OEF can grow as the sum of these toolkits grows. The OEF can also remain toolkit agnostic and expand to support additional toolkits as they emerge.
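One common way to achieve this kind of toolkit-agnostic design is a registry that maps data formats to pluggable readers, so new toolkits can be added without changing the core. The sketch below is an assumption about such a design, not OEF code; the class and function names are illustrative.

```python
# Hypothetical sketch of a toolkit-agnostic reader registry: the framework
# maps format names to reader callables, so support for a new toolkit is
# added by registering one more entry rather than modifying the core.
class ReaderRegistry:
    def __init__(self):
        self._readers = {}

    def register(self, fmt, reader):
        """Associate a format name with a callable that reads it."""
        self._readers[fmt.lower()] = reader

    def read(self, fmt, source):
        """Dispatch to the registered reader, case-insensitively."""
        try:
            return self._readers[fmt.lower()](source)
        except KeyError:
            raise ValueError(f"no reader registered for format {fmt!r}")

registry = ReaderRegistry()
# In practice these lambdas would wrap toolkit calls (GeoTools, a
# GeoTIFF library, a Shapefile parser); here they are stand-ins.
registry.register("GeoTIFF", lambda src: f"raster from {src}")
registry.register("Shapefile", lambda src: f"features from {src}")
print(registry.read("geotiff", "dem.tif"))  # raster from dem.tif
```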

Visualizing data

The OEF is also addressing gaps between these toolkits. Chief among them is richer support for interactive 3D visualization.

While many 3D visualization tools exist, they often have built-in assumptions that limit their use within geoscience contexts. Google Earth, for instance, is a fascinating tool for exploring surface features, but it has no support for diving beneath the surface. Tools tuned for timelines on the scale of earthquake cycles may be insufficient to handle deep time for exploring the evolution of the lithosphere and linkages between such phenomena as orogeny and climate. Other tools may do very well when data sets are small enough to fit entirely in memory, but they stumble on large, high-resolution data spanning continents and millions of years.

With the OEF's focus on integrating data spanning the geosciences, it is important to develop an open software architecture and corresponding software that can properly manipulate and visualize the integrated data. The OEF's software stack to do this extends from deep within web-available data archives outwards to interactive visualization tools running on the user's desktop or laptop computer.

At the deepest level, Dataset Access Services manage and deliver stored data and metadata. These services hide storage details, such as the storage medium, Internet location, administrative domain, access authentication, data replication, and storage optimization. Stored metadata characterizes registered data, including its spatial and temporal extent, resolution, and history. Of particular importance is a data derivation tree that links together original data and data derived through format conversion, subsetting, resampling, or other analysis.
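The derivation tree described above can be sketched as a simple linked structure. This is a minimal illustration, not the OEF's actual data model; each derived data set records its parent and the operation that produced it, so provenance can be traced back to the original archive data.

```python
# Hypothetical sketch of a data derivation tree: derived data sets link
# back to their parents along with the operation (format conversion,
# subsetting, resampling, ...) that produced them.
class DerivationNode:
    def __init__(self, name, parent=None, operation=None):
        self.name = name
        self.parent = parent        # None for original archive data
        self.operation = operation  # e.g. "subset", "resample"
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def lineage(self):
        """Return (operation, name) pairs from the root to this node."""
        node, chain = self, []
        while node is not None:
            chain.append((node.operation, node.name))
            node = node.parent
        return list(reversed(chain))

raw = DerivationNode("lidar_survey_raw")
subset = DerivationNode("lidar_region_A", parent=raw, operation="subset")
resampled = DerivationNode("lidar_region_A_10m", parent=subset,
                           operation="resample")
print(resampled.lineage())
# [(None, 'lidar_survey_raw'), ('subset', 'lidar_region_A'),
#  ('resample', 'lidar_region_A_10m')]
```

Walking the tree upward answers "where did this grid come from?"; walking it downward finds every product that must be invalidated if the original data are revised.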

At the next higher level, Data Modeling Services provide on-demand and preprocessing operations on archive data. These operations may automatically subset archived data to satisfy a web services query. Services may cache the extracted data for future re-use, pre-extract data of expected interest, and otherwise manage the data to enable fast, fluid data delivery. It is at this level that the OEF implements web services for the delivery of images, features, or volumetric data.
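The subset-and-cache behavior at this level can be sketched in a few lines. The class below is an illustrative assumption (the real services operate on archived geoscience data behind web service interfaces); here a bounding-box subset of a small in-memory grid is extracted once and served from a cache thereafter.

```python
# Hypothetical sketch of an on-demand subsetting service with a result
# cache: a query extracts a rectangular subset of a gridded data set,
# and repeated identical queries are served from the cache.
class SubsetService:
    def __init__(self, grid):
        self.grid = grid    # 2D list of values (row-major)
        self.cache = {}     # maps query keys to extracted subsets

    def subset(self, row0, row1, col0, col1):
        key = (row0, row1, col0, col1)
        if key not in self.cache:  # extract only on a cache miss
            self.cache[key] = [row[col0:col1]
                               for row in self.grid[row0:row1]]
        return self.cache[key]

# A 10x10 grid where cell (r, c) holds the value r*10 + c.
grid = [[r * 10 + c for c in range(10)] for r in range(10)]
svc = SubsetService(grid)
tile = svc.subset(2, 4, 3, 6)
print(tile)            # [[23, 24, 25], [33, 34, 35]]
print(len(svc.cache))  # 1 -- the extract is cached for re-use
```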

Above this level are Data Interaction Services designed to support rapid visualization of integrated data sets. For instance, services here create multi-resolution models that enable visualization tools to smoothly zoom into data by swapping low-resolution data for higher resolution data on the fly. These services also subdivide data to better support progressive changes to the display as the user pans through large data or reveals additional detail of interest. Derived data may be cached, staged at closer network locations, or downloaded in the background to the user's computer. While 3D rendering and interaction remains on the user's own computer, these services help reduce delays as the user explores deep data archives.
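The resolution-swapping idea above reduces, at its core, to choosing a level of detail that matches the current view. The function below is a hedged sketch of that selection step, with invented level names and a simplified cell-size criterion standing in for a real screen-space error metric.

```python
# Hypothetical sketch of multi-resolution level-of-detail selection:
# given the resolution the current view requires, pick the coarsest
# pre-built level whose cell size is still fine enough, so zooming in
# swaps low-resolution data for higher-resolution data on the fly.
def choose_lod(levels, view_resolution_m):
    """levels: list of (name, cell_size_m) pairs, coarsest first.
    Return the first level fine enough for the view, else the finest."""
    for name, cell_size in levels:
        if cell_size <= view_resolution_m:
            return name
    return levels[-1][0]  # fall back to the finest available level

levels = [("overview_1km", 1000.0),
          ("regional_100m", 100.0),
          ("detail_10m", 10.0)]
print(choose_lod(levels, 2000.0))  # overview_1km  (zoomed far out)
print(choose_lod(levels, 500.0))   # regional_100m (mid zoom)
print(choose_lod(levels, 5.0))     # detail_10m    (zoomed in; finest wins)
```

Pairing this selection with background downloads of the next-finer level is what lets the display change progressively rather than stalling while full-resolution data crosses the network.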

Finally, the OEF's Visualization Tools run on the user's computer and use 3D graphics acceleration hardware to display points, lines, polygons, volumetric data, animations, isosurfaces, cutting planes, and so forth. Continuing the inclusive style of the OEF, the open architecture supports multiple visualization tools authored throughout the community. GEON's Integrated Data Viewer, for example, provides an existing mature platform that is being extended to use OEF's layered data services. Additional visualization tools are being developed to drive higher-level OEF data services and explore new visualization techniques and user interface styles for interacting with integrated data sets.

OEF visualization tools will provide user interfaces that support spatial and temporal queries sent off to one or more data archives. Query results will be presented within integrated views that combine, for instance, surface elevations derived from LiDAR, surface colors from satellite imagery, borehole paths, cutting planes through seismic tomography data, and other subsurface structure derived from analysis and simulations.

While the various OEF data services above will certainly be used, they are optional. Not all data of interest is in a published data archive. OEF visualization tools must also be able to visualize new, unpublished data stored locally. These may be loaded alongside published data from academic archives, enabling comparisons between new and established views of subsurface structure.

Developing open source

Admirable visualization tools already exist in specific, highly commercial domains. Tuned to the needs and budgets of those domains, these tools can produce beautiful imagery. However, they may be less suitable for supporting current research, providing a flexible test bed for new data models and visualization ideas, or integrating with academic data archives. Instead, open source software is needed that can provide the necessary flexibility for academic research. Open source benefits also include community participation and contribution, and the creation of a robust developer and user community. In the end, this ensures both the flexibility and longevity of the software base, thereby creating a lasting community asset.

Project Plans

We plan to begin with a sample set of heterogeneous data sets for a given geographic region of interest where a variety of data is currently available. Using this data as a test case, we will develop software to enable visualization of the combined information and the ability to interactively access and manipulate the underlying data.