North-Central Section - 39th Annual Meeting (May 19–20, 2005)

Paper No. 3
Presentation Time: 8:40 AM

A DISTRIBUTED WORKFLOW DATABASE DESIGNED FOR COREWALL APPLICATIONS


KAMP, William P., Limnological Research Center, Univ of Minnesota, 12904 Hamlet Ave, Apple Valley, MN 55124, kamp@iagp.net

The data required for a core interpretation session can be very large. The data for a single IODP core, which include core images, multi-sensor track/core-logger scans, smear slides, and other related data, can be in the 10 to 100 gigabyte range. If a scientist wants to compare multiple cores simultaneously, the required data volume multiplies accordingly. To compound this problem, many users will be interpreting at locations with slow internet connections. In addition, users may be working with data from databases that are designed as read-only archives and are not designed to hold investigators' works in progress.

The Core Workflow Database (CWD) addresses these two problems. First, it will provide interfaces that retrieve user-selected data from established databases such as JANUS, LacCore Vault, dbSEABED, and PaleoStrat. It will also pull data through emerging portals such as CHRONOS. Caching the retrieved data in the CWD gives fast access to multiple data sources.
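The cached retrieval described above can be illustrated with a minimal read-through cache. This is only a sketch under stated assumptions: the class `CoreDataCache`, its methods, and the `fake_fetch` stand-in are hypothetical names invented for illustration, and nothing here reflects the actual interfaces of JANUS, LacCore Vault, dbSEABED, PaleoStrat, or CHRONOS.

```python
from typing import Callable, Dict, Tuple


class CoreDataCache:
    """Hypothetical read-through cache in front of several remote sources."""

    def __init__(self) -> None:
        # Maps a source name to a function that fetches one item by id.
        self._sources: Dict[str, Callable[[str], bytes]] = {}
        # Local copies, keyed by (source name, item id).
        self._store: Dict[Tuple[str, str], bytes] = {}
        # Counts how many times a remote source was actually contacted.
        self.remote_fetches = 0

    def register_source(self, name: str, fetch: Callable[[str], bytes]) -> None:
        """Attach a fetch function for one remote source (an assumption)."""
        self._sources[name] = fetch

    def get(self, source: str, item_id: str) -> bytes:
        """Return the cached copy, fetching from the source only on a miss."""
        key = (source, item_id)
        if key not in self._store:
            fetch = self._sources[source]  # raises KeyError for unknown sources
            self._store[key] = fetch(item_id)
            self.remote_fetches += 1
        return self._store[key]


def fake_fetch(item_id: str) -> bytes:
    """Stand-in for a slow remote request; returns placeholder bytes."""
    return ("data for " + item_id).encode("utf-8")


cache = CoreDataCache()
cache.register_source("example-archive", fake_fetch)
first = cache.get("example-archive", "core-001")   # goes to the source
second = cache.get("example-archive", "core-001")  # served from the cache
```

In this sketch only the first request for an item reaches the remote source, which is the property that matters for users on slow connections.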

Second, the CWD captures the results of analyses and interpretations. As the workflow is captured, it can be accessed by other collaborators locally or remotely. For example, the work of one scientist can be shared from one shift to the next, and biologists in a separate lab can watch as other labs on the ship interpret a specific core segment. In a higher-bandwidth environment, such as a core lab or a university office, a group of collaborators could track one another's work as they work on the same cores.

A primary feature of the CWD is the co-registration of data across multiple coordinate systems. For instance, once several wire-length control points have been assigned depths, the remainder can be extrapolated, and depth is then stored as an alternate coordinate system for the well log. Similarly, once several geologic boundaries are established in an image of a core, geologic age is registered as an alternate coordinate system. The CWD will provide this capability by leveraging well-known algorithms from SAGAN and SPLICER. Networks of CWDs can be connected to facilitate mirroring of data, and permissions can be set at each institution's database to control the degree of sharing. We intend to take advantage of existing technologies such as the Storage Resource Broker and Meta-data Catalog [SRBMDC] to help locate replicated data sets in the network of CWDs.
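The co-registration step, in which a few wire-length control points with assigned depths define depth for every other wire length, can be sketched as a piecewise-linear mapping. This is an illustrative assumption only: it is not the SAGAN or SPLICER algorithm, and the function name `make_depth_mapper` and the sample control points are hypothetical.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def make_depth_mapper(
    control_points: List[Tuple[float, float]],
) -> Callable[[float], float]:
    """Build a wire-length -> depth function from (wire_length, depth) pairs.

    Values between control points are linearly interpolated; values outside
    the control range are linearly extrapolated from the nearest segment.
    """
    pts = sorted(control_points)
    if len(pts) < 2:
        raise ValueError("need at least two control points")
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    if any(a == b for a, b in zip(xs, xs[1:])):
        raise ValueError("wire lengths of control points must be distinct")

    def to_depth(wire_length: float) -> float:
        # Pick the segment containing wire_length; clamp the index so that
        # values beyond either end reuse the first or last segment.
        i = bisect_right(xs, wire_length) - 1
        i = max(0, min(i, len(xs) - 2))
        x0, x1 = xs[i], xs[i + 1]
        y0, y1 = ys[i], ys[i + 1]
        return y0 + (wire_length - x0) * (y1 - y0) / (x1 - x0)

    return to_depth


# Hypothetical control points: (wire length in m, assigned depth in m).
to_depth = make_depth_mapper([(0.0, 0.0), (100.0, 98.0), (200.0, 195.0)])
depth_mid = to_depth(150.0)     # between two control points
depth_beyond = to_depth(250.0)  # past the last control point
```

The same pattern would apply to the age example: geologic boundaries identified in a core image act as control points, and the resulting function supplies age as an alternate coordinate for positions along the core.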