Geoinformatics 2007 Conference (17–18 May 2007)
Paper No. 5-14
Presentation Time: 11:45 AM-12:00 PM
SODA — A SELF-SERVICE ONLINE DIGITAL ARCHIVE FOR UNLOVED SCIENTIFIC DATA
SANDERS, Rex, USGS, 400 Natural Bridges Drive, Santa Cruz, CA 95060, rsanders@usgs.gov
SODA (Self-service Online Digital Archive) is a project under development at USGS for archiving "unloved" scientific data.
Background
USGS collects many thousands of gigabytes of new data annually. Many data types have well-defined processing and archiving paths, but many do not — our so-called "unloved data". Unloved data types usually fall into two classes: data types that have not traditionally shown national significance (e.g., marine sediment analyses), and data types created from new technology and research (e.g., airborne and land-based LIDAR surveys). Unloved data are difficult to find, difficult to access, and often vanish completely upon the retirement or departure of key scientists and technicians.
Scientists with the best intentions frequently fail to archive their data well. One scientist carefully created and labeled three copies of digital core photos on a total of 90 CD-R disks. Three years later, none of these disks were readable, because the label adhesive had corrupted the data layer. As a non-professional archivist, he did not know the risk associated with using sticky labels on CD-R disks.
Goal and Use Cases
The SODA project aims to make archiving unloved scientific data easier than burning another CD-R.
We are building our system around two use cases — archiving data and finding data.
To archive data:
To find data:
Design Features
SODA will have other features, including:
Benefits
Some of the anticipated benefits of SODA include:
SODA is intended to be the scientific data archive of last resort, dependent on the cooperation of overworked scientists and technicians to keep valuable data from being lost forever. As such, we cannot require very much effort to archive the data; the process must be simple and self-explanatory. Our reduced metadata and approval requirements disappoint many mainstream data archivists, but capturing more data with some metadata is better than capturing no data.
SODA is not intended to replace any other USGS data archiving mechanism, including Open File Reports, Data Series, or online databases like NWIS (http://waterdata.usgs.gov/nwis).
Technical Design
SODA's technical design is based on several principles:
Current Status of the SODA Project
SODA has been under development with minimal funding since early 2006. After surveying commercial and open-source projects, we concluded that writing our own software and designing our own system would best meet our needs. We have a core developer team with three members, and an email-based advisory group with about 45 members, all working at the USGS.
We have an initial, non-archival prototype running. We are using the prototype to work out many technical, user-interface, process, and procedural issues.
We anticipate releasing our first production SODA server by the end of 2007. A few months after that release, we anticipate releasing the SODA "cookbook" to enable other sites to set up and run local SODA servers. Development of the central search system and other SODA features is unscheduled and depends on acquiring further resources.
We will consider joint development of SODA with non-USGS partners.
Session No. 5
Geoinformatics Oral Session III
University of California: Second Auditorium
8:15 AM-3:00 PM, Friday, 18 May 2007
© Copyright 2007 The Geological Society of America (GSA), all rights reserved. Permission is hereby granted to the author(s) of this abstract to reproduce and distribute it freely, for noncommercial purposes. Permission is hereby granted to any individual scientist to download a single copy of this electronic file and reproduce up to 20 paper copies for noncommercial purposes advancing science and education, including classroom use, providing all reproductions include the complete content shown here, including the author information. All other forms of reproduction and/or transmittal are prohibited without written permission from GSA Copyright Permissions.