2011 GSA Annual Meeting in Minneapolis (9–12 October 2011)
Paper No. 68-5
Presentation Time: 2:50 PM-3:05 PM

FROM DARKNESS TO LIGHT: THE LONG TAIL OF SAMPLE-BASED DATA IN THE NEXT DECADE

LEHNERT, Kerstin A., Lamont-Doherty Earth Observatory, Columbia University, 61 Route 9W, Palisades, NY 10964, lehnert@ldeo.columbia.edu and WALKER, J. Douglas, Department of Geology, University of Kansas, Lawrence, KS 66045

Sample-based observations such as geochemical and geochronological data are part of the “Long Tail of Science Data” as described by Heidorn (DOI: 10.1353/lib.0.0036). Heidorn organizes science projects along an axis from large to small, with very large projects supporting dozens or more scientists on the left side of the axis, and smaller projects sorted by decreasing size trailing off to the right. The area under the right side of the curve is the ‘long tail of science data’, where the majority of scientists produce many small and heterogeneous datasets that are poorly curated, not shared, and often lost, even though this long tail holds greater potential for innovative and transformative science.

Sample-based observations are part of the long tail. They usually come as small datasets, acquired by idiosyncratic data collection practices and organized in customized formats. These data are shared primarily through publication in the scientific literature, but data in publications are often incomplete or presented only in diagrams, not in tabular form. Metadata on data quality, analytical procedures, and samples, which are critical for re-use of the data, are often poorly documented. Many data never leave the ‘darkness’ of the investigator’s hard drive or desk drawer. Bringing the dark data into the light is an essential component of building comprehensive Geoscience cyberinfrastructure for the next decade.

The Geoinformatics for Geochemistry Program (www.geoinfogeochem.org) is developing new data services and tools as part of the Integrated Earth Data Applications (IEDA, www.iedadata.org) data facility to support the management of sample-based data and to offer investigators incentives to share their data. Tools and services that support the full workflow of geochemical data management, from data acquisition and metadata capture in the lab, through data reduction, analysis, and visualization, to data publication and submission to repositories in compliance with funding agency policies, can advance the preservation of and access to geochemical, geochronological, and other sample-based data across the Geosciences.

Session No. 68
Data Preservation and Management in the Coming Decade
Minneapolis Convention Center: Room 101DE
1:30 PM-5:30 PM, Sunday, 9 October 2011

Geological Society of America Abstracts with Programs, Vol. 43, No. 5, p. 185

© Copyright 2011 The Geological Society of America (GSA), all rights reserved. Permission is hereby granted to the author(s) of this abstract to reproduce and distribute it freely, for noncommercial purposes. Permission is hereby granted to any individual scientist to download a single copy of this electronic file and reproduce up to 20 paper copies for noncommercial purposes advancing science and education, including classroom use, providing all reproductions include the complete content shown here, including the author information. All other forms of reproduction and/or transmittal are prohibited without written permission from GSA Copyright Permissions.