GSA Connects 2024 Meeting in Anaheim, California

Paper No. 59-12
Presentation Time: 5:00 PM

AN OPEN, ONLINE REPOSITORY FOR BOTH LEGACY AND NEWLY GENERATED DATA FROM THE NEWARK BASIN CORING PROJECT (USA): A NEW FRAMEWORK FOR ENSURING THE LONGEVITY OF RECORDS FROM CONTINENTAL SCIENTIFIC DRILLING PROJECTS


KINNEY, Sean1, MCWHIRTER, Jeff2, TIBBITS, David3, CHANG, Clara4 and OLSEN, Paul E.4, (1)Department of Earth and Planetary Sciences, Rutgers University, 610 Taylor Rd, Piscataway, NJ 08854; Lamont-Doherty Earth Observatory of Columbia University, 61 Rt. 9W, Palisades, NY 10964, (2)Geode Systems LLC, Boulder, CO 80303, (3)Department of Earth and Planetary Sciences, Rutgers University, 610 Taylor Rd, Piscataway, NJ 08854, (4)Lamont-Doherty Earth Observatory, Columbia University, 61 US 9W, Palisades, NY 10964

Geologic data face a risk of extinction or significant inaccessibility if not properly maintained in an open and accessible online repository. Data derived from core and cuttings collected from continental drilling projects over the last several decades are particularly vulnerable because of inconsistent standards, not only for data management but also for the curation of original physical materials. This is especially true for data from projects presently in the public domain but originating outside traditional funding mechanisms and without the requirement of long-term data management (e.g., oil and gas exploration or geotechnical work). These data may have originally existed in a closed network or entirely in print format, and the responsibility for curating original physical samples and any derived data may be distributed across several parties, including academic institutions, state surveys, or, in many cases, individual researchers. The accelerating and expanding universe of data production from continental scientific drilling projects further underscores the need to prevent the formation of an expansive web of ‘dark data’ that we know exists but is not easily accessible.

We present a model that articulates a vision for a community-driven solution using the Repository for Archiving and MAnaging Data and Digital Artifacts (RAMADDA) framework. This approach, widely used in other geoscience contexts, permits seamless integration of a full range of products, such as borehole geophysics, multi-sensor core logs, XRF elemental scans, and core/cuttings information. An open-source, community-driven effort using RAMADDA will facilitate a living and evolving data ecosystem, integrating past data with new and ongoing work. This approach is aligned with the principles behind FAIROS (Findable, Accessible, Interoperable, Reusable, Open Science), providing an open repository with search and interactive services.

To demonstrate the power and utility of RAMADDA in this domain, we produced an open, online digital data repository for the Newark Basin Coring Project that unites all legacy products with any new datasets produced from this suite of seven cores. Given the flexibility and capacity for growth of this framework, this example is a starting point for dialogue with the community to standardize and centralize long-term data curation.