GSA 2020 Connects Online

Paper No. 54-1
Presentation Time: 10:00 AM

INTO THE ISOVERSE: OPEN-SOURCE DATA TOOLS FOR STABLE ISOTOPE RATIO MASS SPECTROMETRY (Invited Presentation)


KOPF, Sebastian, Geological Sciences, University of Colorado Boulder, Boulder, CO 80309

Novel applications of stable isotope ratio mass spectrometry (IRMS) are frequently limited not just by the hardware or the difficulty of the analytical task, but also by the lack of easy access to the raw data and instrument parameters recorded by the vendor software. We present a suite of open-source tools for stable isotope data (www.isoverse.org), written in the R programming language (and easily accessible from Python), that provides direct scripting access to raw IRMS data from multiple vendor file formats. This allows maximum flexibility and transparency in data reduction from start to finish (e.g. 17O corrections, reference frame transformations, and standard calibrations), easy and efficient comparison of different algorithms and approaches, and rapid execution of entire data processing pipelines and transfer to data repositories.
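
As a rough illustration of what scripted data reduction can look like, the sketch below implements two of the steps named above, a reference frame transformation and a standard calibration, in plain Python. It does not use the isoverse packages or reproduce their API; the function names and the two-point linear calibration are assumptions made for this example only, and all delta values are taken to be in permil.

```python
def shift_reference_frame(delta_sample_vs_ref, delta_ref_vs_scale):
    """Re-express a delta value (permil) measured against a working
    reference gas on the scale against which that gas is itself known.

    Uses the standard identity
    (1 + d_s/scale) = (1 + d_s/ref) * (1 + d_ref/scale),
    written out in permil units.
    """
    return (delta_sample_vs_ref
            + delta_ref_vs_scale
            + delta_sample_vs_ref * delta_ref_vs_scale / 1000.0)


def two_point_calibration(measured_stds, accepted_stds):
    """Return a function mapping measured delta values onto the accepted
    scale, from a straight line through two standards (hypothetical
    helper; real workflows often regress over more standards)."""
    (m1, m2), (a1, a2) = measured_stds, accepted_stds
    slope = (a2 - a1) / (m2 - m1)
    intercept = a1 - slope * m1
    return lambda measured: slope * measured + intercept


# Illustrative numbers only, not real measurements.
shifted = shift_reference_frame(10.0, -5.0)
calibrate = two_point_calibration((0.0, 10.0), (1.0, 21.0))
calibrated = calibrate(5.0)
```

Because each step is an ordinary function, swapping in an alternative correction or calibration scheme and re-running the whole pipeline is a one-line change, which is the kind of comparison across algorithms the abstract describes.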

The key to reproducible data processing in the development of new analytical tools, and in scientific research in general, is to produce a faithful record of every step of the process in a format that is transparent and easy to understand. This is not an easy task. In the stable isotope community, two important challenges stand in the way of adopting flexible, efficient and reproducible data reduction workflows: (1) many of the basic computational and data access tools needed for day-to-day calculations and flexible data processing are lacking, particularly for newer analyses that commercial software was not written to accommodate; and (2) many data processing steps require proprietary software, depend on point-and-click interactions, and/or include black-box steps whose inner workings are inaccessible to the user. This makes it challenging to share and discuss one’s own approach, to review the approaches of others, and to compare calculations and datasets across laboratories. Moreover, it severely restricts opportunities for iteration, exchange of ideas, and data aggregation. We propose that the solution lies in the development of open-source community software that enables reproducible and transparent data processing directly from raw analytical output through all steps of data reduction, quality control, visualization and data reporting, while retaining the flexibility needed to serve the enormous breadth of analytical goals in the stable isotope community. The tools introduced here aim to make meaningful progress towards this goal and to create pathways for efficient transfer of data to centralized repositories such as the Isobank initiative.