GSA Annual Meeting in Denver, Colorado, USA - 2016

Paper No. 127-13
Presentation Time: 4:45 PM


ROSS, Jake, New Mexico Geochronology Research Laboratory, New Mexico Bureau of Geology and Mineral Resources, 801 Leroy, Socorro, NM 87801 and MCINTOSH, William, New Mexico Bureau of Geology and Mineral Resources, New Mexico Tech, 801 Leroy Place, Socorro, NM 87801

The success of modern geochronology has been facilitated in large part by software for both automated data collection and data processing. As datasets continue to grow, demand for high-quality, high-throughput data management and processing tools is also rising. Enter Pychron. Pychron is an open-source, Python-based, fully featured geochronology suite developed at the New Mexico Geochronology Research Laboratory. The next generation of Earthtime science will require sophisticated, transparent, modular, and widely accepted software; Pychron was designed to meet these goals and continues to do so.

Pychron uses an innovative data management technique called Data Version Control (DVC). DVC uses the nearly ubiquitous Version Control System (VCS) Git to provide robust data tracking, versioning, and distribution. These features are commonly used in software development, and we recognized a significant overlap between the worlds of data processing and software development. DVC gives users the high-quality data management features of a traditional Relational Database Management System (RDBMS) together with the peace of mind of a traditional VCS. Analyses in the DVC system cannot change without the system knowing about it; conventional relational databases offer no such guarantee. In addition to tracking changes, users can make rapid comparisons between different “versions” of the same analysis.
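
The change-detection guarantee follows from the way Git identifies file contents by hash. The Python sketch below is only an illustration of that idea under stated assumptions: it is not Pychron code, the analysis record and its field names (runid, ar40, ar39) are hypothetical, and it assumes Git's default SHA-1 object format. It computes the identifier Git would assign to a file's contents and shows that editing a single value yields a different identifier.

```python
import hashlib
import json


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 object id Git assigns to a blob with these contents."""
    # Git hashes the header "blob <size>\0" followed by the raw contents.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


def serialize(analysis: dict) -> bytes:
    """Serialize a record deterministically so equal records hash equally."""
    return json.dumps(analysis, sort_keys=True, indent=2).encode("utf-8")


# Hypothetical analysis record; not Pychron's actual file layout.
original = {"runid": "12345-01A", "ar40": 1.2345, "ar39": 0.5678}
edited = dict(original, ar40=1.2346)  # a single value is altered

original_id = git_blob_id(serialize(original))
edited_id = git_blob_id(serialize(edited))

# Any change to the contents changes the id, so the edit cannot go unnoticed.
changed = original_id != edited_id
print(original_id, edited_id, changed)
```

Because the identifier depends only on the contents, two copies of a repository can confirm that they hold the same version of an analysis by comparing identifiers, without comparing every value.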

In addition to DVC, Pychron uses a “pipeline”-based model for processing. Similar to other software pipelines, Pychron’s pipeline consists of reusable nodes connected to one another in series. Each node performs a specific task, then passes the results to the next node. The pipeline model is flexible and allows bulk processing of large and small datasets alike.
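
The node-in-series idea can be sketched in a few lines of Python. This is a minimal illustration, not Pychron's implementation: the class names (Node, ErrorFilterNode, WeightedMeanNode, Pipeline), the state dictionary, and the sample ages are all hypothetical, and the inverse-variance weighted mean stands in for whatever processing a real node would perform.

```python
from typing import Dict, List


class Node:
    """Base class: a node takes the current state and returns a new state."""

    def run(self, state: Dict) -> Dict:
        raise NotImplementedError


class ErrorFilterNode(Node):
    """Keep only analyses whose 1-sigma error is at or below a threshold."""

    def __init__(self, max_error: float) -> None:
        self.max_error = max_error

    def run(self, state: Dict) -> Dict:
        kept = [(age, err) for age, err in state["analyses"] if err <= self.max_error]
        return {**state, "analyses": kept}


class WeightedMeanNode(Node):
    """Compute the inverse-variance weighted mean of the remaining ages."""

    def run(self, state: Dict) -> Dict:
        analyses = state["analyses"]
        weights = [1.0 / err ** 2 for _, err in analyses]
        total = sum(weights)
        mean = sum(w * age for w, (age, _) in zip(weights, analyses)) / total
        return {**state, "weighted_mean": mean, "weighted_mean_error": total ** -0.5}


class Pipeline:
    """Run nodes in series, passing each node's output to the next."""

    def __init__(self, nodes: List[Node]) -> None:
        self.nodes = list(nodes)

    def run(self, state: Dict) -> Dict:
        for node in self.nodes:
            state = node.run(state)
        return state


# Hypothetical (age, 1-sigma error) pairs; the third is dropped by the filter.
pipeline = Pipeline([ErrorFilterNode(max_error=1.0), WeightedMeanNode()])
result = pipeline.run({"analyses": [(10.0, 0.1), (10.2, 0.1), (15.0, 5.0)]})
print(result["weighted_mean"], result["weighted_mean_error"])
```

Because every node shares the same run interface, nodes can be reordered or reused in other pipelines, and the same pipeline can be applied unchanged to a handful of analyses or to thousands.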

Lastly, Pychron provides an opportunity to systematically compare the various data processing applications currently available. Pychron can directly access MassSpec databases and efficiently perform one-to-one comparisons of results produced by MassSpec and Pychron. A standardized data format is in development to ease the sharing of data between applications, opening up the possibility of quantitative, systematic comparison with other applications such as ArArCalc.