GSA Annual Meeting in Denver, Colorado, USA - 2016

Paper No. 105-1
Presentation Time: 8:00 AM

EVOLUTION OF A DATABASE: FROM EXCEL TO SPECIFY AND THE ERA OF MODERN DATA PUBLISHING


KARIM, Talia S., University of Colorado Museum of Natural History, University of Colorado, 265 UCB, Boulder, CO 80309 and SMITH, Dena, Sedimentary Geology and Paleobiology, National Science Foundation, 4201 Wilson Blvd., Arlington, VA 22230, talia.karim@colorado.edu

The University of Colorado Museum of Natural History (CUMNH) Paleontology section has been at the forefront of electronic specimen databasing. Peter Robinson (curator emeritus) began entering specimen records into Excel spreadsheets in the early 1990s, recognizing the need to quickly search and retrieve data about specimens in the CUMNH Paleontology Collection. Over the next 20 years, the digitized collection records were migrated into an Access database, with the fossil invertebrate and plant records subsequently migrated into Specify 5 in 2006 and published online using DiGIR in 2007. The latter were migrated a second time into Specify 6 in 2011, with a local web portal and IPT instance set up shortly thereafter to facilitate publishing. The multiple conversions have left many of the data fields formatted as open text fields, and some data do not reside in the appropriate table columns in the SQL backend of Specify. These issues are slowly being rectified by copying data back into the correct table columns and cleaning these data as needed. In some cases, fields must be added to the forms in Specify so that users can actually view the data. As part of this process, we are standardizing data to Darwin Core when appropriate. This will make it much easier to share these data with aggregators such as iDigBio and for end users to discover them. Thus far, collection and determination dates have been copied into the correct columns in SQL and a script was used to convert the dates into an ISO standard format; minimal cleaning of dates has been done by hand. Other fields that will have data cleaned and/or moved include type status and an open comment field. The latter contains data that will be parsed out into at least one separate field. Several additional fields currently on our forms have been identified as containing no data at all. We are in the process of evaluating whether to keep these fields for future use or remove them from our forms entirely.
Lastly, we plan to significantly modify our locality forms to accommodate georeferencing-specific fields as part of our digitizing efforts on the Fossil Insect Collaborative and Cretaceous World TCNs.
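
The date conversion step mentioned above could look roughly like the following. This is a minimal sketch, not the script actually used at CUMNH: the function name `to_iso_date` and the list of legacy input formats are assumptions made for illustration, since the abstract does not say which formats occurred in the open text fields. Values that match none of the candidate formats are returned as `None` so they can be set aside for cleaning by hand.

```python
from datetime import datetime

# Hypothetical legacy formats; the real collection data may use others.
CANDIDATE_FORMATS = (
    "%Y-%m-%d",   # already ISO, e.g. 1995-07-04
    "%m/%d/%Y",   # e.g. 7/4/1995
    "%d %b %Y",   # e.g. 12 Aug 1987
    "%d %B %Y",   # e.g. 12 August 1987
    "%B %d, %Y",  # e.g. August 12, 1987
)


def to_iso_date(raw):
    """Return an ISO 8601 date string (YYYY-MM-DD), or None if unparseable."""
    if raw is None:
        return None
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            # Try each candidate format in order; the first match wins.
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    # No format matched: leave the record for manual review.
    return None
```

Trying formats in a fixed order keeps the behavior predictable, but ambiguous values such as 3/4/1995 are read as month/day/year here, which is itself an assumption about the source data.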