GSA Annual Meeting in Seattle, Washington, USA - 2017

Paper No. 99-13
Presentation Time: 11:00 AM

THE QUANTITATIVE FOOTPRINT OF NATURAL HISTORY COLLECTIONS AND SPECIMENS IN THE SCIENTIFIC LITERATURE


SYVERSON, V.J.P. and PETERS, Shanan E., Department of Geoscience, University of Wisconsin-Madison, 1215 W Dayton St, Madison, WI 53706, vsyverson@gmail.com

Scientific collections are the basis of much of the natural sciences, serving both as sources of critical new data and as archives that enable the validation of previous work. Although it is generally appreciated that museums and museum specimens are a critical source of new scientific knowledge, the aggregate impact of natural history collections on the published scientific literature remains largely unquantified. This is unfortunate from the standpoint of both understanding scientific practice and justifying the resources spent on maintaining and improving natural history collections. In response to the latter, many museums attempt to track publications resulting from work in their collections, but these data are dispersed and difficult to aggregate, and no measure of specimen or collection impact yet exists. Here we use the GeoDeepDive (GDD) digital library and machine reading system to quantitatively assess the frequency with which museum specimens are referenced in scientific publications. Full-text content within the GDD library was first indexed with a pre-compiled list of natural history museum names and associated abbreviations to extract candidate mentions of specimens from the literature. Simple regex-type rules were then applied to strings adjacent to the abbreviations to identify candidate specimen numbers. The resulting list of specimen numbers was then used to compute the number of specimens mentioned per year, the proportion of articles referencing specimens by year, and the frequency of use of natural history collections, and to identify the individual specimens with the highest overall scientific impact.
The highest-impact specimens are mainly dinosaur holotypes, the foremost of these being the Apatosaurus holotype, CM 3018; the exception is the Allende meteorite, USNM 3529. Overall, the proportion of articles citing specimens was highest in the mid-20th century and has leveled off since the 1990s, possibly because specimen numbers are increasingly placed in online supporting tables and other material not ingested by GDD. Even so, the total number of specimens cited per year has been growing, as has the frequency of references to collections outside North America and Europe.
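
The extraction and counting steps described above can be illustrated with a minimal Python sketch. It is not the GDD pipeline: the abbreviation list, the pattern (an abbreviation, an optional space or hyphen, then digits), the function names, and the (year, text) input shape are all assumptions made for illustration, since the abstract does not give the actual rules.

```python
import re
from collections import Counter, defaultdict

# Hypothetical abbreviation list; the abstract's pre-compiled list is not reproduced here.
ABBREVIATIONS = ["AMNH", "CM", "USNM"]

# Assumed rule: an abbreviation, an optional space or hyphen, then a run of digits.
SPECIMEN_RE = re.compile(
    r"\b(" + "|".join(re.escape(a) for a in ABBREVIATIONS) + r")[ -]?(\d+)\b"
)


def extract_specimens(text):
    """Return normalized 'ABBR number' strings for each candidate mention in text."""
    return [f"{abbr} {num}" for abbr, num in SPECIMEN_RE.findall(text)]


def summarize(articles):
    """Summarize specimen mentions for an iterable of (year, full_text) pairs."""
    total = Counter()             # articles seen per year
    citing = Counter()            # articles with at least one specimen, per year
    specimens = defaultdict(set)  # distinct specimens mentioned per year
    article_counts = Counter()    # number of articles mentioning each specimen

    for year, text in articles:
        total[year] += 1
        found = set(extract_specimens(text))  # count each specimen once per article
        if found:
            citing[year] += 1
            specimens[year] |= found
        article_counts.update(found)

    return {
        "specimens_per_year": {y: len(s) for y, s in specimens.items()},
        "proportion_citing": {y: citing[y] / total[y] for y in total},
        "most_cited": article_counts.most_common(3),
    }
```

Counting each specimen once per article makes the "most cited" ranking reflect the number of articles that mention a specimen rather than the number of raw mentions. A pattern this simple would also produce false positives and miss catalogue numbers with prefixes or suffixes, so its output is best treated as a list of candidates.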