2008 Geoinformatics Conference (11-13 June 2008)

Paper No. 12
Presentation Time: 3:00 PM

NEPTUNE - DEVELOPING A DIGITAL INFORMATION INFRASTRUCTURE FOR MICROPALEONTOLOGY IN THE 21st CENTURY


LAZARUS, David, Institut für Paläontologie, Museum für Naturkunde, Invalidenstraße 43, Berlin, D-10115, Germany; CERVATO, Cinzia, Dept. of Geological and Atmospheric Sciences, Iowa State Univ, 253 Science I, Ames, IA 50011; FILS, Douglas, Geological & Atmospheric Sciences, Iowa State Univ, 253 Science I, Ames, IA 50011; and DIVER, Patrick, DivDat Consulting, 1392 Madison 6200, Wesley, AR 72773; david.lazarus@rz.hu-berlin.de

Marine microfossil occurrences are extensively used for geologic age determination and for paleoceanographic and other paleoenvironmental research. They are less commonly used for studies of evolution, despite having one of the best-preserved records of evolutionary change in the entire fossil record. Although marine microfossils have been extensively studied from rock formations on land, several decades of deep-sea scientific drilling have also accumulated a large archive of information on marine microfossil occurrences, particularly for the pelagic unicellular plankton groups: diatoms, radiolarians, coccolithophores ('nannofossils') and planktonic foraminifera. Despite extensive comparisons (mostly monographic) of occurrence information for selected, biostratigraphically important individual species, there has been no more general synthesis of the fossil data collected by deep-sea drilling, nor have there been appropriate tools, such as taxonomically controlled and age-controlled occurrence databases, which are necessary for effective synthesis of fossil occurrence data. Comparable databases in other fields, such as the Sepkoski database for marine invertebrate fossils, have played a central role in the development of invertebrate paleontology for many years. Beginning in the early 1990s, a new database system - Neptune - was developed to address this problem, originally at the ETH Zürich, and subsequently as part of the NSF-funded Chronos project, based at Iowa State University (USA).

Neptune is a relational database and a set of external tools that link raw occurrence data for marine microfossils, as given in several hundred selected original range charts of deep-sea drilling science reports, to the essential scientific information needed to retrieve and synthesize these data effectively. This information includes numeric geologic ages for every occurrence, based on quantitative age models for every sample/hole in the system, and master taxonomic name lists that link synonyms for the same taxon concepts to each other and distinguish occurrence records of differing identification quality (e.g., clearly identified vs. 'cf.' or '?' observations). Neptune thus allows data to be retrieved from this important archive in a form suitable for large-scale synthesis of the deep-sea marine microfossil record, and provides tools for summarizing the information. More recently, Neptune has been linked to the successor of the Sepkoski database, the Paleobiology Database (PBDB), allowing microfossil data from land sections to be combined with data from marine sections. The system is currently being used to study large-scale patterns of Cenozoic evolutionary change in the plankton, and as an age model and taxonomic reference library for other users of deep-sea drilling sections.
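The two linking steps described above (assigning a numeric age to each occurrence from a hole's age model, and mapping reported names onto a master name list) can be illustrated with a minimal sketch. It is not Neptune's actual code or schema: the piecewise-linear form of the age model, the function names `age_at_depth` and `resolve_name`, the qualifier handling, and the placeholder taxon names are all assumptions made for illustration.

```python
from bisect import bisect_right


def age_at_depth(control_points, depth):
    """Interpolate an age (Ma) for a sample depth (m).

    Assumption: the age model is a list of (depth, age) control points
    sorted by strictly increasing depth, joined by straight lines.
    """
    depths = [d for d, _ in control_points]
    if not depths[0] <= depth <= depths[-1]:
        raise ValueError("depth lies outside the age model")
    i = bisect_right(depths, depth)
    if i == len(depths):  # depth equals the deepest control point
        return control_points[-1][1]
    d0, a0 = control_points[i - 1]
    d1, a1 = control_points[i]
    return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)


def resolve_name(reported, synonyms):
    """Return (valid_name, qualifier) for a reported taxon name.

    Assumption: uncertain identifications are marked with a trailing
    '?' or a leading 'cf.'; everything else counts as a clear identification.
    """
    name = reported.strip()
    qualifier = "certain"
    if name.endswith("?"):
        name, qualifier = name[:-1].strip(), "?"
    elif name.startswith("cf."):
        name, qualifier = name[3:].strip(), "cf."
    # Names missing from the synonym list are passed through unchanged.
    return synonyms.get(name, name), qualifier


# Hypothetical age model and synonym list (placeholder values, not real data).
model = [(0.0, 0.0), (10.0, 1.0), (30.0, 5.0)]
synonyms = {"Genus_a old_name": "Genus_b valid_name"}

sample_age = age_at_depth(model, 20.0)
valid, quality = resolve_name("cf. Genus_a old_name", synonyms)
```

Keeping the identification qualifier separate from the resolved name lets a later query include or exclude uncertain records without re-parsing the original range chart entries.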

Neptune is currently implemented as a PostgreSQL relational database hosted on the Chronos server stack at ISU. It is searchable through the Chronos portal and is integrated with the Java-based Age Depth Plot and Age Range Chart applications.

Analysis of large, heterogeneous datasets inevitably raises problems of mixed data quality: data gaps, generally uneven sampling, outliers and incorrectly entered primary observations all affect the validity of analyses. Via the link with PBDB, Neptune analyses can make use of PBDB's large library of paleobiologic tools for dealing with unevenly sampled data (range-through methods, subsampling, etc.). Current work is developing tools for dealing with age outliers in taxon ranges that arise from taxonomic errors in the original data, from reworking of fossils, and from age model errors caused by poorly resolved or mutually inconsistent primary chronostratigraphic information.
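To make the two ideas in this paragraph concrete, the sketch below shows a generic range-through diversity count and a simple flag for age outliers within one taxon's occurrences. It is an illustration only: it is neither PBDB's implementation nor the outlier tools under development for Neptune, and the median-absolute-deviation rule, the 3.5 threshold, the function names, and the sample values are assumptions.

```python
from statistics import median


def range_through(ages_by_taxon, bin_edges):
    """Count taxa per time bin, treating each taxon as present in every
    bin between its youngest and oldest occurrence.

    Bins are half-open intervals [young, old) in Ma, taken from
    consecutive values in bin_edges.
    """
    counts = []
    for young, old in zip(bin_edges, bin_edges[1:]):
        n = sum(
            1
            for ages in ages_by_taxon.values()
            if ages and min(ages) < old and max(ages) >= young
        )
        counts.append(n)
    return counts


def flag_age_outliers(ages, threshold=3.5):
    """Flag occurrences whose age is far from the taxon's median age.

    Assumption: a modified z-score based on the median absolute
    deviation (MAD) is used here as a stand-in for whatever criterion
    a real tool would apply.
    """
    med = median(ages)
    mad = median([abs(a - med) for a in ages])
    if mad == 0:
        return [False] * len(ages)
    return [0.6745 * abs(a - med) / mad > threshold for a in ages]


# Hypothetical occurrence ages in Ma (placeholder values, not real data).
occurrences = {"taxon_A": [1.0, 7.0], "taxon_B": [3.0]}
diversity = range_through(occurrences, [0, 2, 4, 6, 8])

suspect = flag_age_outliers([10.0, 10.5, 11.0, 11.2, 10.8, 35.0])
```

Here `taxon_A` is counted in the two middle bins even though it has no recorded occurrence there, which is the point of the range-through approach. A single far-removed age, such as a reworked specimen might produce, would stretch that range artificially, which is why outliers are worth screening before ranges are computed.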

Future development of this system is envisioned as part of a gradually evolving network of digital resources in marine micropaleontology. Planned developments include stronger links to the primary deep-sea sediment core databases such as ODP's Janus system, the addition of biostratigraphic and lithologic data from all ODP sites, links to digital taxonomic catalogs of species images and descriptions (one such link has already been developed by Chronos), and links to major collections of marine microfossil materials held in museums and other institutions, such as the Micropaleontological Reference Center network of deep-sea marine microfossil slides. Effective networking of these resources will require funding mechanisms to maintain and regularly update a central registry of the shared key linking data: the taxonomic and age model information. The benefits for research, however, will be substantial, offering major increases in data synthesis capacity, particularly for studies of global, long-time-scale processes, and improved efficiency in data retrieval and analysis in many other individual micropaleontologic research projects.