Geoinformatics 2007 Conference (17–18 May 2007)
Paper No. 5-9
Presentation Time: 10:30 AM-10:45 AM


REES, Allister, Department of Geosciences, University of Arizona, 1040 E 4th St, Tucson, AZ 85721, ALROY, John, Paleobiology Database, NCEAS, University of California, 735 State Street, Santa Barbara, CA 93101, SCOTESE, Christopher, PALEOMAP Project, University of Texas at Arlington, 700 Tanglewood Lane, Arlington, TX 76012, MEMON, Ashraf, San Diego Supercomputer Center, University of California, San Diego, 9500 Gilman Drive, Mail Code 0505, La Jolla, CA 92093-0505, ROWLEY, David B., Department of Geophysical Sciences, University of Chicago, 5734 S. Ellis Avenue, Chicago, IL 60637, PARRISH, Judith Totman, Department of Geological Sciences, University of Idaho, P.O. Box 443022, Moscow, ID 83844, WEISHAMPEL, David B., Department of Cell Biology & Anatomy, The Johns Hopkins University School of Medicine, 725 N. Wolfe St, Baltimore, MD 21205, PLATON, Emil, Energy and Geoscience Institute, University of Utah, 423 Wakara Way, Salt Lake City, UT 84108, O'LEARY, Maureen A., Department of Anatomical Sciences, Stony Brook University, Stony Brook, NY 11794-8081, and CHANDLER, Mark A., Center for Climate Systems Research, Columbia University, 2880 Broadway, New York, NY 10025

The Paleointegration Project (PIP) within GEON is facilitating interoperability between global-scale fossil and sedimentary rock databases, enabling a greater understanding of the life, geography and climate of our planet throughout the Phanerozoic. The key elements of PIP are databases, paleomapping tools and web services.

The following databases are presently in the system (see table for temporal distributions): PBDB – The Paleobiology Database (~630,000 occurrences of marine invertebrates, vertebrates (including dinosaurs), plants, and microfossils from 69,000 collections); DINO – From "The Dinosauria" Encyclopedia, 2004 (~4,200 occurrences from 1,200 localities); GCDB – Graphic Correlation Database (~108,000 fossil occurrences from 2,000 localities, with over 700 interpreted localities that define taxon ranges); PGAP – The Paleogeographic Atlas Project (~135,000 sedimentary rock occurrences from 47,000 localities); CSS – Climatically significant sedimentary rocks (~13,000 occurrences from 3,600 localities); OSR – Organic-rich sedimentary rocks (~2,000 occurrences from 1,600 localities, including geochemical data).


The databases are searchable by text and by map, through age and geography ontologies linked to GIS mapping tools. Results can be viewed on present-day maps as well as paleomaps, and can be downloaded for further detailed analyses. An important feature of PIP is the calculation of locality paleocoordinates "on the fly", based on modern latitude and longitude as well as locality age. This was achieved by developing a module, the Auto Point Tracker (APT), that calculates locality paleocoordinates dynamically and plots them automatically on the paleogeographic maps. PIP was designed to ensure fast data retrieval, making it especially useful for extensive searches and for multiple simultaneous searches (e.g., in a classroom setting). Through web services developed at the San Diego Supercomputer Center (SDSC), the PBDB server in Santa Barbara is searchable dynamically within PIP, alongside the other databases hosted at SDSC. In addition to the PIP features, each PBDB locality record is linked back to the original database, so the user can either explore and analyze the data further within that system (e.g., using the PBDB statistical tools) or continue within the PIP interface.
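The abstract does not describe how APT computes paleocoordinates. As a minimal sketch of one common approach in plate reconstruction, the Python below rotates a modern locality about an Euler pole by a finite angle using Rodrigues' rotation formula. The function names and the idea that a rotation model supplies a pole and angle for the locality's plate and age are assumptions made for illustration; they are not taken from PIP or APT.

```python
import math


def to_cartesian(lat_deg, lon_deg):
    """Convert latitude/longitude in degrees to a unit vector (x, y, z)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))


def to_lat_lon(vector):
    """Convert a unit vector back to (latitude, longitude) in degrees."""
    x, y, z = vector
    z = max(-1.0, min(1.0, z))  # guard asin against rounding error
    return math.degrees(math.asin(z)), math.degrees(math.atan2(y, x))


def rotate_about_pole(lat_deg, lon_deg, pole_lat_deg, pole_lon_deg, angle_deg):
    """Rotate a point on the unit sphere about an Euler pole.

    Hypothetical helper: in a real system the pole and angle would come
    from a plate rotation model, selected by plate and locality age.
    Positive angles are counterclockwise when looking down on the pole.
    """
    p = to_cartesian(lat_deg, lon_deg)
    k = to_cartesian(pole_lat_deg, pole_lon_deg)
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    # Rodrigues' formula: p' = p cos(t) + (k x p) sin(t) + k (k . p)(1 - cos(t))
    k_cross_p = (k[1] * p[2] - k[2] * p[1],
                 k[2] * p[0] - k[0] * p[2],
                 k[0] * p[1] - k[1] * p[0])
    k_dot_p = sum(ki * pi for ki, pi in zip(k, p))
    rotated = tuple(p[i] * cos_t + k_cross_p[i] * sin_t
                    + k[i] * k_dot_p * (1.0 - cos_t)
                    for i in range(3))
    return to_lat_lon(rotated)
```

A call such as `rotate_about_pole(32.2, -110.9, pole_lat, pole_lon, angle)` would return the rotated latitude and longitude of a locality near Tucson. The pole and angle must be supplied by a rotation model, and no real values are assumed here.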

The IT architecture of PIP (in the following figure) shows how the different databases and components interact to produce an integrated result and map. Researchers submit a query through the PIP user interface, and the PIP middleware parses it and decomposes it into several queries, one for each database. Databases hosted on SDSC servers are queried through the Java database API, while data hosted at remote locations (e.g., the PBDB) are queried through Web Services. Once all the results have been accumulated, they are passed to the Auto Point Tracker (APT) Web Service for "on the fly" calculation of locality paleocoordinates, so that they can be plotted on the paleogeographic maps.
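The query flow described above (parse, fan out to each database, accumulate, then compute paleocoordinates) can be sketched in Python as follows. This is an illustration of the general pattern only, not PIP's middleware: the `Query` fields, the in-memory stub sources and the `stub_locate` step standing in for the APT call are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Query:
    """Hypothetical parsed query: a taxon name and an age window in Ma."""
    taxon: str
    min_age_ma: float
    max_age_ma: float


def make_stub_source(rows: List[Record]) -> Callable[[Query], List[Record]]:
    """Build an in-memory stand-in for one database adapter.

    A real adapter would issue a database call or a web-service request.
    """
    def fetch(query: Query) -> List[Record]:
        return [row for row in rows
                if row["taxon"] == query.taxon
                and query.min_age_ma <= row["age_ma"] <= query.max_age_ma]
    return fetch


def stub_locate(record: Record) -> Record:
    """Placeholder for the paleocoordinate step (assumed interface).

    It only marks the fields a real service would fill in.
    """
    return {**record, "paleolat": None, "paleolon": None}


def run_federated_query(query: Query,
                        sources: Dict[str, Callable[[Query], List[Record]]],
                        locate: Callable[[Record], Record]) -> List[Record]:
    """Send the query to every source, merge the results, then locate them."""
    merged: List[Record] = []
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # Submit one sub-query per source so slow sources overlap in time.
        futures = {name: pool.submit(fetch, query)
                   for name, fetch in sources.items()}
        # Collect in source order and tag each record with its origin.
        for name, future in futures.items():
            for record in future.result():
                merged.append({**record, "source": name})
    return [locate(record) for record in merged]
```

Running the sub-queries in a thread pool is a design choice for this sketch: it lets a slow remote source overlap with faster local ones, while collecting results in source order keeps the merged output deterministic.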


The Paleointegration Project is already proving useful to researchers, teachers and students. Anyone can now access data and tools that were previously available only to a few specialists. It should also prove to be an excellent resource for a new generation of projects that assimilate both paleoclimate models and data for more detailed views of the Earth's climate history. Complex computational tools like Global Climate Models (GCMs) simulate details of the Earth's atmosphere, oceans and land surface processes that are beyond what proxy interpretation alone can provide. However, modelers require detailed paleogeographic data, which are used to construct the boundary conditions used as GCM input. The PIP databases will also be useful to paleoclimate modelers needing access to proxy climatological data for use in model verification. Access to GCMs is now improving through programs like the Educational Global Climate Modeling Project (see EdGCM). Thus, access to paleo databases is crucial to the development of quality data/model integration projects. We emphasize that PIP doesn't replace specialist expertise. It does, however, provide another means whereby researchers can develop their own scientific queries.

We envisage continuing to develop PIP with the addition of new datasets, tools and services. The next phase will include the MorphoBank Database, which contains phylogenetic systematics of morphological data (~2,300 anatomical images; 23 phylogenetic matrices). The PIP will enable phylogeneticists currently using MorphoBank to search seamlessly for relevant fossil taxa in the PBDB and to generate, for example, survivorship plots of diversity through time. It will also include the online PaleoReefs Database (PARED) developed by Wolfgang Kiessling (Humboldt-University of Berlin), which contains data for 3,550 reef complexes throughout the Phanerozoic (with details about paleontology, architecture, environmental setting and petrography). Some 35,000 reefal taxonomic occurrences from PARED are stored in the PBDB, so seamless integration of the additional reef architectural and environmental details will be very useful. More tools and services will be required as additional databases are integrated, to ensure that diverse user needs and interests are addressed. We plan to work with projects like PaleoStrat to develop some of these new tools and databases within PIP.  

A more complete understanding of the interactions between Earth and Life through time also requires the integration of geochemical, geophysical, igneous and metamorphic data. From this broader "geoinformatics community" perspective, the Paleointegration Project demonstrates that disparate geologic databases residing on different servers can be searched seamlessly using embedded modules, web services and GIS – structures and tools that will greatly facilitate future collaborative efforts.

Session No. 5
Geoinformatics Oral Session III
University of California: Second Auditorium
8:15 AM-3:00 PM, Friday, 18 May 2007

© Copyright 2007 The Geological Society of America (GSA), all rights reserved. Permission is hereby granted to the author(s) of this abstract to reproduce and distribute it freely, for noncommercial purposes. Permission is hereby granted to any individual scientist to download a single copy of this electronic file and reproduce up to 20 paper copies for noncommercial purposes advancing science and education, including classroom use, providing all reproductions include the complete content shown here, including the author information. All other forms of reproduction and/or transmittal are prohibited without written permission from GSA Copyright Permissions.