GSA Connects 2022 meeting in Denver, Colorado

Paper No. 104-9
Presentation Time: 3:45 PM

USING ARTIFICIAL INTELLIGENCE TO IDENTIFY FOSSIL ANGIOSPERM LEAVES AT FAMILY LEVEL


RODRIGUEZ RODRIGUEZ, Ivan Felipe, Cognitive, Linguistic & Psychological Sciences, Brown University, 190 Thayer Street, PROVIDENCE, RI 02912, FEL, Thomas, ANITI, Toulouse, 00000, France; Cognitive, Linguistic & Psychological Sciences, Brown University, 190 Thayer Street, PROVIDENCE, RI 02912, VAISHNAV, Mohit, Cognitive, Linguistic & Psychological Sciences, Brown University, 190 Thayer Street, PROVIDENCE, RI 02912; ANITI, Toulouse, 00000, France, WILF, Peter, Department of Geosciences and Earth and Environmental Systems Institute, The Pennsylvania State University, University Park, PA 16802 and SERRE, Thomas, ANITI, Toulouse, 00000, France; Carney Brain Institute, Brown University, Providence, RI 02906; Cognitive, Linguistic & Psychological Sciences, Brown University, 190 Thayer Street, PROVIDENCE, RI 02912

The identification of fossil angiosperm leaves poses a well-known challenge for paleobotany. Recent progress in computer vision offers a path towards the development of AI agents to assist paleobotanists, but several challenges have slowed progress. Images of taxonomically vetted fossil leaves are very scarce relative to the enormous visual training libraries that AI requires, and their quality is highly variable due to preservation and taphonomic factors. To overcome these limitations, we have developed a deep generative model that learns to automatically synthesize photorealistic fossils from known cleared and x-ray extant-leaf images. Given the considerable number of unvetted images available in collections such as that of the Yale Peabody Museum, we also use unsupervised training, which extends the potential of our synthesizer model to generalize. We then use these synthetic fossils to train a deep neural network to classify both cleared leaves and fossil leaves at the family level. Using a leave-one-family-out cross-validation procedure to evaluate accuracy (whereby real leaf fossils are excluded from training for a test family), we report family-level accuracy significantly above chance for real fossil leaves, even when the system saw no real fossils during training. We also carry out a study using explainability methods to identify some of the strategies the network uses for classification. Our results strongly suggest that AI methods will provide significant assistance to paleobotanists with the identification of leaf fossils. We will also shortly release a website where the community can upload fossil images from any part of the world and test the potential of our approach firsthand.
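
The leave-one-family-out procedure mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `(family, kind)` sample layout, the `"fossil"` / `"cleared"` / `"synthetic"` labels, and the example families are all assumptions made for illustration. For each test family, real fossils of that family are held out for evaluation, while all other images remain available for training. The chance level shown assumes a uniform guess over equally likely families.

```python
def leave_one_family_out(samples):
    """Yield (held_out_family, train_indices, test_indices) splits.

    samples: list of (family, kind) pairs, where kind is one of
    "cleared", "synthetic", or "fossil" (hypothetical labels).
    """
    # Only families that have at least one real fossil can serve as a test family.
    families = sorted({fam for fam, kind in samples if kind == "fossil"})
    for held_out in families:
        train_idx, test_idx = [], []
        for i, (fam, kind) in enumerate(samples):
            if kind == "fossil" and fam == held_out:
                test_idx.append(i)   # real fossils of the test family: evaluation only
            else:
                train_idx.append(i)  # everything else stays in training
        yield held_out, train_idx, test_idx


def chance_level(n_families):
    """Accuracy of a uniform random guess over n_families classes."""
    return 1.0 / n_families


# Illustrative toy data; the families and counts are made up.
samples = [
    ("Fagaceae", "cleared"),
    ("Fagaceae", "synthetic"),
    ("Fagaceae", "fossil"),
    ("Lauraceae", "cleared"),
    ("Lauraceae", "fossil"),
    ("Lauraceae", "fossil"),
]
splits = list(leave_one_family_out(samples))
```

In the stricter scenario the abstract describes, where the system sees no real fossils at all during training, the same sketch would additionally drop every `"fossil"` entry from `train_idx`.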