GSA Connects 2023 Meeting in Pittsburgh, Pennsylvania

Paper No. 127-2
Presentation Time: 1:55 PM

LEVERAGING METRIC LEARNING FOR ROBUST IMAGE CLASSIFICATION: A CASE STUDY ON CHEILOSTOME BRYOZOANS (Invited Presentation)


PORTO, Arthur, Florida Museum of Natural History, University of Florida, Gainesville, FL 32603 and DI MARTINO, Emanuela, Natural History Museum, University of Oslo, Sars gate 1, Oslo, 0562, NORWAY

At the intersection of paleobiological science and artificial intelligence (AI) research, species classification emerges as a fundamental yet remarkably challenging task. The inherent diversity and complexity of living organisms create the need for models capable of handling data not included in supervised training sets, a significant challenge for conventional AI methods. Conventional image classification models typically perform well within their training distributions; however, they tend to severely underperform when applied to out-of-distribution data. These issues are especially prominent in the study of marine invertebrate taxa, where extensive morphological and habitat diversity, coupled with limited sampling and scarce ecological information, exacerbates the taxonomic complexities.

To address these challenges, we propose the use of metric learning-based techniques, such as supervised contrastive learning (SupCon). Unlike traditional models that focus on optimizing class predictions, SupCon facilitates the learning of trait spaces that draw biologically similar instances closer together and push dissimilar ones apart. This strategy, while still leveraging known species labels during training, has the potential to mitigate numerous issues inherent to traditional species classification models. The efficacy of this approach becomes particularly apparent when it is extended to fossil taxa, where data sparsity poses a significant challenge.
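As an illustration of the idea described above, the following is a minimal NumPy sketch of a supervised contrastive loss in its commonly published form. It is not the authors' implementation: the function name `supcon_loss`, the temperature value, and the toy embeddings are assumptions chosen for illustration, and a real pipeline would compute the loss on the output of an image encoder inside a deep learning framework.

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: (n, d) array of feature vectors, n >= 2.
    labels:     (n,) array of class (e.g. species) labels.
    temperature: softmax temperature; 0.1 is an assumed default.
    """
    z = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # Project embeddings onto the unit sphere so dot products are cosine similarities.
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature

    # Exclude each sample's similarity with itself from the softmax.
    n = len(labels)
    self_mask = np.eye(n, dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)

    # Numerically stable log-softmax over all other samples in the batch.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))

    # Positives: other samples that share the anchor's label.
    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    n_pos = pos_mask.sum(axis=1)
    valid = n_pos > 0
    if not valid.any():
        return 0.0  # no anchor has a positive partner in this batch

    # Average log-probability of the positives, per anchor, then over anchors.
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float((-pos_log_prob[valid] / n_pos[valid]).mean())


# Toy 2-D embeddings forming two tight clusters (made-up values).
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matching = supcon_loss(toy, [0, 0, 1, 1])    # labels agree with the clusters
loss_mismatched = supcon_loss(toy, [0, 1, 0, 1])  # labels cut across the clusters
```

The loss is lower when same-label samples already sit close together, so minimizing it pulls same-species embeddings together and pushes different species apart.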

Our preliminary experiments within the order Cheilostomata suggest that models trained using metric learning methods yield promising results in terms of classification accuracy, not only within and across genera but also for out-of-distribution data. We posit that the creation of such robust trait spaces can substantially enhance the integration of AI into paleobiological species classification, positioning metric learning as a viable alternative to traditional image classification approaches.
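One common way to classify with a learned trait space, sketched here as an assumption rather than as the procedure used in this study, is a nearest-neighbor lookup against a gallery of labeled reference embeddings. Because the gallery can hold taxa the encoder never saw during training, the same lookup extends to out-of-distribution species. The function name `knn_predict`, the value of `k`, and the labels `species_a` and `species_b` are hypothetical.

```python
import numpy as np

def knn_predict(query, gallery, gallery_labels, k=3):
    """Assign each query embedding the majority label of its k nearest gallery embeddings.

    Similarity is cosine similarity; all names and defaults are illustrative.
    """
    q = np.asarray(query, dtype=float)
    g = np.asarray(gallery, dtype=float)
    labels = np.asarray(gallery_labels)

    # Normalize so the dot product equals cosine similarity.
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    sim = q @ g.T

    # Indices of the k most similar gallery items for each query.
    nearest = np.argsort(-sim, axis=1)[:, :k]

    preds = []
    for row in nearest:
        # Majority vote among the k neighbors (ties go to the label that sorts first).
        values, counts = np.unique(labels[row], return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)


# Toy gallery with two hypothetical species (made-up embeddings).
gallery = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
gallery_labels = ["species_a", "species_a", "species_b", "species_b"]
queries = np.array([[0.95, 0.05], [0.05, 0.95]])
predictions = knn_predict(queries, gallery, gallery_labels, k=3)
```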