Paper No. 1
Presentation Time: 1:45 PM

THE PROMISE, AND THE CHALLENGE, OF AUTOMATED SPECIES IDENTIFICATION


MACLEOD, Norman, The Natural History Museum, Cromwell Road, London, SW7 5BD, United Kingdom, N.MacLeod@nhm.ac.uk

At present the consensus within the geoscience community is that palaeontologists considered experts in the taxonomy of an organismal group can be relied upon to supply species identifications that are 100% accurate 100% of the time. This assumption is absurd and flies in the face of all the (admittedly few) empirical studies that address the issue. It is, however, a pragmatic response to the equally widespread assumption that there is no alternative to the present system of Gestalt-based identifications made by ‘experts’ who have experience in this area. But is there really no alternative?

Techniques for automating some or all of the steps of the taxonomic identification procedure have been available for over 100 years. In many cases quantitative criteria are fundamental aspects of species descriptions. Yet few (if any) routine identifications of any fossil taxon made by experts use either morphometric criteria or quantitative methods of comparing images to sets of authoritatively identified illustrations. Despite this lack of objectivity and quality control, species and higher taxonomic category identifications form the basic data of palaeontology, the foundation on which virtually the entire field stands or falls.

Results from a suite of investigations, undertaken over the last five years, into the accuracy, consistency, and speed of current technological and algorithmic approaches to this problem (e.g., multivariate analysis of form factors, geometric morphometric analysis of landmark configurations and/or outlines, approaches that combine machine learning with computer vision) indicate that virtually any quantitative procedure delivers results that are more accurate, more consistent, and faster than those of human experts working alone or in groups. These techniques are not appropriate for all fossil groups, but they can be employed with confidence in many contexts. Moreover, as palaeontology is “… running out of systematic [taxonomists] who have anything approaching a synoptic knowledge of a major group of organisms” (Kaseler 1993), the improvement of automated identification systems represents the only viable hope for preserving, much less developing, research-level taxonomic expertise for many fossil groups into the future.