GSA 2020 Connects Online

Paper No. 79-15
Presentation Time: 5:15 PM

LEARNING TO IDENTIFY LARGE FOSSILS USING DEEP CONVOLUTIONAL NEURAL NETWORKS


HATAYA, Ryuichiro, Department of Information Science and Technology, The University of Tokyo, Yayoi, Bunkyo, Tokyo, 113-8657, Japan, MATSUI, Kumiko, The Kyushu University Museum, Kyushu University, 6-10-1, Hakozaki, Higashi, Fukuoka, 812-8581, Japan and KARASAWA, Tomoki, Mikasa City Museum, Ikushumbetsu Nishiki-cho 1-212-1, Mikasa, 068-2111, Japan

The automation of intellectual labor has long been a human dream, and part of it is now being realized by artificial intelligence. Computer vision using deep convolutional neural networks (CNNs) is the most important driving force behind this trend. CNN-based methods achieve near-human or even super-human performance on various image recognition tasks by learning directly from raw data such as image pixels. These methods assist human experts in a variety of fields, including medical image analysis. Like medical image analysis, the identification of large fossils is intellectual labor, and only specialists in a given taxon can identify them correctly.

Inoceramidae, an extinct clade of bivalves, is abundant in the Upper Cretaceous of Japan. More than 450 occurrences of inoceramids have been reported from Japan (Paleobiology Database, Feb. 2020). They belong to more than four genera and six subgenera, and to more than 50 species. Inoceramids are well suited as index fossils because almost all of these species have short stratigraphic ranges. However, their diagnostic features are not confined to a specific region or structure of the shell, nor are they defined quantitatively. For these reasons, learning to identify these fossils takes many years of practice under experienced mentors. As a result, inoceramid taxonomy specialists are at risk of "extinction" in Japan. Indeed, only one or two active researchers are engaged in the classification and systematics of inoceramids, despite the importance of their identification. For most paleontologists, however, fossil identification is not an objective in itself but a "tool" for their research. Because the skill is so difficult to acquire, these researchers cannot spare the time to learn to identify inoceramids. The result is a large gap between "supply" and "demand" in the identification of inoceramids.

To address this situation, we propose using CNNs to identify inoceramids. To this end, we collected 2D images of several species of Inoceramidae and trained CNN models to classify the images by species. Because the number of available fossil specimens is limited, we also studied methods to virtually increase the amount of data and thereby improve classification accuracy. We also discuss the limitations of this approach and possible future directions.
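
The abstract does not say which methods were used to virtually increase the amount of data. As an illustration only, the sketch below shows one common form of this idea, image augmentation by random rotations and mirror flips, using plain Python lists as stand-ins for grayscale images. The function names, the label "species_a", and the choice of transforms are assumptions made for this example, not the authors' method. Mirror flips in particular may not be appropriate where left and right valves must be distinguished.

```python
import random


def rotate90(image):
    """Rotate a 2D image (a list of pixel rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a 2D image left to right."""
    return [row[::-1] for row in image]


def augment(image, rng):
    """Return a randomly rotated and possibly mirrored copy of the image.

    Assumption: these transforms do not change the species label.
    """
    out = [list(row) for row in image]
    for _ in range(rng.randrange(4)):  # 0 to 3 quarter turns
        out = rotate90(out)
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    return out


def expand_dataset(samples, copies, seed=0):
    """Keep each (image, label) pair and add `copies` augmented variants."""
    rng = random.Random(seed)
    expanded = []
    for image, label in samples:
        expanded.append((image, label))
        for _ in range(copies):
            expanded.append((augment(image, rng), label))
    return expanded


if __name__ == "__main__":
    # Hypothetical 2x2 "image" with a placeholder label.
    specimen = [[1, 2], [3, 4]]
    dataset = expand_dataset([(specimen, "species_a")], copies=3)
    for image, label in dataset:
        print(label, image)
```

In practice, the expanded set (or augmentation applied on the fly during training) would be passed to the CNN in place of the original images, so that the model sees more varied views of each specimen than were physically photographed.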