AUTOMATED SOLUTIONS FOR GEOREFERENCING AND VECTORIZING GEOLOGICAL MAPS
The U.S. Geological Survey (USGS), in partnership with DARPA, NASA-JPL, and MITRE, launched a pilot project to engage the artificial intelligence and machine learning communities in the recovery of legacy geologic map data through an open competition. Two map-processing challenges were designed: 1) automated georeferencing of map images and 2) legend-based feature extraction. Using an extensive training and validation dataset sourced from the National Geologic Map Database (NGMDB), competitors combined computer vision, image segmentation, text recognition, and other machine learning techniques to produce open-source, automated software solutions for the two challenges. These solutions were evaluated using standard metrics.
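To make the first challenge concrete, the sketch below shows one simple way georeferencing can be framed: fit an affine transform from pixel coordinates to geographic coordinates using a few control points, then report the root-mean-square error (RMSE) of the fit. This is an illustration under stated assumptions, not a reconstruction of any competitor's solution. The abstract does not name the metrics used, so RMSE is an assumed choice, and the function names and control-point values are hypothetical. Only the Python standard library is used.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system a*x = b by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def fit_affine(pixels, coords):
    """Least-squares fit of lon = a*col + b*row + c and lat = d*col + e*row + f."""
    rows = [(px, py, 1.0) for px, py in pixels]
    # Normal equations: (A^T A) x = A^T y, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        aty = [sum(r[i] * c[k] for r, c in zip(rows, coords)) for i in range(3)]
        params.append(solve3(ata, aty))
    return params  # [[a, b, c], [d, e, f]]

def apply_affine(params, pixel):
    """Map one (col, row) pixel to (lon, lat) with the fitted parameters."""
    px, py = pixel
    return tuple(p[0] * px + p[1] * py + p[2] for p in params)

def rmse(params, pixels, coords):
    """Root-mean-square distance between predicted and reference coordinates."""
    total = 0.0
    for pixel, (lon, lat) in zip(pixels, coords):
        x, y = apply_affine(params, pixel)
        total += (x - lon) ** 2 + (y - lat) ** 2
    return math.sqrt(total / len(pixels))

# Hypothetical control points: the four corners of a 1000 x 1000 pixel map image
# paired with made-up (longitude, latitude) values.
control_pixels = [(0, 0), (1000, 0), (0, 1000), (1000, 1000)]
control_coords = [(-105.0, 40.0), (-104.5, 40.0), (-105.0, 39.5), (-104.5, 39.5)]

params = fit_affine(control_pixels, control_coords)
center = apply_affine(params, (500, 500))
fit_error = rmse(params, control_pixels, control_coords)
print(center, fit_error)
```

An affine model covers only translation, rotation, scale, and shear. Scanned maps with projection distortion or paper warping would likely need a projection-aware or higher-order transform, and automatically locating the control points is the hard part that this sketch leaves out.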
The novel solutions developed during the competition phase of the pilot project represent promising methods for rapidly extracting detailed geologic information embedded within map images. With over 100,000 maps in the NGMDB and thousands of additional maps in USGS publications, industry technical reports, and scientific literature, applying these solutions would greatly accelerate the USGS's ability to assimilate and derive additional value from source documents. Publication of the competition results, including the solution code and the training and validation datasets, in publicly accessible repositories will greatly benefit the broader geoscience community.