GSA Connects 2022 meeting in Denver, Colorado

Paper No. 151-8
Presentation Time: 10:15 AM

LEVERAGING MACHINE LEARNING TOOLS TO UPDATE THE NATIONAL CARBON SEQUESTRATION DATABASE (NATCARB)


MORKNER, Paige (1), BAUER, Jennifer (2), PANTALEONE, Scott (1), SHAY, Jacob (1), BAKER, David Vic (3) and ROSE, Kelly (4); (1) Department of Energy, Support Contractor for the National Energy Technology Laboratory, 1450 SW Queen Ave, Albany, OR 97321, (2) US Department of Energy, National Energy Technology Laboratory, 1450 Queen Ave SW, Albany, OR 97321, (3) Department of Energy, Support Contractor for the National Energy Technology Laboratory, 1450 SW Queen Ave, Albany, OR 97321; Mid-Atlantic Technology, Research & Innovation Center, Morgantown, WV 26505, (4) Department of Energy, National Energy Technology Laboratory, 1450 SW Queen Avenue, Albany, OR 97321

The National Carbon Sequestration Database (NATCARB) was developed using inputs from the Regional Carbon Sequestration Partnerships (RCSP) and was last updated in 2015 with the release of the U.S. Department of Energy’s (DOE) National Carbon Storage Atlas – Fifth Edition. NATCARB includes information about deep saline basins, depleted oil and gas fields, and unminable coal seams identified as potential geologic carbon dioxide (CO2) storage (GCS) reservoirs in the USA and southern Canada. In 2018, the geodatabase was integrated, as the NATCARB Viewer 2.0, into the National Energy Technology Laboratory’s (NETL) Energy Data eXchange (EDX), the NETL-led DOE data repository and digital laboratory that supports the Office of Fossil Energy and Carbon Management. The NATCARB Viewer was designed to enable users to visualize, explore, and download the NATCARB database layers. The NATCARB database has been cited over a hundred times in the literature and has been used for regional characterization, site screening, storage capacity calculations, and subsurface modeling.

Despite its wide reuse, the legacy dataset still has gaps that must be addressed to maximize the value of the data (Morkner et al. 2022). To address these gaps, SmartSearch (Baker et al. 2021), a machine learning tool developed at NETL, collected thousands of data resources using the most common terms parsed from the database together with a list of key terms developed by subject matter experts at NETL. The search results were scored by similarity, and those scoring above 10% were integrated into a PostgreSQL database, where their text bodies were made searchable.
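The score-and-filter step can be illustrated with a minimal sketch. The abstract does not describe how SmartSearch computes similarity, so the bag-of-words cosine similarity used here is an assumption chosen for illustration, not the SmartSearch method. The function names, key terms, and resource texts are likewise hypothetical; only the 10% cutoff comes from the text above.

```python
import math
import re
from collections import Counter

# The "above 10% similarity" cutoff stated in the abstract.
SIMILARITY_THRESHOLD = 0.10


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_resources(key_terms, resources, threshold=SIMILARITY_THRESHOLD):
    """Return (name, score) pairs scoring strictly above the threshold, best first."""
    query = Counter(tokenize(" ".join(key_terms)))
    scored = [
        (name, cosine_similarity(query, Counter(tokenize(body))))
        for name, body in resources.items()
    ]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical inputs, for illustration only (assumed, not from NATCARB).
key_terms = ["saline", "reservoir", "porosity", "permeability"]
resources = {
    "report_a": "Porosity and permeability of a deep saline reservoir.",
    "report_b": "Annual budget summary for fleet vehicle maintenance.",
}
kept = filter_resources(key_terms, resources)
```

In this sketch only `report_a` survives the cutoff, because `report_b` shares no terms with the query. In a pipeline like the one described, the retained resources would then be loaded into the searchable database.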

Where possible, properties missing from the original data source, including basin names, reservoir porosities, reservoir permeabilities, reservoir depths, and reservoir thicknesses, were integrated into the saline storage data layers of the NATCARB database. The update integrates new information into the legacy dataset and makes the specific properties needed for geologic characterization and site screening easier to discover, supporting reuse of the data. The updated saline layers will support current and future research as the USA moves to scale up GCS to meet climate change mitigation goals.
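The gap-filling step described above can be sketched as follows, assuming a record is a simple mapping from property name to value and that a missing property is stored as `None`. The field names, units, and values are hypothetical and do not reflect the NATCARB schema or its data. The sketch fills only empty fields and never overwrites a legacy value.

```python
def fill_missing(record, candidates):
    """Return a copy of record with None-valued fields filled from candidates.

    Existing (non-None) values are never overwritten. Also returns the
    list of field names that were filled, so each change can be traced.
    """
    updated = dict(record)
    filled = []
    for field, value in candidates.items():
        if field in updated and updated[field] is None and value is not None:
            updated[field] = value
            filled.append(field)
    return updated, filled


# Hypothetical legacy record with gaps (assumed field names and values).
legacy = {
    "basin_name": None,
    "porosity": 0.12,
    "permeability_md": None,
    "depth_ft": None,
    "thickness_ft": 150.0,
}

# Hypothetical values recovered from a searched resource.
found = {"basin_name": "Example Basin", "porosity": 0.20, "permeability_md": 45.0}

updated, filled = fill_missing(legacy, found)
```

Here `basin_name` and `permeability_md` are filled, the existing `porosity` value is left untouched, and `depth_ft` remains empty because no candidate value was found. Returning the list of filled fields is a design choice that keeps each update auditable against its source.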