UTILIZING ONTOLOGY STRUCTURES TO CURATE THE DOE-NETL CARBON STORAGE OPEN DATABASE
As the volume of publicly available data continues to grow, it has become necessary to integrate, curate, and display data more efficiently by enhancing existing workflows and developing new organizational capabilities. To support data curation at this scale, a hierarchical taxonomy of geologic concepts is being developed, with rules and relations specific to carbon storage data; such a structure is typically referred to as an ontology. Developing ontological structures compatible with the Carbon Storage Open Database will make it possible to translate topical, temporal, and spatial metadata comprehensively and in a form suited to programmatic and machine learning use cases.
It is critical that an ontology structure remain flexible for future additions, support interconnectivity with other specialized ontologies, and reconcile the evolution of the geologic lexicon without becoming reductive or exclusionary through simplification. To balance the needs for flexibility, interconnectivity, and preservation of dataset-specific lexicon, both top-down and bottom-up approaches to building the ontology are being explored. The specialized ontology for the Carbon Storage Open Database will enable more rapid assignment of appropriate symbology standards for improved visualization, optimize topical and spatial keyword tagging, and make the data easier to use in existing data repositories such as EDX. This effort also aims to establish a foundation for using ontologies to organize other data related to geologic carbon storage in the future.
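
As an illustration of how a hierarchical taxonomy with synonym relations could support keyword tagging, the following is a minimal Python sketch. It is not the Carbon Storage Open Database implementation: the `Concept` class, the `build_index` and `tag` helpers, and the concept names and synonyms are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One node in a hypothetical concept hierarchy."""
    name: str
    parent: "Concept | None" = None
    synonyms: set = field(default_factory=set)

    def ancestors(self):
        """Return concept names from this node up to the root."""
        path, node = [], self
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path


def build_index(concepts):
    """Map every preferred name and synonym (lowercased) to its concept."""
    index = {}
    for concept in concepts:
        for label in {concept.name, *concept.synonyms}:
            index[label.lower()] = concept
    return index


def tag(keyword, index):
    """Resolve a free-text keyword to its concept and all broader concepts."""
    concept = index.get(keyword.strip().lower())
    return concept.ancestors() if concept else []


# Illustrative (assumed) concepts; a real ontology would be far richer.
formation = Concept("geologic formation")
saline = Concept(
    "saline aquifer",
    parent=formation,
    synonyms={"saline formation", "brine aquifer"},
)
index = build_index([formation, saline])

print(tag("Saline Formation", index))
```

In this sketch, a dataset keyword written with an older or alternative term still resolves to the preferred concept and inherits its broader tags, which is one way a lexicon could evolve without discarding dataset-specific wording.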