DEVELOPMENT OF EMPIRICAL MODELS OF NATURAL METHANE OCCURRENCE IN SHALLOW GROUNDWATER OVERLYING THE MARCELLUS SHALE USING MACHINE LEARNING METHODS
We measured methane concentrations in 137 domestic wells across a 10,230 km² area of southern NY. For each well, we determined the topographic position (valley or upland), the geologic unit of water extraction, the chemical water type, and the distances to the nearest fault, lineament, and active or other conventional gas well. Our data, like those of similar studies, show significant correlations between methane concentrations and potential explanatory variables (e.g., water type and landscape position) that are linked to conceptual models of the hydrogeologic processes driving methane occurrence. Our work further suggests that empirical models predicting natural methane occurrence are most effective when they account for interactions among explanatory variables using machine learning methods, such as decision tree analysis, rather than assessing only the individual correlations between methane and isolated explanatory variables.
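To illustrate why interactions among variables can matter more than individual correlations, the following minimal sketch compares exceedance rates grouped by one variable at a time with rates grouped by two variables jointly. The joint groups are the kind of subgroup a decision tree isolates through successive splits. This is a hand-rolled grouping, not a decision tree library and not the procedure used in this study. The records are synthetic and hypothetical, and the "Ca-rich" label and the `exceedance_rate` helper are assumptions introduced only for illustration.

```python
from collections import defaultdict

# Hypothetical, synthetic records (NOT study data):
# (topographic position, water type, methane > 1 mg/L)
WELLS = (
    [("valley", "Na-rich", True)] * 5
    + [("valley", "Na-rich", False)] * 5
    + [("valley", "Ca-rich", True)] * 1
    + [("valley", "Ca-rich", False)] * 19
    + [("upland", "Na-rich", True)] * 1
    + [("upland", "Na-rich", False)] * 19
    + [("upland", "Ca-rich", False)] * 50
)


def exceedance_rate(records, key):
    """Fraction of wells above the threshold in each group defined by key(record)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [exceedances, total wells]
    for rec in records:
        group = key(rec)
        counts[group][0] += rec[2]  # True counts as 1, False as 0
        counts[group][1] += 1
    return {g: hits / total for g, (hits, total) in counts.items()}


# One variable at a time (analogous to assessing individual correlations).
by_position = exceedance_rate(WELLS, lambda r: r[0])
by_water = exceedance_rate(WELLS, lambda r: r[1])

# Both variables jointly (analogous to the leaves of a two-level tree).
by_both = exceedance_rate(WELLS, lambda r: (r[0], r[1]))

for group, rate in sorted(by_both.items()):
    print(group, f"{rate:.2f}")
```

In this synthetic set, valley wells and Na-rich wells each exceed the threshold more often than wells overall, but the valley and Na-rich combination stands out far more than either variable alone. A grouping on single variables would understate that subgroup.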
Combining pre-HVHF (before high-volume hydraulic fracturing) methane and water quality data from this study and from prior studies in NY and Pennsylvania (n = 724), we found that although only 7.7% of domestic wells had >1 mg/L dissolved methane, 52% of valley wells producing Na-rich water exceeded that concentration. Our results suggest that high methane concentrations in valley wells producing Na-rich water are likely to be naturally occurring rather than the result of gas production by HVHF.
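As a rough check on the scale of these figures, the arithmetic can be sketched as follows, assuming the 7.7% applies to all 724 combined wells. The size of the valley, Na-rich subgroup is not reported here, so only the overall count and the ratio of the two rates are computed. The variable names are illustrative.

```python
N_WELLS = 724          # combined sample size reported in the text
OVERALL_RATE = 0.077   # fraction of all wells with >1 mg/L dissolved methane
SUBGROUP_RATE = 0.52   # fraction of valley, Na-rich wells with >1 mg/L

# Approximate number of wells above 1 mg/L in the combined data set
# (assumes the 7.7% refers to all 724 wells).
approx_exceeding = round(N_WELLS * OVERALL_RATE)

# How many times more frequent exceedance is in the subgroup than overall.
enrichment = SUBGROUP_RATE / OVERALL_RATE

print(approx_exceeding)      # 56
print(round(enrichment, 1))  # 6.8
```

Under that assumption, about 56 wells exceeded 1 mg/L overall, and exceedance was roughly seven times more frequent among valley wells producing Na-rich water.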