GSA 2020 Connects Online

Paper No. 103-7
Presentation Time: 7:15 PM

MACHINE-LEARNING TO PREDICT MANGANESE AND ARSENIC IN GROUNDWATER OF THE MISSISSIPPI RIVER VALLEY ALLUVIAL AQUIFER, CENTRAL UNITED STATES


KNIERIM, Katherine J., U.S. Geological Survey, Lower Mississippi-Gulf Water Science Center, 401 Hardin Road, Little Rock, AR 72211 and KINGSBURY, James A., U.S. Geological Survey, Lower Mississippi-Gulf Water Science Center, 640 Grassmere Park, Nashville, TN 37211

Machine-learning (ML) methods are being used to make predictions in hydrologic systems, including mapping predicted groundwater quality across regional aquifers in areas that have not been sampled. Boosted regression trees (BRT), a type of ML, were used to predict manganese (Mn) and arsenic (As) in groundwater of the Mississippi River Valley alluvial aquifer (MRVA) within the south-central United States (US). In 2015, the MRVA was the most heavily used aquifer for irrigation in the US, supporting an almost $12 billion agricultural economy. High concentrations of trace elements in the MRVA are caused by geogenic sources in aquifer sediments and anoxic conditions. Both Mn and As occur at concentrations greater than drinking-water standards, and some portions of the aquifer are used for domestic and public drinking-water supply. BRT models were used to map Mn concentration (continuous model) and the probability of As concentration exceeding 10 µg/L, the maximum contaminant level for drinking water (classification model). Explanatory variables for the BRT models included attributes associated with well construction (such as depth), surficial variables (such as soil texture), electrical resistivity of aquifer materials, and output from a regional MODFLOW-2005 groundwater-flow model (such as groundwater flux). Because continuous BRT models minimize root mean square error during model training, high and low Mn concentrations were not predicted well by the BRT model. Despite not reproducing the full range of observed Mn concentrations, the BRT models captured the spatial heterogeneity of Mn concentrations observed in the MRVA. A relatively small As dataset (n = 732) with a low proportion of threshold exceedances (11%) resulted in a model whose accuracy in predicting As exceeding 10 µg/L was only slightly better than that of assuming no wells have high As concentrations. The probability of exceeding 10 µg/L As tended to be highest along the Mississippi River.
Mapping predictions of trace elements with ML methods was possible in the MRVA, but the models would benefit from a greater number of training samples and from higher-resolution explanatory variables that characterize aquifer sediment heterogeneity and possible sources of trace elements.
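
The comparison against "assuming no high As concentrations in wells" can be made concrete with a short calculation. The sketch below is illustrative only and is not the authors' code. It uses the two figures reported above (n = 732, 11% exceedances) and assumes an exceedance count reconstructed by rounding, which may differ from the true count. It shows that a trivial classifier that never predicts an exceedance is already correct for roughly 89% of wells while detecting none of the exceedances, which is why overall accuracy alone says little about a model trained on such an imbalanced dataset.

```python
# Figures quoted in the abstract.
N_SAMPLES = 732
EXCEEDANCE_FRACTION = 0.11  # reported as a rounded percentage

# ASSUMPTION: the exact number of exceedances is not given in the abstract;
# it is reconstructed here by rounding and may be off by a few wells.
n_exceed = round(N_SAMPLES * EXCEEDANCE_FRACTION)


def always_negative_baseline(n_samples, n_positive):
    """Metrics for a classifier that predicts 'no exceedance' for every well."""
    true_negatives = n_samples - n_positive  # every non-exceedance is correct
    true_positives = 0                       # no exceedance is ever flagged
    accuracy = (true_negatives + true_positives) / n_samples
    sensitivity = true_positives / n_positive  # fraction of exceedances found
    return accuracy, sensitivity


accuracy, sensitivity = always_negative_baseline(N_SAMPLES, n_exceed)
print(f"assumed exceedances: {n_exceed}")
print(f"baseline accuracy:   {accuracy:.1%}")
print(f"baseline sensitivity: {sensitivity:.1%}")
```

Under these assumptions, a classification model has to exceed this baseline accuracy, and more importantly achieve a sensitivity above zero, before it adds information about where As exceedances occur.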