GSA Connects 2023 Meeting in Pittsburgh, Pennsylvania

Paper No. 46-11
Presentation Time: 8:00 AM-5:30 PM

APPLICATION OF MACHINE LEARNING TECHNIQUES TO A WELL MONITORING NETWORK


JACKSON, Ryan, Sandia National Laboratories, 4100 National Parks Hwy, Carlsbad, NM 88220

Groundwater levels (GWL) in wells are difficult to predict, and accurate forecasts could help constrain the behavior of recharge. Here we test the hypothesis that machine learning algorithms can accurately predict GWL one week & one month ahead within the Culebra Member of the Rustler Formation around the Waste Isolation Pilot Plant (WIPP) near Carlsbad, New Mexico.

GWL data were collected via down-hole pressure transducers from 29 wells that monitor the Culebra groundwater as part of the WIPP well monitoring network. Weather data (daily high & low temperatures & precipitation) provided by NOAA & the Culebra groundwater data were averaged into weekly & monthly datasets. Four algorithms were used to develop predictive models: Linear Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), & Random Forest (RF). Linear Regression is not strictly a machine learning algorithm, but it provides a baseline to gauge the effectiveness of other models.
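The averaging step described above could be sketched roughly as follows. This is only an illustration using pandas on synthetic daily values; the column names (`gwl_m`, `tmax_c`, `tmin_c`, `precip_mm`), the units, and the week and month boundaries are assumptions, not details taken from the study.

```python
import pandas as pd

# Synthetic daily records standing in for a single well.
# All column names and values here are hypothetical.
dates = pd.date_range("2020-01-01", periods=60, freq="D")
daily = pd.DataFrame(
    {
        "gwl_m": [float(i) for i in range(60)],  # placeholder water levels
        "tmax_c": 20.0,                          # placeholder daily high
        "tmin_c": 5.0,                           # placeholder daily low
        "precip_mm": 0.0,                        # placeholder precipitation
    },
    index=dates,
)

# Average the daily values into weekly bins (weeks ending on Sunday,
# the pandas default for "W") and calendar-month bins ("MS" labels
# each bin by the first day of the month).
weekly = daily.resample("W").mean()
monthly = daily.resample("MS").mean()
```

The weekly and monthly frames then serve as two separate inputs, one per prediction horizon.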

Each algorithm was used to develop models for each well. The data for each well were randomly split: 70% as a training dataset & 30% as a test dataset. Ten-fold cross-validation was used to tune the hyperparameters of each model during the training phase. Models were then tested against partial data from the 30% test dataset. Model objectives were both weekly & monthly predictions. The mean absolute error, correlation coefficient, & root mean square error of the model results were evaluated to assess model accuracy.
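A minimal sketch of this workflow for one well and one algorithm (RF), assuming scikit-learn. The synthetic features and target, the hyperparameter grid, the scoring choice, and the random seeds are all hypothetical and are not taken from the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for one well: three hypothetical predictors
# and a target water level with a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + rng.normal(scale=0.05, size=200)

# Random 70% / 30% split into training and test datasets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Ten-fold cross-validation on the training data to pick hyperparameters.
# The grid below is an arbitrary example.
search = GridSearchCV(
    RandomForestRegressor(n_estimators=25, random_state=0),
    param_grid={"max_depth": [3, None]},
    cv=10,
    scoring="neg_mean_squared_error",
)
search.fit(X_train, y_train)

# Evaluate the tuned model on the held-out test data.
pred = search.predict(X_test)
mae = mean_absolute_error(y_test, pred)
rmse = mean_squared_error(y_test, pred) ** 0.5
r = np.corrcoef(y_test, pred)[0, 1]  # Pearson correlation coefficient
```

Swapping `RandomForestRegressor` for another estimator (for example `LinearRegression`, `SVR`, or `MLPRegressor`) and repeating per well would give one model per algorithm per well, as the text describes.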

All models produced accurate results for the monthly data at most wells, with average R² values for each method of ~0.8 out of 1.0. LR did not perform as well on the weekly data, failing badly at two wells (R² values of -0.06 & -1.03 × 10¹⁹). The ANN, RF, & SVM methods worked better for weekly data than for monthly data. Although the error ranges of the algorithms’ accuracy averaged across all wells overlapped, RF consistently produced the most accurate results for both weekly & monthly data & at most individual wells.
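The negative values reported for LR make sense if R² is read as the coefficient of determination, 1 - SS_res/SS_tot, evaluated on held-out data. That reading is an assumption, since the abstract does not define the statistic. Under that definition the value drops below zero whenever a model's squared error exceeds that of simply predicting the mean, as the toy numbers below show.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values, not data from the study.
observed = [1.0, 2.0, 3.0]
close_fit = r_squared(observed, [1.1, 1.9, 3.2])  # small residuals, R² near 1
poor_fit = r_squared(observed, [3.0, 2.0, 1.0])   # residuals exceed variance, R² < 0
```

Predictions that diverge by many orders of magnitude would drive the same ratio to very large negative values.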

The key conclusion is that ANN, RF, & SVM can be used to predict GWL one week and one month ahead in the Culebra; these algorithms can be applied to the Culebra to better understand how recharge works in this system.

SNL is managed and operated by NTESS under DOE NNSA contract DE-NA0003525