APPLICATION OF MACHINE LEARNING TECHNIQUES TO A WELL MONITORING NETWORK
Groundwater level (GWL) data were collected via down-hole pressure transducers from 29 wells that monitor Culebra groundwater as part of the WIPP well monitoring network. Weather data (daily high & low temperatures & precipitation) provided by NOAA & the Culebra groundwater data were each averaged into weekly & monthly datasets. Four algorithms were used to develop predictive models: Linear Regression (LR), Artificial Neural Network (ANN), Support Vector Machine (SVM), & Random Forest (RF). Linear Regression is not strictly a machine learning algorithm, but it provides a baseline against which to gauge the effectiveness of the other models.
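As an illustration of the averaging step, the following is a minimal sketch of how daily readings might be binned into weekly & monthly means. It is not the authors' processing code: the function names, the use of ISO calendar weeks, & the sample values are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean


def weekly_means(daily):
    """Average (date, value) pairs into ISO-week bins.

    Returns a dict mapping (iso_year, iso_week) to the mean value.
    Using ISO weeks is an assumption; the abstract does not say how
    weeks were delimited.
    """
    bins = defaultdict(list)
    for day, value in daily:
        iso = day.isocalendar()
        bins[(iso[0], iso[1])].append(value)
    return {week: mean(values) for week, values in sorted(bins.items())}


def monthly_means(daily):
    """Average (date, value) pairs into calendar-month bins."""
    bins = defaultdict(list)
    for day, value in daily:
        bins[(day.year, day.month)].append(value)
    return {month: mean(values) for month, values in sorted(bins.items())}


# Hypothetical daily readings: 14 consecutive days starting on a Monday.
start = date(2021, 1, 4)
daily = [(start + timedelta(days=i), 100.0 + i) for i in range(14)]

weekly = weekly_means(daily)
monthly = monthly_means(daily)
print(weekly)
print(monthly)
```

The same binning would apply to each weather variable & to each well's water-level record, so that predictors & targets share a common time step.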
Each algorithm was used to develop models for each well. The data for each well were randomly split into a training dataset (70%) & a test dataset (30%). Ten-fold cross-validation was used to fit the hyperparameters of each model during the training phase. Models were then tested against partial data from the 30% test dataset. Models were developed for both weekly & monthly predictions. The mean absolute error, correlation coefficient, & root mean square error of the model results were evaluated to assess model accuracy.
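The evaluation workflow described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the abstract does not name the software used, so the helper names, the random seed, & the sample numbers are all hypothetical. The sketch also assumes that the reported R2 is the coefficient of determination, which drops below zero whenever a model predicts worse than simply using the mean of the observations.

```python
import math
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Randomly split rows into training and test lists (70/30 by default)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n, k=10):
    """Yield (train_idx, val_idx) index lists for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (can be negative)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_res / ss_tot


# Hypothetical record of 100 time steps for one well.
rows = list(range(100))
train, test = train_test_split(rows)
folds = list(k_fold_indices(len(train), k=10))

# Made-up observations with one close and one poor set of predictions.
obs = [1.0, 2.0, 3.0, 4.0]
good = [1.1, 1.9, 3.2, 3.8]
bad = [4.0, 3.0, 2.0, 1.0]
print(mae(obs, good), rmse(obs, good), r_squared(obs, good))
print(r_squared(obs, bad))  # negative: worse than predicting the mean
```

In a full workflow, each candidate hyperparameter setting would be scored on the ten validation folds drawn from the training data only, & the held-out test data would be used once, for the final error metrics.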
All models produced accurate results for the monthly data at most wells, with average R² values for each method of ~0.8 out of 1.0. LR did not perform as well on the weekly data, failing badly at two wells (R² values of -0.06 & -1.03 × 10¹⁹). The ANN, RF, & SVM methods worked better for weekly data than for monthly data. Although the error ranges of the algorithms' accuracy averaged across all wells overlapped, RF consistently produced the most accurate results for both weekly & monthly data & at most individual wells.
The key conclusion is that ANN, RF, & SVM can be used to predict GWL one week & one month ahead in the Culebra; these algorithms can be applied to the Culebra to better understand how recharge works in this system.
SNL is managed and operated by NTESS under DOE NNSA contract DE-NA0003525