STATISTICAL MODELING OF ARSENIC IN GROUNDWATER TO SUPPORT HUMAN HEALTH STUDIES
Under a grant from the U.S. Geological Survey Powell Center, and in collaboration with epidemiologists, results from this modeling study will be compared with human health outcomes associated with arsenic exposure, including certain cancers, preterm births, and low birth weights. In addition, a machine learning technique, boosted regression trees (BRT), is being used to develop a new predictive model. Preliminary results indicate that the BRT model predicts the occurrence of elevated arsenic levels more accurately than the original model, which used logistic regression (LR).
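The LR and BRT comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the two predictors and their ranges are hypothetical, scikit-learn's GradientBoostingClassifier stands in for whatever BRT implementation the study uses, and area under the ROC curve (AUC) stands in for the study's accuracy measure, which is not stated here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 2000

# Hypothetical predictors (not the study's data): precipitation and recharge, cm/yr.
precip = rng.uniform(20, 150, n)
recharge = rng.uniform(0, 60, n)
X = np.column_stack([precip, recharge])

# Synthetic outcome: 1 = elevated arsenic. The threshold term is an invented
# nonlinearity that a tree-based model can capture and a linear logit cannot.
logit = 1.5 - 0.03 * precip - 0.04 * recharge + 2.0 * ((precip < 60) & (recharge < 15))
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Logistic regression (LR) baseline.
lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Boosted regression trees (BRT), here via gradient-boosted classification trees.
brt = GradientBoostingClassifier(
    n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0
).fit(X_train, y_train)

# Compare held-out discrimination of the two models.
auc_lr = roc_auc_score(y_test, lr.predict_proba(X_test)[:, 1])
auc_brt = roc_auc_score(y_test, brt.predict_proba(X_test)[:, 1])
print(f"LR AUC:  {auc_lr:.3f}")
print(f"BRT AUC: {auc_brt:.3f}")
```

Scoring both models on the same held-out wells keeps the comparison fair; which model scores higher on real data depends on the data and is not implied by this sketch.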
A second, related study with the CDC varies parameters in the models to assess the potential impact of drought on arsenic concentrations. In both the LR and BRT models, average annual precipitation and groundwater recharge are important variables, which suggests that drought conditions may affect the likelihood of elevated arsenic concentrations. To evaluate this possibility, precipitation and recharge values representative of drought will be incorporated into the model. The resulting changes in the probability of elevated arsenic levels will be assessed and compared with domestic well water use to estimate changes in the population exposed to elevated arsenic under drought conditions.
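The drought scenario described above can be sketched as a simple what-if calculation. All numbers here are hypothetical: the logistic coefficients, the three example areas, the 30 percent reduction used to represent drought, and the choice to estimate exposure as well users multiplied by predicted probability are assumptions for illustration, not values or methods from the study. In particular, the negative coefficients build in the assumption that drier conditions raise the probability of elevated arsenic; the study itself only says drought may affect it.

```python
import math

# Hypothetical logistic-model coefficients (not fitted values from the study).
INTERCEPT = 1.0
B_PRECIP = -0.02     # per cm/yr of average annual precipitation
B_RECHARGE = -0.05   # per cm/yr of groundwater recharge


def prob_elevated(precip_cm, recharge_cm):
    """Probability of elevated arsenic from a logistic model with the assumed coefficients."""
    z = INTERCEPT + B_PRECIP * precip_cm + B_RECHARGE * recharge_cm
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical areas with baseline climate values and domestic well users.
areas = [
    {"name": "Area A", "precip_cm": 100.0, "recharge_cm": 30.0, "well_users": 12000},
    {"name": "Area B", "precip_cm": 60.0, "recharge_cm": 12.0, "well_users": 8000},
    {"name": "Area C", "precip_cm": 40.0, "recharge_cm": 5.0, "well_users": 3000},
]

# Assumed drought scenario: precipitation and recharge both reduced by 30 percent.
DROUGHT_FACTOR = 0.7


def expected_exposed(areas, factor=1.0):
    """Sum of well users times predicted probability, with climate inputs scaled by factor."""
    return sum(
        a["well_users"] * prob_elevated(a["precip_cm"] * factor, a["recharge_cm"] * factor)
        for a in areas
    )


baseline = expected_exposed(areas)
drought = expected_exposed(areas, DROUGHT_FACTOR)

print(f"Expected exposed, baseline: {baseline:.0f}")
print(f"Expected exposed, drought:  {drought:.0f}")
print(f"Change under drought:       {drought - baseline:+.0f}")
```

Keeping the model fixed and changing only the climate inputs isolates the effect of drought on the predicted probabilities, so the difference between the two totals reflects the scenario rather than a change in the model.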