Northeastern Section (45th Annual) and Southeastern Section (59th Annual) Joint Meeting (13-16 March 2010)

Paper No. 12
Presentation Time: 8:00 AM-12:05 PM

GEOLOGIC RANKING AS AN EXPLANATORY VARIABLE IN ESTIMATING THE PROBABILITY OF ELEVATED ARSENIC CONCENTRATIONS IN PENNSYLVANIA GROUNDWATER


GROSS, Eliza L., New Cumberland, PA 17055, egross@usgs.gov

Almost 2.2 million people in Pennsylvania drink groundwater from privately owned wells that are not regularly tested for arsenic, a known carcinogen. Arsenic data are sparse because statewide testing to determine where groundwater concentrations exceed the health-based maximum contaminant level of 10 parts per billion (ppb) is prohibitively expensive. To reduce health risks, areas with potential for elevated arsenic concentrations can be defined. The purpose of this study is to use arsenic concentrations as the dependent variable and geologic units ranked according to arsenic presence as the explanatory variable to determine the areas with the highest predicted probability of elevated arsenic concentrations in groundwater, defined in this study as greater than or equal to 4 ppb.

Predicted probabilities of elevated arsenic concentrations are calculated and mapped using a logistic regression model. The model uses a binary dependent variable derived from 5,023 water-well arsenic concentrations, indicating whether each concentration is elevated, and explanatory data extracted for each well using a Geographic Information System (GIS). The spatial dataset used as the explanatory variable consists of geologic units ranked by arsenic presence, with rankings assigned according to professional opinion obtained through arsenic studies, arsenic concentrations in stream sediments, and geologic characteristics.
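The modeling step described above can be illustrated with a minimal sketch. It is not the study's workflow or data: the well records are synthetic, the five-level rank scale and the coefficients used to generate the data are hypothetical, and the fit uses a plain Newton-Raphson routine written for this example. Only the 4 ppb threshold and the single ranked explanatory variable come from the abstract.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, iterations=25):
    """Fit P(y = 1) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    x is the explanatory variable (here a geologic rank) and y is the
    binary response (1 = elevated arsenic, 0 = not elevated).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0              # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0      # entries of the information matrix
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# --- Synthetic wells (hypothetical; NOT the study's data) ---
random.seed(0)
THRESHOLD_PPB = 4.0                # "elevated" as defined in the abstract
ranks = [random.randint(1, 5) for _ in range(5000)]   # assumed 1-5 rank scale
concentrations = []
for r in ranks:
    # Assumed generating process: higher rank -> higher chance of exceedance
    if random.random() < sigmoid(-3.5 + 0.6 * r):
        concentrations.append(random.uniform(4.0, 20.0))
    else:
        concentrations.append(random.uniform(0.0, 3.9))

# Binary dependent variable: 1 if the well is at or above the threshold
elevated = [1 if c >= THRESHOLD_PPB else 0 for c in concentrations]

b0, b1 = fit_logistic(ranks, elevated)

# Predicted probability of elevated arsenic for each rank value
predicted = {r: sigmoid(b0 + b1 * r) for r in range(1, 6)}
for r in sorted(predicted):
    print(f"rank {r}: predicted probability {predicted[r]:.2f}")
```

In a GIS setting, the per-rank probabilities would then be joined back to the geologic-unit polygons to produce a probability map; that step is omitted here.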

The logistic regression model identifies ranked geologic units as an influential variable in predicting the probability of elevated arsenic concentrations in areas of sparse data. The model results include a Standardized Estimate of 0.34 for geologic ranking, a max-rescaled R-Square of 0.09, and satisfactory model fit according to the Hosmer and Lemeshow Goodness-of-Fit Test. Predicted probabilities of elevated arsenic concentrations range from 1 to 48 percent, and the resulting map shows a predicted probability of 35 to 48 percent in eastern and north-central Pennsylvania. Pearson residuals, calculated and plotted, indicate clusters of poor predictions in southeast, southwest, and northwest Pennsylvania. Although this model provides a reasonable geologic portrayal of elevated arsenic concentrations, its predictive power could be improved by including additional explanatory variables, such as geochemical parameters, land cover characteristics, or soil properties.
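For readers unfamiliar with two of the diagnostics reported above, the following sketch shows how they are commonly computed. It assumes the max-rescaled R-Square follows the usual Nagelkerke definition (the Cox and Snell R-Square divided by its maximum attainable value) and that Pearson residuals take the standard form for binary data. The eight observations and fitted probabilities are made-up illustration values, not results from the study.

```python
import math


def log_likelihood(y, p):
    """Bernoulli log-likelihood of outcomes y given fitted probabilities p."""
    return sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
               for yi, pi in zip(y, p))


def max_rescaled_r2(y, p):
    """Max-rescaled (Nagelkerke) R-Square for a binary-response model."""
    n = len(y)
    base_rate = sum(y) / n
    ll_null = log_likelihood(y, [base_rate] * n)   # intercept-only model
    ll_model = log_likelihood(y, p)
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    upper_bound = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / upper_bound


def pearson_residuals(y, p):
    """Pearson residual (y - p) / sqrt(p * (1 - p)) for each observation."""
    return [(yi - pi) / math.sqrt(pi * (1.0 - pi)) for yi, pi in zip(y, p)]


# Made-up outcomes and fitted probabilities, for illustration only
y = [0, 0, 0, 1, 0, 1, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]

r2 = max_rescaled_r2(y, p)
r2_null = max_rescaled_r2(y, [0.5] * len(y))   # model no better than the base rate
residuals = pearson_residuals(y, p)
print(f"max-rescaled R-Square: {r2:.3f}")
print("Pearson residuals:", [round(r, 2) for r in residuals])
```

Large positive residuals mark wells with elevated arsenic where the model predicted a low probability, and large negative residuals mark the reverse; mapping them is one way spatial clusters of poor predictions become visible.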