South-Central Section - 47th Annual Meeting (4-5 April 2013)

Paper No. 19-9
Presentation Time: 1:30 PM-5:30 PM

USING ARTIFICIAL NEURAL NETWORK AND MULTIPLE LOGISTIC REGRESSION MODELS TO PREDICT BACTERIA CONCENTRATIONS IN NATURAL WATERS


ALDRIDGE, Vaden J.1, GARZA, Daniel A.1, WARD, James W.2 and LOUDER, Jarrett3, (1)Physics and Geosciences, Angelo State University, ASU Station #10904, San Angelo, TX 76909-0904, (2)Physics and Geosciences, Angelo State University, ASU Station #10904, San Angelo, TX 76909, (3)The Institute of Environmental and Human Health, Texas Tech University, Lubbock, TX 79409-1163, valdridge@angelo.edu

This project focuses on predicting “safe for contact” or “unsafe for contact” levels of fecal coliform (FC) and/or Escherichia coli (E. coli) concentrations using multiple logistic regression (MLR) and artificial neural network (ANN) models. Two separate datasets were modeled using physicochemical data to predict bacterial concentrations: the first from a karst aquifer in the central Bluegrass Region of Kentucky, and the second from the Concho River system within the city limits of San Angelo in arid West Texas. Physicochemical parameters used in these models consisted of pH, electrical conductivity, water temperature, and precipitation amount within the 24 hours prior to sampling, while microbial parameters included E. coli and FC concentrations in colony-forming units (cfu) per 100 mL. “Unsafe for contact” bacterial levels were determined by applying the Environmental Protection Agency (EPA) guidelines for primary contact standards for both E. coli and FC. These classifications were set as binary dependent variables in the models (e.g., FC values “unsafe for contact” were set to 1 [i.e., > 200 cfu/100 mL] and values “safe for contact” were set to 0 [i.e., < 200 cfu/100 mL]), with the remaining parameters treated as independent variables. An MLR model using only physicochemical parameters correctly predicted “safe for contact” conditions 65.6% of the time and “unsafe for contact” conditions 69.2% of the time. ANN modeling is ongoing and shows promise; its results are expected to be similar to or slightly better than those of the MLR models. Once working models of the existing datasets are finalized for each location, it is hoped that they will provide a long-term water quality evaluation tool that allows general water quality to be assessed quickly from physicochemical data alone, without costly microbial analysis.
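
The workflow described above (binary labeling at the 200 cfu/100 mL FC threshold, a logistic regression on the four field measurements, and separate "safe" and "unsafe" prediction rates) can be illustrated with a minimal Python sketch. This is not the authors' model: the sample values are invented purely for illustration, the function names are hypothetical, the plain gradient-descent fit and the 0.5 probability cutoff are assumptions, and the percentages it prints are training-set rates on toy data that say nothing about the results reported in the abstract.

```python
import math

FC_LIMIT = 200.0  # cfu/100 mL; FC threshold quoted in the abstract


def label_unsafe(fc, limit=FC_LIMIT):
    """Return 1 ("unsafe for contact") when the count exceeds the limit, else 0.

    Assumption: a count exactly equal to the limit is labeled safe; the
    abstract only specifies > 200 and < 200.
    """
    return 1 if fc > limit else 0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b, cutoff=0.5):
    """Classify each row as 1 (unsafe) or 0 (safe); the cutoff is an assumption."""
    return [1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= cutoff
            else 0 for xi in X]


def per_class_accuracy(y_true, y_pred):
    """Percent of safe and of unsafe samples predicted correctly.

    Assumes both classes occur at least once in y_true.
    """
    safe = [p for t, p in zip(y_true, y_pred) if t == 0]
    unsafe = [p for t, p in zip(y_true, y_pred) if t == 1]
    return (100.0 * safe.count(0) / len(safe),
            100.0 * unsafe.count(1) / len(unsafe))


# Invented toy samples, NOT data from the study:
# (pH, conductivity uS/cm, water temp C, 24 h precipitation mm, FC cfu/100 mL)
SAMPLES = [
    (7.8, 950, 18.0, 0.0, 40),
    (7.6, 900, 21.5, 0.0, 120),
    (7.9, 1010, 24.0, 1.0, 90),
    (7.4, 720, 23.0, 12.0, 650),
    (7.3, 680, 25.5, 20.0, 1400),
    (7.7, 880, 19.0, 2.0, 310),
    (7.5, 760, 22.0, 8.0, 180),
    (7.2, 640, 26.0, 30.0, 2200),
]

X = standardize([s[:4] for s in SAMPLES])   # independent variables
y = [label_unsafe(s[4]) for s in SAMPLES]   # binary dependent variable
w, b = fit_logistic(X, y)
pct_safe, pct_unsafe = per_class_accuracy(y, predict(X, w, b))
print(f"safe predicted correctly: {pct_safe:.1f}%")
print(f"unsafe predicted correctly: {pct_unsafe:.1f}%")
```

Reporting the two rates separately, as the abstract does, shows whether a model errs more often by missing unsafe conditions or by flagging safe water; a real evaluation would compute them on samples held out from fitting.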