ENHANCING MULTIVARIATE STATISTICAL CHARACTERIZATION OF HYDROCHEMICAL GROUNDWATER DATA: A COMPARATIVE ANALYSIS OF CLUSTERING METHODS

ASANTE, Joseph¹, KREAMER, David K.¹ and CROSS, Chad L.², (1)Department of Geoscience, University of Nevada, Las Vegas, Nevada, 4505 Maryland Parkway, Las Vegas, NV 89154/4010, (2)Epidemiology & Biostatistics Concentration, School of Public Health, University of Nevada, Las Vegas, Nevada, 4505 Maryland Parkway, Las Vegas, NV 89154/3064, asantej@unlv.nevada.edu

Many studies have characterized hydrochemical data using graphical and multivariate statistical methods. Compared with graphical methods, multivariate clustering techniques and Principal Component Analysis are widely used by researchers because they can handle large datasets and a large number of parameters. However, the procedures of clustering techniques and Principal Component Analysis can be subjective. This subjectivity calls into question the significance of the hydrochemical facies delineated with these methods, limits use of the full potential of hydrochemical data, and reduces confidence in interpreting the hydrochemical facies to solve hydrologic problems.

We hypothesized that Multiple Discriminant Function Analysis and Cross-Tabulation allow quantitative decisions to be made about which clustering technique to use for a hydrochemical dataset, how many hydrochemical facies are significant, and how hydrochemical data transformation, analytical errors, and outliers affect a clustering technique. The goal was to optimize the cluster-analytic characterization of a hydrochemical dataset by integrating quantitative decisions into the cluster analysis. We found quantitatively that the Hierarchical Clustering method, using within-groups linkage with squared Euclidean distance, was the best method for our hydrochemical data, and that six hydrochemical facies are significant for the dataset. Inappropriate data transformation significantly affected the delineation of the hydrochemical facies (Cramer’s V < 0.8). In addition, the hydrochemical facies delineated using all the data and those delineated after separately removing the hydrochemical outlier samples (7%) and the samples with analytical errors (19%) were found to be regionally similar (Cramer’s V > 0.8).
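The cross-tabulation comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure or software. It assumes NumPy and SciPy, uses synthetic data in place of hydrochemical samples, and substitutes average (between-groups) linkage for within-groups linkage, which SciPy does not offer. The function names `cramers_v`, `cluster_labels`, and `demo_outlier_removal` are hypothetical, as are the three synthetic groups and the six samples flagged for removal.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cramers_v(labels_a, labels_b):
    """Cramér's V for the cross-tabulation of two label vectors on the same samples."""
    a_vals, a_idx = np.unique(np.asarray(labels_a).ravel(), return_inverse=True)
    b_vals, b_idx = np.unique(np.asarray(labels_b).ravel(), return_inverse=True)
    table = np.zeros((a_vals.size, b_vals.size))
    np.add.at(table, (a_idx, b_idx), 1)  # build the contingency counts
    n = table.sum()
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    k = min(table.shape) - 1
    if k == 0:
        return float("nan")  # undefined when either partition has one group
    return float(np.sqrt(chi2 / (n * k)))


def cluster_labels(data, n_clusters):
    """Hierarchical clustering cut into a fixed number of groups."""
    # Assumption: average linkage stands in for within-groups linkage,
    # which is not available in SciPy.
    tree = linkage(data, method="average", metric="sqeuclidean")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


def demo_outlier_removal(seed=0, n_clusters=3):
    """Compare groups from all samples with groups after dropping a few samples."""
    rng = np.random.default_rng(seed)
    # Synthetic stand-in data: three groups of 30 samples, four variables each.
    centers = np.array([[0.0] * 4, [10.0] * 4, [20.0] * 4])
    data = np.vstack([c + rng.normal(scale=0.5, size=(30, 4)) for c in centers])
    # Hypothetical: pretend six samples were flagged as outliers and removed.
    keep = np.ones(len(data), dtype=bool)
    keep[rng.choice(len(data), size=6, replace=False)] = False
    labels_all = cluster_labels(data, n_clusters)
    labels_subset = cluster_labels(data[keep], n_clusters)
    # Cross-tabulate only the samples present in both runs.
    return cramers_v(labels_all[keep], labels_subset)


if __name__ == "__main__":
    print("Cramér's V:", demo_outlier_removal())
```

Cramér's V is unaffected by how the groups are numbered, so two runs that produce the same partition with different labels still score 1. A value near 1 means the two groupings assign the shared samples almost identically, and a value near 0 means the groupings are unrelated.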

Session No. 16

T14. Water Resources of the Densely Populated Alluvial Valleys of the Western States—Processes

Thursday, 19 May 2011: 8:00 AM-12:00 PM

Willow (Riverwoods Conference Center)

Geological Society of America Abstracts with Programs. Vol. 43, No. 4, p. 60

© Copyright 2011 The Geological Society of America (GSA), all rights reserved. Permission is hereby granted to the author(s) of this abstract to reproduce and distribute it freely, for noncommercial purposes. Permission is hereby granted to any individual scientist to download a single copy of this electronic file and reproduce up to 20 paper copies for noncommercial purposes advancing science and education, including classroom use, providing all reproductions include the complete content shown here, including the author information. All other forms of reproduction and/or transmittal are prohibited without written permission from GSA Copyright Permissions.


Rocky Mountain (63rd Annual) and Cordilleran (107th Annual) Joint Meeting (18–20 May 2011)
