MIXTURE-MODEL CLUSTERING OF REGIONAL GEOCHEMICAL DATA

ELLEFSEN, Karl J., U.S. Geological Survey, Box 25046, Mail Stop 964, Denver Federal Center, Denver, CO 80225, SMITH, David B., U.S. Geological Survey, MS 973, Denver Federal Center, Denver, CO 80225 and HORTON, John D., U.S. Geological Survey, Denver Federal Center, MS 973, Denver, CO 80225, ellefsen@usgs.gov

Mixture-model clustering is a statistical procedure that aids the interpretation of regional geochemical data. Because geochemical data are compositional, straightforward application of standard statistical procedures can yield erroneous results. We have therefore developed and implemented (in the R statistical programming language) a robust clustering procedure that accounts for the compositional properties of the data. All element concentrations are first transformed with the isometric log-ratio transformation. The transformed concentrations are then used to calculate robust principal components. These components are clustered using a mixture model in which the probability density functions are multivariate normal, and the conditional probabilities that a sample is associated with each density function are calculated. In addition, random samples are drawn from each density function and back-transformed to equivalent element concentrations.
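The isometric log-ratio step and its back-transformation can be illustrated with a minimal sketch. This is not the authors' R implementation: it is written in Python with the standard library only, and the function names (closure, helmert_basis, ilr, ilr_inverse) and the particular orthonormal basis are assumptions made for illustration. It also assumes that every concentration is strictly positive, since the logarithm of zero is undefined.

```python
import math

def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total`."""
    s = sum(parts)
    return [total * p / s for p in parts]

def helmert_basis(d):
    """Return d-1 orthonormal vectors of length d, each summing to zero.

    Assumption: this Helmert-type basis is one of many valid choices;
    the abstract does not say which basis was used.
    """
    basis = []
    for i in range(1, d):
        scale = math.sqrt(i / (i + 1.0))
        basis.append([scale / i] * i + [-scale] + [0.0] * (d - i - 1))
    return basis

def ilr(parts):
    """Map a D-part composition to D-1 isometric log-ratio coordinates."""
    d = len(parts)
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / d
    clr = [v - mean_log for v in logs]  # centered log-ratio
    return [sum(b * c for b, c in zip(vec, clr)) for vec in helmert_basis(d)]

def ilr_inverse(coords, total=1.0):
    """Map D-1 coordinates back to a D-part composition summing to `total`."""
    d = len(coords) + 1
    basis = helmert_basis(d)
    clr = [sum(coords[i] * basis[i][j] for i in range(d - 1)) for j in range(d)]
    return closure([math.exp(c) for c in clr], total)

# Toy three-part composition (illustrative values, not survey data).
composition = [0.6, 0.3, 0.1]
coords = ilr(composition)
recovered = ilr_inverse(coords)
```

In this sketch a composition with D parts maps to D-1 unconstrained coordinates, which is what allows the later steps (principal components and multivariate normal density functions) to be applied, and ilr_inverse plays the role of the back-transformation to element concentrations.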

The clustering procedure is evaluated with soil geochemical data from a survey of the state of Colorado (United States of America). The data comprise 959 samples with 31 element concentrations for each sample. The chosen mixture model has 4 density functions, and the calculated conditional probabilities partition the 959 samples into 4 clusters. For each cluster, most samples are spatially close together and thus are related to specific geologic features such as surficial deposits or bedrock. The independently known geochemical properties of these geologic features are consistent with the random sample concentrations, and the order statistics for the random sample concentrations are almost identical to the corresponding order statistics for the field data (i.e., the measured concentrations for those samples with high conditional probabilities). Both results suggest that the clustering procedure is accurate. Another benefit of mixture-model clustering is that the element concentrations for each cluster are approximately statistically stationary, making them suitable for additional statistical processing such as multivariate kriging.
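How conditional probabilities can partition samples into clusters is sketched below under stated assumptions. Given mixture weights and density-function parameters, the conditional probability that a sample belongs to density function k is its weighted density divided by the sum of the weighted densities over all functions. The sketch assumes diagonal covariance matrices so that it needs only the Python standard library (the abstract says only that the functions are multivariate normal), and it assumes that each sample is assigned to the function with the largest conditional probability. The function names and the two-function toy model are hypothetical and are not the fitted Colorado model.

```python
import math

def log_normal_diag(x, mean, var):
    """Log density of a multivariate normal with diagonal covariance."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * v) - (xi - m) ** 2 / (2.0 * v)
        for xi, m, v in zip(x, mean, var)
    )

def conditional_probabilities(x, weights, means, variances):
    """Conditional probability of each density function given sample x."""
    logs = [
        math.log(w) + log_normal_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [math.exp(value - top) for value in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def assign_cluster(probs):
    """Index of the density function with the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical two-function model in two dimensions (toy values only).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

near_first = conditional_probabilities([0.0, 0.0], weights, means, variances)
midway = conditional_probabilities([2.0, 2.0], weights, means, variances)
cluster = assign_cluster(near_first)
```

A sample located at the first mean receives a conditional probability close to 1 for the first function, whereas a sample midway between the two means is split evenly. This is why the abstract's comparison of order statistics uses only samples with high conditional probabilities.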

Session No. 391

T79. Geochemical Mapping at Regional to Continental Scales

Wednesday, 30 October 2013: 1:00 PM-5:00 PM

Room 601 (Colorado Convention Center)

Geological Society of America Abstracts with Programs. Vol. 45, No. 7, p.867

© Copyright 2013 The Geological Society of America (GSA), all rights reserved. Permission is hereby granted to the author(s) of this abstract to reproduce and distribute it freely, for noncommercial purposes. Permission is hereby granted to any individual scientist to download a single copy of this electronic file and reproduce up to 20 paper copies for noncommercial purposes advancing science and education, including classroom use, providing all reproductions include the complete content shown here, including the author information. All other forms of reproduction and/or transmittal are prohibited without written permission from GSA Copyright Permissions.


The Geological Society of America 2013 GSA Annual Meeting in Denver: 125th Anniversary of GSA (27-30 October 2013) Denver, Colorado, USA
