PRINCIPAL COMPONENT ANALYSIS OF GEOCHEMICAL DATA FROM A KARST AQUIFER IN SOUTHEAST KENTUCKY
Principal component analysis (PCA) derives linear combinations of variables such that the maximum amount of data variation is explained by the smallest number of components (eigenvectors). Components with eigenvalues greater than one tend to explain significant variation and are therefore retained in the final PCA. In this investigation, carried out in SPSS, I find two components that repeatedly arise with strong eigenvalues. For example, when considering spring discharge (Q), Ca, Mg, K, Na, Cl, NO3, SO4, and total alkalinity (Alk), the first component, which groups Q, Ca, Mg, K, NO3, and Alk, explains 55% of the variance. The second component, which groups Na, Cl, and SO4, explains 30.5% of the remaining variance. Comparison of the PCA results with time-series graphs of the original data indicates that the first component corresponds to parameters that are directly proportional to discharge; these vary according to interaction time with bedrock, namely through the dissolution of limestone by carbonic acid. The second component may be related to the entrainment of oil-field brines, which is plausible given that shallow sources of petroleum were exploited in Redmond Creek during the early 20th century.
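The retention rule described above (keep components whose eigenvalue exceeds one) can be illustrated with a minimal sketch. It is not the SPSS procedure used in this study: it assumes PCA on the correlation matrix, uses NumPy rather than SPSS, and runs on synthetic data generated from two hypothetical latent factors. The function names `pca_kaiser` and `synthetic_chemistry` are invented for illustration, and the synthetic values bear no relation to the Redmond Creek measurements; only the variable names and their grouping are borrowed from the text.

```python
import numpy as np


def pca_kaiser(X):
    """PCA on the correlation matrix, retaining components with eigenvalue > 1.

    X is an (n_samples, n_variables) array. Returns all eigenvalues in
    descending order, the fraction of total variance each explains, and the
    loadings (variable-component correlations) of the retained components.
    """
    X = np.asarray(X, dtype=float)
    # The correlation matrix puts every variable on a common, unit-variance scale.
    R = np.corrcoef(X, rowvar=False)
    # eigh is appropriate because R is symmetric; it returns ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    keep = eigvals > 1.0  # eigenvalue-greater-than-one retention rule
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, explained, loadings


def synthetic_chemistry(n_samples=200, seed=0):
    """Hypothetical data: six variables follow one factor, three follow another."""
    rng = np.random.default_rng(seed)
    f1 = rng.normal(size=n_samples)  # stand-in for a discharge-related signal
    f2 = rng.normal(size=n_samples)  # stand-in for a second, independent signal
    names = ["Q", "Ca", "Mg", "K", "NO3", "Alk", "Na", "Cl", "SO4"]
    cols = [
        (f1 if i < 6 else f2) + 0.3 * rng.normal(size=n_samples)
        for i in range(len(names))
    ]
    return names, np.column_stack(cols)


names, X = synthetic_chemistry()
eigvals, explained, loadings = pca_kaiser(X)

print("eigenvalues:", np.round(eigvals, 2))
print("retained components:", loadings.shape[1])
for j in range(loadings.shape[1]):
    print(f"component {j + 1}: {100 * explained[j]:.1f}% of total variance")
for name, row in zip(names, loadings):
    print(f"{name:>4}", np.round(row, 2))
```

The sign of each loading column is arbitrary, so only the magnitudes indicate which variables group together. Note also that `explained` here is a fraction of the total variance, which may differ from how the percentages in the text are defined.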