2014 GSA Annual Meeting in Vancouver, British Columbia (19–22 October 2014)

Paper No. 127-14
Presentation Time: 12:15 PM

OPTIMIZING A REFERENCE SPECTRAL DATABASE FOR RAMAN MINERAL IDENTIFICATION


BARTHOLOMEW, Paul R., Biology & Environmental Sciences Department, University of New Haven, 300 Boston Post Rd., West Haven, CT 06516, DYAR, M. Darby, Dept. of Astronomy, Mount Holyoke College, South Hadley, MA 01075 and CAREY, C.J., School of Computer Science, University of Massachusetts at Amherst, 140 Governors Drive, Amherst, MA 01003

It is well known that the Raman spectra of minerals can be used for pattern-match mineral identification. Of the Raman spectral databases that could support pattern-match mineral ID, the largest is the RRUFF database (RRUFF.info). However, a large collection of spectra does not automatically constitute a reference database that is effective for pattern-match identification. The caveats include: duplicate spectra slow the pattern-matching process and keep the user from seeing which other distinct minerals also come close to matching; spectra with low S/N can cause spurious matches when their noise pattern resembles the noise pattern in the spectrum of the unknown; matching within a solid-solution series can falsely imply a composition when the series is not uniformly represented; not all spectra are free from non-Raman artifacts; and not all spectra are from equally well-characterized mineral samples.

A spectral database optimized for Raman mineral identification is being created via the following methods. The starting point is 8629 unoriented spectra representing 2216 mineral species, downloaded from RRUFF along with metadata such as analytical conditions, sample identification status, and chemical analysis. Each spectrum was numerically processed to extract the maximum peak height and the RMS continuum noise, and from these an S/N ratio. In addition, a cosine vector-similarity (VS) metric was calculated for every pair of spectra (distinct samples and/or distinct laser wavelengths) from the same mineral species. The S/N data are being used to identify low-S/N spectra that need to be excluded. The VS metric is being used in two ways. High values of the metric, which indicate nearly identical spectra, are used to identify duplicates that can be excluded. Low values of the metric, which indicate that spectra of the same species disagree, are used to identify sample-level problems such as fluorescence, impure samples, metamictization, and species misidentification. Finally, work is under way to identify gaps in the coverage of rock-forming and economically important minerals, including gaps in the coverage of geologically important solid-solution series. Additional samples and additional Raman spectra will be obtained to fill these gaps.
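
The two per-spectrum quantities described above can be illustrated with a minimal sketch. It is not the authors' processing code; the abstract does not give their baseline treatment, noise-region selection, or cutoff values. The sketch assumes that both spectra have already been resampled onto a common wavenumber grid, and that a peak-free index range (the hypothetical `noise_window` argument) is known for estimating continuum noise. The function names are illustrative only.

```python
import math


def signal_to_noise(intensities, noise_window):
    """Return maximum peak height divided by RMS continuum noise.

    Assumption: noise_window is a (start, stop) index range that contains
    no Raman peaks, so its scatter represents continuum noise.
    """
    start, stop = noise_window
    continuum = intensities[start:stop]
    # Mean of the peak-free region serves as the local baseline.
    baseline = sum(continuum) / len(continuum)
    # RMS deviation of the continuum about that baseline.
    rms_noise = math.sqrt(
        sum((y - baseline) ** 2 for y in continuum) / len(continuum)
    )
    peak_height = max(intensities) - baseline
    return peak_height / rms_noise if rms_noise > 0 else math.inf


def cosine_similarity(spectrum_a, spectrum_b):
    """Return the cosine of the angle between two intensity vectors.

    Assumption: both spectra share the same wavenumber grid and length.
    """
    dot = sum(a * b for a, b in zip(spectrum_a, spectrum_b))
    norm_a = math.sqrt(sum(a * a for a in spectrum_a))
    norm_b = math.sqrt(sum(b * b for b in spectrum_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero spectrum carries no pattern to compare
    return dot / (norm_a * norm_b)


# Toy example: the second spectrum is a scaled copy of the first,
# the third has its peak in a different position.
reference = [0.0, 1.0, 5.0, 1.0, 0.0, 0.0]
scaled_copy = [0.0, 2.0, 10.0, 2.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 1.0, 5.0, 1.0]

vs_duplicate = cosine_similarity(reference, scaled_copy)
vs_mismatch = cosine_similarity(reference, shifted)
```

Because the cosine metric is insensitive to overall intensity scaling, the scaled copy scores essentially 1 against the reference (a duplicate candidate), while the shifted spectrum scores much lower (a candidate for a sample-level problem). Any actual cutoffs for "high", "low", or minimum S/N would have to be chosen from the data; none are stated in the abstract.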