IMPROVING SUPERVISED MACHINE LEARNING PREDICTIONS FOR HYDROTHERMAL FAVORABILITY BY SEPARATING SIGNALS IN ELEVATION DATA
Our previous work with the datasets from the Nevada Machine Learning Project demonstrated that a training strategy that randomly selects negative training sites from locations outside expert-delineated favorable structural settings produces a better supervised machine learning model for predicting hydrothermal favorability than a training strategy that uses expert-selected negative sites. These results suggest that expert-selected negative sites may not capture all the conditions in which hydrothermal systems fail to manifest. However, both training strategies predicted a west-east geographic trend, with favorability generally higher in the west and lower in the east. This west-east trend in predicted favorability is associated with elevation across the Great Basin, which generally increases from west to east and which, in our previous work, was the feature with the largest impact on favorability predictions for the best-performing model. To untangle this mixed topographic signal, we split the original elevation feature into a detrended elevation (i.e., local topography) feature and an elevation trend (i.e., regional topography) feature, allowing us to test which properties of topography are most important for predicting hydrothermal systems.
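The abstract does not state how the elevation trend was estimated. As a minimal sketch of the idea, assuming the regional trend can be approximated by a least-squares plane fitted to elevation as a function of map coordinates, the split into the two features might look like the following. The function name `split_elevation`, the planar trend model, and the synthetic data are illustrative assumptions, not the method used in this work.

```python
import numpy as np


def split_elevation(x, y, elevation):
    """Split elevation into a regional trend and a detrended residual.

    Assumption (not from the source): the regional trend is a plane
    fitted by ordinary least squares to elevation over (x, y).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(elevation, dtype=float)

    # Design matrix for the model z ~ a + b*x + c*y
    design = np.column_stack([np.ones_like(x), x, y])
    coeffs, *_ = np.linalg.lstsq(design, z, rcond=None)

    trend = design @ coeffs   # regional topography feature
    detrended = z - trend     # local topography feature
    return trend, detrended


# Synthetic illustration only: elevation rising from west (small x) to east
# (large x), plus random local relief. These are not project data.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 500.0, 200)        # easting, km
y = rng.uniform(0.0, 500.0, 200)        # northing, km
local_relief = rng.normal(0.0, 100.0, 200)
elevation = 1200.0 + 1.5 * x + local_relief

trend, detrended = split_elevation(x, y, elevation)
```

By construction the two features sum back to the original elevation, and the detrended feature is uncorrelated with the coordinates, so a model can weigh local and regional topography separately. A higher-order surface or a smoothing filter could stand in for the plane if the regional signal is not well described as linear.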
Compared with models that use the original elevation feature, supervised machine learning models that use the separated topographic signals predict the known locations of hydrothermal systems consistently better. While the west still tends to have higher favorability, the west-east prediction bias is lessened. This work emphasizes that carefully engineering features to represent distinct and relevant geological conditions is critical when applying machine learning to geoscience and predicting natural resources.