DIGITAL SOIL PARENT MATERIAL MAPPING: MODELED VERSUS MAPPED ACCURACY AND VARIABLE IMPORTANCE

Bateman McDonald, Jacob M.

Paper No. 158-3

Presentation Time: 5:55 PM

BATEMAN MCDONALD, Jacob M., Lewis F. Rogers Institute for Environmental and Spatial Analysis, University of North Georgia, 3820 Mundy Mill Rd, Oakwood, GA 30566

This research used a single county's soil survey and the random forest classification algorithm to predict the distribution of soil parent material in the Southern Blue Ridge Mountains of western North Carolina. Three training set selection techniques (area-dependent, equal sample, and a hybrid approach) were used to randomly select points from the different soil parent material types (e.g., alluvium, colluvium, residuum). Each training point was attributed with a large number of land surface characteristics, including standard first- and second-order derivatives of a DEM (e.g., slope and curvature) and variables that describe landscape position relative to ridges, hillslopes, and bottomlands. To determine the variables that best describe each parent material type, an iterative variable reduction process was used to create a series of random forest models from a progressively reduced (decorrelated) predictor set. Although the random forest models reported high producer's and user's accuracies, the mapped predictions were highly accurate only for the soil parent material classes that were best represented in the training set. The poor prediction of either minority or majority classes was due to overlap in variable space between some classes (e.g., residuum and old alluvium occupy similar landscape positions). The hybrid training set provided the best model and mapped accuracies, but the overlap in variable space continued to be a source of error in the mapped predictions. These results call into question the use of initial variable importance measures to determine the absolute importance of variables in random forest models. Instead, this research suggests an iterative method in which correlated variables are progressively discarded from the predictor set so that ‘true’ variable importance can be determined.
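The three training set selection techniques named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the hybrid set was built, so the version below assumes it is a simple blend of the area-dependent and equal-sample quotas. The function names (`allocate`, `draw_training_points`), the `blend` parameter, and the cell counts are all hypothetical and are not taken from the study.

```python
import random


def allocate(class_counts, total, scheme, blend=0.5):
    """Return the number of training points to draw from each class.

    class_counts: mapped cells per parent material class.
    scheme: "area" (proportional to mapped area), "equal", or "hybrid".
    blend: weight on the area-dependent quota in the hybrid scheme
           (an assumption; the abstract does not define the hybrid).
    """
    n_cells = sum(class_counts.values())
    n_classes = len(class_counts)
    quotas = {}
    for name, count in class_counts.items():
        proportional = total * count / n_cells
        equal = total / n_classes
        if scheme == "area":
            raw = proportional
        elif scheme == "equal":
            raw = equal
        elif scheme == "hybrid":
            raw = blend * proportional + (1 - blend) * equal
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        # Never request more points than the class has cells.
        quotas[name] = min(count, round(raw))
    return quotas


def draw_training_points(cells_by_class, total, scheme, seed=0):
    """Randomly select cells per class, without replacement."""
    rng = random.Random(seed)
    counts = {name: len(cells) for name, cells in cells_by_class.items()}
    quotas = allocate(counts, total, scheme)
    return {name: rng.sample(cells_by_class[name], quotas[name])
            for name in quotas}


# Hypothetical cell counts, for illustration only (not data from the study).
example_counts = {"residuum": 6000, "colluvium": 3000, "alluvium": 1000}
example_cells = {name: list(range(n)) for name, n in example_counts.items()}
hybrid_sample = draw_training_points(example_cells, 300, "hybrid")
```

With these made-up counts and 300 points, the area-dependent scheme gives the minority class (alluvium) 30 points, the equal scheme gives it 100, and the assumed hybrid gives it 65. This is the trade-off between class representation and mapped proportions that the abstract describes.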

Session No. 158

D29. Quaternary Geology II

Wednesday, 28 October 2020: 5:30 PM-8:00 PM

Geological Society of America Abstracts with Programs. Vol 52, No. 6
doi: 10.1130/abs/2020AM-358348

© Copyright 2020 The Geological Society of America (GSA), all rights reserved. Permission is hereby granted to the author(s) of this abstract to reproduce and distribute it freely, for noncommercial purposes. Permission is hereby granted to any individual scientist to download a single copy of this electronic file and reproduce up to 20 paper copies for noncommercial purposes advancing science and education, including classroom use, providing all reproductions include the complete content shown here, including the author information. All other forms of reproduction and/or transmittal are prohibited without written permission from GSA Copyright Permissions.

GSA 2020 Connects Online
