Paper No. 213-7
Presentation Time: 3:15 PM
USING MACHINE LEARNING TO CLASSIFY LANDSCAPE DISTURBANCES IN PACIFIC NORTHWEST NATIONAL PARKS
Landscape disturbances drive ecological change by facilitating succession, maintain the overall health of ecosystems by cycling nutrients, and sometimes impact human infrastructure. To track disturbance through time and space, the National Park Service uses Landsat imagery and the LandTrendr algorithm to identify patches of land that have experienced disturbance in three parks in the Pacific Northwest. This study presents a process for training and tuning a random forest machine learning model that automates labeling these patches in Mount Rainier National Park, Olympic National Park, and North Cascades National Park Service Complex with one of eight disturbance classes (annual variability, avalanche, blowdown, clearing, defoliation, fire, mass movement, and riparian change). A separate model was built for each park because each park's landscape is distinct, and each model was trained on an extensive dataset of human-labeled disturbance patches from 1987 to 2023. We introduce a novel way of evaluating the tree voting outputs of the random forest models that lets the user optimize the classification error separately for each disturbance class, trading lower error against the number of disturbance patches that must then be labeled manually. To perform this optimization, the models were tested using a withheld, labeled dataset. Preliminary random forest models showed that classifications made with a higher percentage of decision tree votes were more likely to be correct. After thresholding by decision tree vote cutoffs, 99.6% of the patches at Mount Rainier were labeled with 97.5% accuracy, 87.8% of the patches at Olympic were labeled with 95.7% accuracy, and 73.3% of the patches at North Cascades were labeled with 95.8% accuracy. We plan to further validate these results with an additional human review of a subset of the patches labeled by the model to estimate final classification error rates.
This approach may be broadly applicable for assessing model performance on individual disturbance classes to balance high accuracy against the need to automate map classifications across large landscapes.
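The per-class vote thresholding described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study, whose details are not given here. It assumes that each withheld patch comes with a predicted class, the fraction of trees that voted for that class, and a human label, and that a cutoff is chosen per class as the lowest vote fraction at which accepted patches still meet a target accuracy. The function names, the 0.95 target, and the toy records are all hypothetical.

```python
from collections import defaultdict


def pick_vote_cutoffs(records, target_accuracy=0.95):
    """Choose a vote-fraction cutoff for each predicted class.

    records: list of (predicted_class, vote_fraction, true_class) tuples
             from a withheld, human-labeled set (assumed input format).
    Returns {class: cutoff}, where cutoff is the lowest vote fraction such
    that patches at or above it meet target_accuracy. A class that never
    meets the target gets None, meaning all of its patches go to manual review.
    """
    by_class = defaultdict(list)
    for predicted, votes, truth in records:
        by_class[predicted].append((votes, predicted == truth))

    cutoffs = {}
    for cls, rows in by_class.items():
        rows.sort(key=lambda row: row[0], reverse=True)  # most confident first
        best, correct, i = None, 0, 0
        while i < len(rows):
            # Take every patch tied at this vote fraction in one step.
            j = i
            while j < len(rows) and rows[j][0] == rows[i][0]:
                correct += rows[j][1]
                j += 1
            if correct / j >= target_accuracy:
                best = rows[i][0]  # a lower cutoff that still meets the target
            i = j
        cutoffs[cls] = best
    return cutoffs


def summarize(records, cutoffs):
    """Return (fraction of patches auto-labeled, accuracy of those labels)."""
    kept = [
        (predicted, truth)
        for predicted, votes, truth in records
        if cutoffs.get(predicted) is not None and votes >= cutoffs[predicted]
    ]
    coverage = len(kept) / len(records)
    accuracy = sum(p == t for p, t in kept) / len(kept) if kept else float("nan")
    return coverage, accuracy


# Toy withheld set (invented values, for illustration only).
records = [
    ("fire", 0.95, "fire"),
    ("fire", 0.80, "fire"),
    ("fire", 0.55, "clearing"),
    ("avalanche", 0.90, "avalanche"),
    ("avalanche", 0.60, "avalanche"),
    ("avalanche", 0.40, "mass movement"),
]
cutoffs = pick_vote_cutoffs(records, target_accuracy=0.95)
coverage, accuracy = summarize(records, cutoffs)
print(cutoffs, coverage, accuracy)
```

Raising the target accuracy in this sketch pushes cutoffs upward, so fewer patches are labeled automatically and more are left for manual labeling, which is the trade-off the abstract describes.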