Mapping future land cover change over large areas of the United States using decision trees

Sant'anna Dias, Felipe
Stieglitz, Marc
Turk, Greg
Aral, Mustafa M.
Climate is one of the primary factors controlling vegetation distribution, so climate change is expected to have a significant impact on natural land cover. Numerous models, such as Dynamic Global Vegetation Models (DGVMs), have been developed to project potential shifts in vegetation distribution under rapid climate change. However, these models are tightly constrained in the amount of data they can process, which makes them unable to simulate vegetation distribution over large areas at very high resolution. To overcome this limitation, alternative methods based on statistical techniques have been proposed to study vegetation distribution and classify natural land cover. Machine learning is a discipline that uses computer algorithms to learn patterns and statistical rules from the correlations present in a training set, which can then be applied to predict new information. Among machine learning algorithms, the decision tree model has been widely used to classify present land cover and to account for land use modifications, making it a suitable model for statistically learning the present vegetation distribution pattern and applying it to predict future shifts in biogeography under climate change. The decision tree algorithm applied in this work is the C5.0 classification tree, which provides classified images of future vegetation cover at four large sites in the US: a region including the Jemez and Santa Fe mountains in north-central New Mexico, a region of the Blue Mountains in Oregon, a region of the North Cascades in northwest Washington, and the entire state of Wyoming. The training data used to generate current vegetation cover include 2001 USGS Land Cover maps, 50 years of mean annual temperature and annual precipitation for the period 1950 – 2000, and a Digital Elevation Model together with aspect and slope data.
Future climate data were generated using the Model E2 version of the Goddard Institute for Space Studies (GISS) General Circulation Model (GCM), downscaled and bias-corrected for the current climate data. Four future climate scenarios, RCP 2.6, RCP 4.5, RCP 6.0 and RCP 8.5, were used to generate the future climate for the target year of 2070 (the average for the 2061-2080 period). The model performed well at all four locations, achieving prediction accuracies for current land cover of 83%, 85%, 82% and 80% for the New Mexico, Oregon, Washington and Wyoming sites, respectively. Each site showed different types of projected modifications to its terrestrial ecosystems.
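The workflow described in the abstract (train a classification tree on current climate and terrain, check accuracy on current land cover, then re-apply the tree to shifted climate inputs) can be illustrated with a minimal sketch. This is not the C5.0 implementation used in the work: it is a plain entropy-based tree with binary threshold splits, without the gain ratio, pruning, or boosting that C5.0 provides. The data are synthetic, the labelling rule `toy_land_cover`, the class names, and the `WARMING_C` shift are hypothetical, and aspect and slope are omitted for brevity.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (gain, feature, threshold) of the best binary split, or None."""
    base, best = entropy(labels), None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighbouring values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            gain = base - (len(left) * entropy(left)
                           + len(right) * entropy(right)) / len(labels)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(rows, labels, depth=0, max_depth=4):
    """Leaves are class names; nodes are (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels)
    if split is None or split[0] <= 0:
        return majority
    _, f, t = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in lo], [y for _, y in lo], depth + 1, max_depth),
            build_tree([r for r, _ in hi], [y for _, y in hi], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return its class name."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def toy_land_cover(temp_c, precip_mm, elev_m):
    """Hypothetical labelling rule, used only to generate synthetic demo data."""
    if precip_mm < 350:
        return "shrubland"
    return "evergreen forest" if temp_c < 4 else "grassland"

# Synthetic grid cells: (mean annual temperature in C, annual precipitation in mm, elevation in m).
rng = random.Random(0)
cells = [(rng.uniform(-2, 15), rng.uniform(150, 900), rng.uniform(1500, 3500))
         for _ in range(400)]
labels = [toy_land_cover(*c) for c in cells]
train_x, train_y = cells[:300], labels[:300]
test_x, test_y = cells[300:], labels[300:]

# Learn the current climate-to-land-cover mapping and score it on held-out cells.
tree = build_tree(train_x, train_y)
accuracy = sum(predict(tree, x) == y for x, y in zip(test_x, test_y)) / len(test_x)

# Re-apply the same tree to shifted climate inputs; terrain stays fixed.
WARMING_C = 3.0  # hypothetical temperature shift, not an RCP-derived value
future_x = [(t + WARMING_C, p, e) for t, p, e in test_x]
changed = sum(predict(tree, now) != predict(tree, fut)
              for now, fut in zip(test_x, future_x))
print(f"held-out accuracy: {accuracy:.2f}; cells changing class: {changed}/{len(test_x)}")
```

In a real application the synthetic cells would be replaced by per-pixel values drawn from the land cover, climate, and terrain rasters, and the uniform temperature shift by the downscaled GCM fields for each scenario.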