Large scale machine learning for geospatial problems in computational sustainability

Robinson, David Caleb
Dilkina, Bistra
The UN laid out 17 Sustainable Development Goals as part of “The 2030 Agenda for Sustainable Development”. Each goal consists of broad targets, such as increasing the percentage of forested land (indicator 15.1.1), for the world to work towards in order to achieve a more sustainable future. Part of achieving these goals involves determining how to measure the progress being made towards them. Measurements of progress are necessary to ensure accountability, determine where resources are most needed, and weigh the effectiveness of existing sustainability efforts. However, many of the targets and indicators, like “the percentage of forested land”, are prohibitively difficult to measure at global scales without algorithmic support. Tackling these and other problems that arise in the pursuit of sustainability goals presents unique challenges in machine learning. My dissertation addresses several such challenges and explores how to enable large-scale machine learning with geospatial data in a variety of application areas:
(1) With remotely sensed imagery in land cover mapping and human population prediction. We use convolutional neural networks to tackle the land cover mapping problem, i.e., segmenting aerial imagery into different land cover classes (water, forests, fields, etc.). We develop methods for improving the spatial generalization of land cover models, a human-in-the-loop active learning framework for fine-tuning such models to new areas, and a self-supervised training method for initializing such models when labeled data is scarce or unavailable. We use our models to generate the first 1 m resolution land cover map covering the continental United States. We also predict human population density from satellite imagery using convolutional neural networks, then aggregate model features over administrative boundaries to learn zone-level models for predicting population counts.
(2) With human migration data. We develop a new loss function for training neural networks to predict human migrations between US counties. We further couple human migration and sea level rise models in a general framework for predicting population distributions under different future sea level rise scenarios. Our results highlight how sea level rise could have farther-reaching indirect effects (through changing migration patterns) in addition to its direct effects along coastlines.