Development and Application of Data Fusion and Source Apportionment Methods over the Contiguous United States

Author(s)
Senthilkumar, Nirupama
Abstract
Exposure to air pollution has been linked to numerous adverse health effects, such as cardiovascular diseases, pulmonary diseases, cancer, and increased morbidity. Accurate air quality exposure estimates are important to understanding the drivers of negative health outcomes. Air quality simulations and observational data are used as inputs in health analyses to estimate exposure to air pollution. However, observational data are limited spatially and temporally, while simulated air quality data carry biases. This dissertation presents multiple computational techniques to provide spatiotemporally accurate and complete air quality and source impact fields for health analysis. A data fusion method, along with a random forest technique, is used to generate fused fields for particulate, gas, and trace metal species at a 12 km resolution for the years 2005-2014. The data fusion method combines gridded simulations from the Community Multiscale Air Quality (CMAQ) model and point source observational data to create more accurate, spatiotemporally complete air quality fields. The data fusion method yields high temporal correlations at observational locations for all species studied. The random forest approach uses land use variables to correct spatial bias in the annual averages of the fused field products. Together, the data fusion and random forest methods showed large improvements in spatial and temporal correlation for major particulate and gas species, and moderate improvements for trace metal pollutants. The fused field products were then used in a receptor model source apportionment analysis for particulate matter.
A receptor model, chemical mass balance with gas constraints (CMBGC), was applied in each 12 km fused field grid cell to generate spatiotemporally complete source impact fields for 10 particulate matter sources: gasoline vehicles, diesel vehicles, dust, biomass burning, coal combustion, ammonium sulfate, ammonium bisulfate, ammonium nitrate, secondary organic carbon, and sea salt. A CMBGC model was also applied to each 12 km CMAQ grid cell to assess the improvement in source impact fields gained from applying the data fusion and random forest correction. The comparison showed that data fusion was necessary to produce accurate source impact fields. This research shows that data fusion can provide large improvements in air quality fields for health analysis. Fused fields can also provide spatiotemporally complete particulate matter source impact fields that match source impacts generated from observations. The daily fused fields for 22 species and the daily source impact fields are made available for future health and air quality analysis.
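The chemical mass balance idea behind the receptor model can be illustrated with a minimal sketch. It is not the dissertation's CMBGC implementation: it omits the gas constraints entirely and assumes the simplest formulation, in which the species concentrations in one grid cell are a non-negative linear combination of source profiles, solved by uncertainty-weighted non-negative least squares. The function name `cmb_source_impacts`, the two-source profile matrix, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def cmb_source_impacts(profiles, conc, sigma):
    """Estimate non-negative source impacts for a single grid cell.

    profiles: (n_species, n_sources) mass fraction of each species per source
    conc:     (n_species,) species concentrations (observed or fused)
    sigma:    (n_species,) concentration uncertainties, used as weights
    """
    # Scale each species row by its uncertainty so that well-constrained
    # species carry more weight in the fit.
    weighted_profiles = profiles / sigma[:, None]
    weighted_conc = conc / sigma
    # Non-negative least squares keeps every source impact >= 0.
    impacts, _residual_norm = nnls(weighted_profiles, weighted_conc)
    return impacts


# Hypothetical example: 4 species, 2 sources (values are illustrative only).
profiles = np.array([
    [0.30, 0.02],
    [0.05, 0.40],
    [0.10, 0.10],
    [0.01, 0.20],
])
true_impacts = np.array([5.0, 2.0])   # assumed source impacts
conc = profiles @ true_impacts        # noise-free synthetic concentrations
sigma = 0.1 * conc + 0.01             # assumed uncertainties

estimated = cmb_source_impacts(profiles, conc, sigma)
```

Applied per grid cell and per day, a solver of this kind would yield gridded source impact fields; the gas-based constraints named in CMBGC would need to be added on top of this basic balance.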
Date
2022-07-20
Resource Type
Text
Resource Subtype
Dissertation