Data-Centric Approaches for Exploiting Metainformation and Mitigating Model Regression to Aid Neural Networks

Author(s)
Logan, Yash-Yee K.
Editor(s)
Associated Organization(s)
Supplementary to:
Abstract
In this thesis, we explore data-centric approaches for exploiting metainformation and mitigating model regression to aid neural networks. We first focus on exploiting metainformation in multiple settings. In a medical context, doing so can help automate clinical workflows and has the potential to increase technology transfer from academic research to deployment in clinical practice. Numerous studies have demonstrated that deep learning is highly effective at analyzing medical imagery, and studies have also shown that clinical records provide a vital data source for algorithms that perform diagnostic evaluations. We aid the training of neural networks on optical coherence tomography scans by incorporating metadata such as clinical data and biomarkers in unsupervised and supervised frameworks, and we use patient ID to choose training samples in active learning frameworks. In a video active learning setting, exploiting metainformation from the spatial and temporal features of a video provides insight into the time required to annotate a video sequence. We identify the sequences that take the shortest time to annotate, thus minimizing the cost of creating video dataset annotations for autonomous vehicle applications. Additionally, all neural network models are susceptible to catastrophic forgetting, the tendency to forget previously learned information when trained on new or unrelated tasks. As a result, a model's performance degrades, or regresses, from its current state as it is sequentially trained on additional data. Such regression is a safety-critical concern in domains such as medicine and autonomous systems. We design a suite of optimization functions that perform "learn," "retain," and "restore" tasks and that consistently alleviate model regression better than existing benchmarks.
Sponsor
Date
2024-05-09
Extent
Resource Type
Text
Resource Subtype
Dissertation
Rights Statement
Rights URI