Title:
Attention-Enhanced Multimodal Learning with Projected Modalities

Author(s)
Chaganti, Sidhartha
Advisor(s)
Ploetz, Thomas
Abstract
Multimodal learning enables networks to consider multiple perspectives, or modalities, of a scene when performing activity recognition. We propose networks that use attention to focus on selected portions of the data along both the embedding and time dimensions. We also propose a projection network that maps one modality to another, so that a multimodal network can be used even when the second modality is unavailable at inference time. We observe that adding attention improves performance and that using projected data retains most of the performance of the multimodal architectures.
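The abstract does not specify the architecture, so the following is only an illustrative sketch of the three ideas it names: attention over time, attention over embedding dimensions, and projecting an available modality into an estimate of a missing one. All function names, the linear projection, the fusion by concatenation, and the toy values are assumptions made for illustration and are not taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(seq, query):
    """Pool a (T x D) sequence into one D-vector using softmax weights over time."""
    # Score each time step by its dot product with a (hypothetical) query vector.
    scores = [sum(x * q for x, q in zip(step, query)) for step in seq]
    weights = softmax(scores)
    dim = len(seq[0])
    pooled = [sum(w * step[d] for w, step in zip(weights, seq)) for d in range(dim)]
    return pooled, weights


def embedding_attention(vec, gate_logits):
    """Reweight each embedding dimension with softmax weights over dimensions."""
    weights = softmax(gate_logits)
    return [w * v for w, v in zip(weights, vec)]


def project(vec, matrix, bias):
    """Linear stand-in for a projection network: modality A -> estimated modality B."""
    return [sum(m * v for m, v in zip(row, vec)) + b for row, b in zip(matrix, bias)]


def fuse(vec_a, vec_b):
    """Assumed fusion step: concatenate the two modality embeddings."""
    return vec_a + vec_b


# Toy input: modality A with T=3 time steps and D=2 features (made-up values).
seq_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled_a, time_weights = temporal_attention(seq_a, query=[1.0, 1.0])
attended_a = embedding_attention(pooled_a, gate_logits=[0.0, 0.0])

# Modality B is treated as unavailable at inference time, so it is estimated from A.
projected_b = project(
    attended_a,
    matrix=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    bias=[0.0, 0.0, 0.0],
)
fused = fuse(attended_a, projected_b)
```

In a trained system the query, gate logits, and projection weights would be learned parameters, and the projection would presumably be trained on data where both modalities are present; here they are fixed by hand only to keep the sketch self-contained.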
Date Issued
2022-05
Resource Type
Text
Resource Subtype
Undergraduate Thesis