Title:
Attention-Enhanced Multimodal Learning with Projected Modalities
Author(s)
Chaganti, Sidhartha
Advisor(s)
Ploetz, Thomas
Editor(s)
Collections
Supplementary to
Permanent Link
Abstract
Multimodal learning enables networks to consider multiple perspectives, or modalities, of a scene when performing activity recognition. We propose networks that use attention to focus on selected portions of the data along both the embedding space and time. Additionally, we propose a projection network that maps one modality to another, so that a multimodal network can still be used when the second modality is unavailable at inference time. We observe that adding attention improves performance and that using projected data retains most of the performance of the multimodal architectures.
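The ideas in the abstract can be illustrated with a minimal sketch. It is not the thesis's architecture: the function names, the dot-product scoring, the softmax gating over embedding dimensions, the linear projection, and fusion by concatenation are all assumptions chosen for illustration, written in plain Python so the arithmetic is visible.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def temporal_attention(seq, w):
    """Attention along time (assumed form): score each time step by its
    dot product with w, softmax the scores, and return the weighted sum."""
    scores = [sum(x * wi for x, wi in zip(step, w)) for step in seq]
    alphas = softmax(scores)
    dim = len(seq[0])
    pooled = [sum(a * step[d] for a, step in zip(alphas, seq)) for d in range(dim)]
    return pooled, alphas

def embedding_attention(vec, gate_logits):
    """Attention along the embedding axis (assumed form): reweight each
    dimension by a softmax over hypothetical per-dimension logits."""
    betas = softmax(gate_logits)
    return [b * v for b, v in zip(betas, vec)]

def project(vec, matrix):
    """Hypothetical projection network reduced to one linear map from
    modality-A features to stand-in modality-B features."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def fuse(feat_a, feat_b):
    """Fusion by concatenation (an assumption; other schemes are possible)."""
    return feat_a + feat_b

# Modality A is observed; modality B is missing at inference time,
# so its features are replaced by a projection of modality A.
seq_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]        # 3 time steps, 2 dims
pooled_a, alphas = temporal_attention(seq_a, [0.0, 0.0])
gated_a = embedding_attention(pooled_a, [0.0, 0.0])
projected_b = project(gated_a, [[1.0, 0.0], [0.0, 1.0]])
fused = fuse(gated_a, projected_b)
```

In a trained model the scoring vector, gate logits, and projection matrix would be learned parameters; here they are fixed placeholder values.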
Sponsor
Date Issued
2022-05
Extent
Resource Type
Text
Resource Subtype
Undergraduate Thesis