(Georgia Institute of Technology, 2022-05)
Chaganti, Sidhartha
Multimodal learning enables networks to consider multiple perspectives, or modalities, of a scene when performing activity recognition. We propose networks that use attention to focus on salient portions of the data along both the embedding and time dimensions. Additionally, we propose a projection network that maps one modality into another, allowing a multimodal network to be used even when the second modality is unavailable at inference time. We observe that adding attention improves performance and that using projected data retains most of the performance of the multimodal architectures.
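As a minimal sketch of the projection idea described above (all names, dimensions, and the linear-layer form are illustrative assumptions, not the thesis's actual architecture): a learned projection maps embeddings from an available modality (e.g. RGB) into the embedding space of a missing modality (e.g. depth), so a multimodal classifier expecting both inputs can still run at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical embedding sizes: 64-d RGB, 32-d depth.
D_RGB, D_DEPTH, N_CLASSES = 64, 32, 10

# Projection network (a single linear layer here for brevity);
# in practice it would be trained to mimic real depth embeddings.
W_proj = rng.normal(scale=0.1, size=(D_RGB, D_DEPTH))
b_proj = np.zeros(D_DEPTH)

def project_rgb_to_depth(rgb_emb: np.ndarray) -> np.ndarray:
    """Map an RGB-space embedding into the depth embedding space."""
    return np.tanh(rgb_emb @ W_proj + b_proj)

def multimodal_logits(rgb_emb: np.ndarray, depth_emb: np.ndarray,
                      W_cls: np.ndarray) -> np.ndarray:
    """Multimodal head: classify from the concatenated embeddings."""
    fused = np.concatenate([rgb_emb, depth_emb], axis=-1)
    return fused @ W_cls

# At inference, when the depth modality is unavailable,
# substitute the projected embedding for the real one.
rgb = rng.normal(size=(D_RGB,))
depth_hat = project_rgb_to_depth(rgb)
W_cls = rng.normal(scale=0.1, size=(D_RGB + D_DEPTH, N_CLASSES))
logits = multimodal_logits(rgb, depth_hat, W_cls)
print(logits.shape)  # (10,)
```

The multimodal head is unchanged between training and inference; only its second input is swapped from a real depth embedding to a projected one, which is why the projected data can retain most of the multimodal performance.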