Title:
Interpretation, grounding and imagination for machine intelligence

Author(s)
Vedantam, Shanmukha Ramak
Advisor(s)
Parikh, Devi
Abstract
Understanding how to model computer vision and natural language jointly is a long-standing challenge in artificial intelligence. In this thesis, I study how modeling vision and language using semantic and pragmatic considerations can help derive more human-like inferences from machine learning models. Specifically, I consider three related problems: interpretation, grounding and imagination. In interpretation, the goal is to get machine learning models to understand an image and describe its contents in natural language in a contextually relevant manner. In grounding, I study how to connect natural language to referents in the physical world, and examine whether doing so can help models learn common sense. Finally, in imagination, I study how to ‘imagine’ visual concepts completely and accurately across the full range and (potentially unseen) compositions of their visual attributes. This thesis analyzes these problems from computational as well as algorithmic perspectives and suggests exciting directions for future work.
Date Issued
2018-11-08
Resource Type
Text
Resource Subtype
Dissertation