Organizational Unit:
School of Interactive Computing

Publication Search Results

  • Item
    Towards multi-modal AI systems with open-world cognition
    (Georgia Institute of Technology, 2023-04-30) Agrawal, Harsh
    A long-term goal in AI research is to build intelligent systems with 'open-world' cognition. When deployed in the wild, AI systems should generalize to novel concepts and instructions. Such an agent would need to perceive both familiar and unfamiliar concepts present in the environment, combine the capabilities of models trained on different modalities, and incrementally acquire new skills to continuously adapt to the evolving world. In this thesis, we examine how to combine complementary multi-modal knowledge with suitable forms of reasoning to enable novel concept learning. In Part 1, we show that agents can infer unfamiliar concepts in the presence of other familiar concepts by combining multi-modal knowledge with deductive reasoning. Furthermore, agents can use newly inferred concepts to update their vocabulary of known concepts and infer additional novel concepts incrementally. In Part 2, we examine how task-dependent augmentations can improve robustness in unseen environments. In Part 3, we develop realistic tasks that require understanding novel concepts. We present a benchmark to evaluate an AI system's ability to describe novel objects present in an image. We also show how embodied agents can combine perception with common-sense knowledge to perform household chores like tidying up the house, without any explicit human instruction, even in the presence of unseen objects in unseen environments. Finally, in Part 4, we show that multi-modal knowledge stored in large pre-trained models can be used to teach agents new skills, allowing the agent to perform novel tasks of increasing difficulty in a zero-shot manner.