Title
Scaling Online Reinforcement Learning In Embodied AI To 64K Steps
Author(s)
Elawady, Ahmad Ibrahem
Advisor(s)
Batra, Dhruv
Abstract
Intelligent embodied agents need to quickly adapt to new scenarios by integrating long histories of experience into decision-making. For instance, a robot in an unfamiliar house initially would not know the locations of objects needed for its tasks and might perform inefficiently. As it gathers more experience, however, it should learn the layout of its environment and remember where objects are, allowing it to complete new tasks more efficiently. Current methods struggle to maintain and utilize a long history of thousands of observations. To enable such rapid adaptation to new tasks, we present ReLIC, a new approach for in-context reinforcement learning (RL) for embodied agents. With ReLIC, agents can adapt to new environments using up to 64,000 steps of in-context experience with a full attention mechanism while being trained on self-generated experience via RL. We achieve this by proposing a novel policy update scheme for on-policy RL called "partial updates" as well as a Sink-KV mechanism, which enables effective utilization of a long observation history for embodied agents. Our method outperforms a variety of meta-RL baselines in adapting to unseen houses in an embodied multi-object navigation task in a photorealistic simulation. In addition, we find that ReLIC is capable of few-shot imitation learning despite never being trained with expert demonstrations. We also provide a comprehensive analysis of ReLIC, highlighting that the combination of large-scale RL training, the proposed partial updates scheme, and Sink-KV is essential for effective in-context learning.
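The Sink-KV mechanism is only named at a high level in the abstract. As a rough illustration of how such a mechanism can be realized, the sketch below prepends learnable "sink" key and value vectors to the attention keys and values (with no corresponding input token), giving every query a default slot to attend to when the long observation history is uninformative. The class name SinkKVAttention, the tensor shapes, and the use of PyTorch are illustrative assumptions, not the thesis implementation.

# Hedged sketch of a Sink-KV attention layer: learnable sink key/value vectors
# are added to the keys and values only, so queries can always attend to them.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SinkKVAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # One learnable sink key and value per head, shared across the batch.
        self.sink_k = nn.Parameter(torch.zeros(1, num_heads, 1, self.head_dim))
        self.sink_v = nn.Parameter(torch.zeros(1, num_heads, 1, self.head_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, time, head_dim).
        q = q.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        k = k.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        v = v.view(b, t, self.num_heads, self.head_dim).transpose(1, 2)
        # Prepend the learnable sink key/value so every query can attend to it.
        k = torch.cat([self.sink_k.expand(b, -1, -1, -1), k], dim=2)
        v = torch.cat([self.sink_v.expand(b, -1, -1, -1), v], dim=2)
        # Causal mask over real tokens; the sink column (index 0) is always visible.
        mask = torch.ones(t, t + 1, dtype=torch.bool, device=x.device)
        mask[:, 1:] = torch.tril(torch.ones(t, t, dtype=torch.bool, device=x.device))
        attn = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
        return self.out(attn.transpose(1, 2).reshape(b, t, d))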
Date Issued
2024-07-25
Resource Type
Text
Resource Subtype
Thesis