Skill Chaining for Long Horizon Rearrangement

Author(s)
Harish, Abhinav Narayan
Abstract
The ability to perform complex behaviors such as cooking a meal, tidying up a house, or loading a dishwasher is a strongly desired attribute of an intelligent agent. One promising strategy for executing these complex behaviors is to break them down into modules, referred to as skills, each of which tackles a sub-component of the overall task. However, a naive composition of skills does not optimize their sequential execution. For instance, (i) the current skill may hand off the task to the next skill in a state the latter was not trained on, or (ii) it may perturb objects required by later skills, leading to failure of the overall task. In this thesis, we explore a strategy to better adapt skills to the complex behavior being executed. We refer to this approach as Skill-Reward-Fine-Tuning (SRFT): we adapt later skills to the terminal states of the preceding ones using each skill's individual reward. We demonstrate the merits of this approach on the task of object rearrangement grounded in a physical simulation environment. On the task of rearranging two objects, this approach achieves up to a 17% improvement over the performance of pre-trained skills. We also discuss the inability of SRFT to improve hierarchical performance in layouts where hand-off challenges may be absent. This motivates future work on holistic measures that optimize skills based on metrics governing the overall task, rather than relying solely on optimizing each skill's own objective.
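The core idea of SRFT described above can be illustrated with a minimal toy sketch. Here each "skill" is a one-parameter proportional controller on a scalar state, and fine-tuning is a simple search over that parameter; the function names (`rollout`, `srft_finetune`), targets, and gain values are illustrative assumptions, not the thesis's actual policies or training procedure. The point is only the SRFT recipe: adapt a later skill, under its own reward, to the terminal states handed off by the previous skill rather than to the states it was pre-trained on.

```python
def rollout(gain, target, start, steps=5):
    """Roll out a toy skill: each step moves a fraction `gain`
    of the remaining distance toward the skill's `target`."""
    s = start
    for _ in range(steps):
        s += gain * (target - s)
    return s

def skill_reward(final_state, target):
    """The skill's own (individual) reward: negative distance to target."""
    return -abs(final_state - target)

def srft_finetune(target, start_states, candidate_gains):
    """SRFT sketch: pick the parameter that maximizes the skill's own
    reward averaged over the terminal states of the previous skill."""
    def avg_reward(g):
        return sum(skill_reward(rollout(g, target, s), target)
                   for s in start_states) / len(start_states)
    return max(candidate_gains, key=avg_reward)

# Skill 1 was "pre-trained" from start state 0.0 with a low gain, so its
# terminal states undershoot its target of 1.0 -- a hand-off mismatch.
terminal_states = [rollout(0.2, 1.0, 0.0) for _ in range(20)]

# Skill 2 (target 2.0) is fine-tuned on those hand-off states, not on
# the start-state distribution it would have seen during pre-training.
best_gain = srft_finetune(2.0, terminal_states, [0.1, 0.3, 0.5, 0.9])
```

In this toy setting the search simply selects the most aggressive gain, since it best corrects for the off-distribution hand-off states; in the thesis the analogous step is reward-driven fine-tuning of the later skill's policy.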
Date
2023-07-31
Resource Type
Text
Resource Subtype
Thesis