Enhancing Teamwork in Multi-Robot Systems: Embodied Intelligence via Model- and Data-Driven Approaches
Author(s)
Seraj, Esmaeil
Abstract
High-performing human teams leverage intelligent and efficient communication and coordination strategies to collaboratively maximize their joint utility. Inspired by teaming behaviors among humans, I seek to develop computational methods for synthesizing intelligent communication and coordination strategies for collaborative multi-robot systems. I leverage both classical model-based control and planning approaches and data-driven methods such as Multi-Agent Reinforcement Learning (MARL) and Learning from Demonstration (LfD) to provide several contributions towards enabling emergent cooperative teaming behavior across robot teams. In my thesis, I first leverage model-based methods for coordinated control and planning under uncertainty for multi-robot systems to study and develop techniques for efficiently incorporating environment models in multi-robot planning and decision making. In these contributions, I design centralized and decentralized coordination frameworks, at the control-input and the high-level planning stages, which are informed by and have access to the model of the world. First, I develop an algorithm for human-centered coordinated control of multi-robot networked systems in safety-critical applications. I tackle the problems of enabling a robot team to reason about a coordinated coverage plan through active state estimation and providing probabilistic guarantees for performance. I then extend these methods to directly formulate and account for heterogeneity in robots' characteristics and capabilities. I design a hierarchical coordination framework, which enables a composite team of robots (i.e., including robots that can only sense and robots that can only manipulate the environment) to effectively collaborate on complex missions such as aerial wildfire fighting. Model-based approaches provide the ability to derive performance and stability guarantees.
However, they can be sensitive to the accuracy of the model and the quality of the heuristic algorithm. As such, I leverage data-driven and Machine Learning (ML) approaches, such as MARL, to provide several contributions towards learning emergent cooperative behaviors. I design a graph-based architecture to learn efficient and diverse communication models for coordinating cooperative heterogeneous teams. Finally, inspired by the theory of mind in humans' strategic decision-making model, I develop an iterative model to learn deep decision-rationalization for optimizing action selection in collaborative, decentralized teaming. In recent years, MARL has been predominantly used by researchers to optimize a reward signal and learn multi-robot tasks. Nevertheless, Reinforcement Learning (RL) generally suffers from key limitations such as the difficulty of designing an expressive and suitable reward function for complex tasks, and high sample complexity. As such, accurate models of human strategies and behaviors are increasingly important. Additionally, as multi-robot systems become increasingly prevalent in our communities and workplaces, aligning the values motivating robot behaviors with human values is critical. LfD attempts to learn the correct behavior directly from expert-generated demonstration data rather than a reward function. As such, in the last part of my work, I develop a multi-agent LfD framework to efficiently incorporate humans' domain knowledge of teaming strategies for collaborative robot teams and directly learn team coordination policies from human teachers. To this end, I propose the MixTURE framework for human training of robot teams. MixTURE enables robot teams to learn a human's preferred strategy to collaborate, while simultaneously learning end-to-end emergent communication for the robot team to efficiently coordinate their actions, without the need for human-generated data.
MixTURE benefits from the merits of LfD methods over RL while significantly reducing the human demonstrator’s workload and the time required to provide demonstrations, as well as increasing the System Usability Scale (SUS) score and the overall collaboration performance of the robot team.
Date
2023-04-27
Resource Type
Text
Resource Subtype
Dissertation