Capability-Aware Shared Hypernetworks for Heterogeneous Multi-Agent Coordination

Author(s)
Fu, Kevin
Organizational Unit
School of Computer Science
Abstract
Cooperative heterogeneous multi-agent tasks require agents to behave in a flexible and complementary manner that best leverages their diverse capabilities. Learning-based approaches to this challenge span a spectrum between two endpoints: i) shared-parameter methods, which assign an ID to each agent to encode diverse behaviors within a single architecture for sample efficiency, but are limited in the behavioral diversity they can learn; ii) independent methods, which learn a separate policy for each agent, enabling greater diversity at the cost of sample and parameter efficiency. Prior work on learning for heterogeneous multi-agent teams has explored the middle ground of this spectrum by learning shared-parameter or independent policies for classes of agents, striking a compromise between diversity and efficiency. However, these approaches still do not reason over the impact of agent capabilities on behavior, and thus cannot generalize to unseen agents or team compositions. In this work, we aim to enable flexible and heterogeneous coordination without sacrificing diversity, sample efficiency, or generalization to unseen agents and teams. First, inspired by work on trait-based heterogeneous task allocation, we explore how capability-awareness enables generalization to unseen agents and teams. We thoroughly evaluate our GNN-based capability-aware policy architecture, showing that it generalizes more effectively than existing work. Then, inspired by recent work in transfer learning and meta-RL, we propose Capability-Aware Shared Hypernetworks (CASH), a new soft weight sharing architecture for heterogeneous coordination that uses hypernetworks to explicitly reason about continuous agent capabilities in addition to local observations.
Intuitively, CASH allows the team to learn shared decision-making strategies (captured by a shared encoder) that are readily adapted according to the team's individual and collective capabilities (via a shared hypernetwork). Our design is agnostic to the underlying learning paradigm. We conducted detailed experiments across two heterogeneous coordination tasks and three standard learning paradigms (imitation learning, value-based reinforcement learning, and policy-gradient reinforcement learning). Results reveal that CASH generates appropriately diverse behaviors that consistently outperform baseline architectures in task performance and sample efficiency during both training and zero-shot generalization. Notably, CASH provides these improvements with only 20% to 40% of the learnable parameters used by baselines.
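To make the core idea concrete, the following is a minimal sketch (not the thesis implementation) of capability-conditioned soft weight sharing: a shared encoder produces features from local observations, and a shared hypernetwork maps an agent's capability vector to the weights of that agent's adapter head. All dimensions, names, and the choice of NumPy are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, cap_dim, hid_dim, act_dim = 8, 3, 16, 4

# Shared encoder: maps local observations to hidden features (shared by all agents).
W_enc = rng.normal(scale=0.1, size=(hid_dim, obs_dim))

# Shared hypernetwork: maps a capability vector to the weights and bias of the
# agent's adapter head. (Illustrative: a single linear hypernetwork layer.)
W_hyper = rng.normal(scale=0.1, size=(hid_dim * act_dim + act_dim, cap_dim))

def policy(obs, capability):
    h = np.tanh(W_enc @ obs)                        # shared decision-making features
    theta = W_hyper @ capability                    # per-agent weights, generated on the fly
    W_head = theta[: hid_dim * act_dim].reshape(act_dim, hid_dim)
    b_head = theta[hid_dim * act_dim :]
    return W_head @ h + b_head                      # capability-adapted action logits

# Two agents share every learnable parameter, yet different capability vectors
# (e.g. hypothetical [speed, payload, sensing] traits) yield different policies
# for the same observation -- including for capability values unseen in training.
obs = rng.normal(size=obs_dim)
logits_a = policy(obs, np.array([1.0, 0.2, 0.0]))
logits_b = policy(obs, np.array([0.1, 1.0, 0.9]))
```

Because the hypernetwork consumes continuous capability values rather than agent IDs, the same shared parameters can produce behavior for agents and team compositions not encountered during training.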
Date
2024-12-09
Resource Type
Text
Resource Subtype
Thesis