Principle:Haosulab ManiSkill Multi Agent Pattern
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Multi_Agent_Systems, Simulation |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
A multi-agent wrapper manages multiple independent robot agents as a single unified agent, exposing a dictionary-based action space and delegating all lifecycle methods to the individual sub-agents.
Description
The Multi Agent Pattern principle defines how ManiSkill supports environments with multiple cooperating robots. Rather than requiring task environments to manage each agent directly, a MultiAgent wrapper class inherits from BaseAgent and presents a unified interface. It stores sub-agents in both a list and a dictionary keyed by {uid}-{index}, exposes a Dict action space with one entry per sub-agent, and delegates the core lifecycle methods (set_action, reset, get_proprioception, before_simulation_step) to the sub-agents.
This pattern preserves the single-agent API that task environments and policies expect. A policy for a two-robot task receives a Dict observation and outputs a Dict action, while the MultiAgent wrapper handles routing actions to the correct robot's controllers. Sensor configurations from all sub-agents are aggregated with prefixed UIDs to avoid naming collisions.
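The delegation described above can be sketched in plain Python. This is a minimal illustration, not ManiSkill's code: ToyAgent and ToyMultiAgent are hypothetical stand-ins for BaseAgent and MultiAgent, the sketch duck-types instead of inheriting, and it assumes the index in {uid}-{index} is the sub-agent's position in the list.

```python
from typing import Dict, List


class ToyAgent:
    """Hypothetical stand-in for a single-robot agent (not ManiSkill's BaseAgent)."""

    def __init__(self, uid: str, action_dim: int):
        self.uid = uid
        self.action_dim = action_dim
        self.last_action = [0.0] * action_dim

    def set_action(self, action: List[float]) -> None:
        if len(action) != self.action_dim:
            raise ValueError(f"{self.uid}: expected {self.action_dim} values")
        self.last_action = list(action)

    def reset(self) -> None:
        self.last_action = [0.0] * self.action_dim

    def get_proprioception(self) -> Dict[str, List[float]]:
        return {"last_action": list(self.last_action)}


class ToyMultiAgent:
    """Hypothetical wrapper exposing the same methods as ToyAgent."""

    def __init__(self, agents: List[ToyAgent]):
        # Keep both views: an ordered list and a dict keyed by "{uid}-{index}".
        # Assumption: the index is the position in the list.
        self.agents = list(agents)
        self.agents_dict = {f"{a.uid}-{i}": a for i, a in enumerate(self.agents)}

    def set_action(self, action: Dict[str, List[float]]) -> None:
        # Route each entry of the dict action to the matching sub-agent.
        for key, agent in self.agents_dict.items():
            agent.set_action(action[key])

    def reset(self) -> None:
        for agent in self.agents:
            agent.reset()

    def get_proprioception(self) -> Dict[str, Dict[str, List[float]]]:
        # One observation entry per sub-agent, under the same keys as the actions.
        return {k: a.get_proprioception() for k, a in self.agents_dict.items()}


multi = ToyMultiAgent([ToyAgent("panda", 2), ToyAgent("panda", 2)])
multi.set_action({"panda-0": [0.1, 0.2], "panda-1": [0.3, 0.4]})
obs = multi.get_proprioception()
```

The caller only ever touches the wrapper, so code written against the single-agent methods keeps working when a second robot is added.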
Usage
This principle applies whenever:
- A task requires two or more robots cooperating (e.g., bimanual manipulation, handover tasks).
- The environment API must remain consistent with single-robot environments (same step/reset/observe interface).
- The robots in the multi-agent setup may differ in morphology, control mode, or sensor configuration.
Theoretical Basis
Wrapper Pattern: MultiAgent wraps a list of BaseAgent instances, presenting the same interface as a single agent. The environment interacts with MultiAgent through the standard agent lifecycle without knowing multiple robots are involved.
Dict Action Space: The action space is a gymnasium.spaces.Dict where each key corresponds to one sub-agent. This allows policies to output separate action vectors for each robot.
UID-Based Routing: Each sub-agent is assigned a unique key {uid}-{index}. Actions, observations, and state are routed to/from the correct sub-agent using these keys.
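As an illustration of the key scheme, the sketch below builds {uid}-{index} keys and uses them to put a dictionary action back into sub-agent order. The helper names make_agent_keys and split_action are hypothetical, and the sketch assumes the index is the sub-agent's position in the list, which keeps keys unique even when two robots share a uid.

```python
from typing import Dict, List, Sequence


def make_agent_keys(uids: Sequence[str]) -> List[str]:
    """Build one "{uid}-{index}" key per sub-agent (index = list position, assumed)."""
    return [f"{uid}-{i}" for i, uid in enumerate(uids)]


def split_action(action: Dict[str, list], keys: Sequence[str]) -> List[list]:
    """Return per-agent actions in sub-agent order, rejecting mismatched keys."""
    missing = [k for k in keys if k not in action]
    unknown = [k for k in action if k not in keys]
    if missing or unknown:
        raise KeyError(f"missing={missing}, unknown={unknown}")
    return [action[k] for k in keys]


keys = make_agent_keys(["panda", "panda", "xarm"])
# The dict may arrive in any order; the keys decide which robot gets what.
ordered = split_action({"xarm-2": [3.0], "panda-0": [1.0], "panda-1": [2.0]}, keys)
```

Rejecting missing or unknown keys up front is a design choice of this sketch; it surfaces a policy/environment mismatch immediately instead of silently leaving one robot without an action.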
Sensor Aggregation: Camera and tactile sensor configs from all sub-agents are collected with prefixed names (e.g., panda-0/wrist_cam) to avoid collisions.
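The prefixing step can be sketched as a dictionary merge. The function aggregate_sensor_configs and the plain-dict config format are assumptions made for illustration; only the prefixed naming style (panda-0/wrist_cam) comes from the description above.

```python
from typing import Any, Dict


def aggregate_sensor_configs(
    per_agent: Dict[str, Dict[str, Any]],
) -> Dict[str, Any]:
    """Merge per-agent sensor configs under "{agent_key}/{sensor_name}" names."""
    merged: Dict[str, Any] = {}
    for agent_key, configs in per_agent.items():
        for name, cfg in configs.items():
            # The agent key prefix keeps identical sensor names from colliding.
            merged[f"{agent_key}/{name}"] = cfg
    return merged


# Both robots define a sensor called "wrist_cam"; the configs are placeholder dicts.
sensors = aggregate_sensor_configs(
    {
        "panda-0": {"wrist_cam": {"width": 128, "height": 128}},
        "panda-1": {"wrist_cam": {"width": 64, "height": 64}},
    }
)
```

Without the prefix, the second wrist_cam entry would overwrite the first in the merged dictionary.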
Related Pages
- Implementation:Haosulab_ManiSkill_MultiAgent -- MultiAgent wrapper class.