Principle:Haosulab ManiSkill Multi Agent Pattern

From Leeroopedia
Domains: Robotics, Multi_Agent_Systems, Simulation
Last Updated: 2026-02-15 08:00 GMT

Overview

A multi-agent wrapper manages multiple independent robot agents as a single unified agent, exposing a dictionary-based action space and delegating all lifecycle methods to the individual sub-agents.

Description

The Multi Agent Pattern principle defines how ManiSkill supports environments with multiple cooperating robots. Rather than requiring task environments to manage multiple agents directly, a MultiAgent wrapper class inherits from BaseAgent and presents a unified interface. It stores sub-agents in both a list and a dictionary keyed by {uid}-{index}, exposes a Dict action space with one entry per agent, and delegates all core methods (set_action, reset, get_proprioception, before_simulation_step) to the appropriate sub-agent.

This pattern preserves the single-agent API that task environments and policies expect. A policy for a two-robot task receives a Dict observation and outputs a Dict action, while the MultiAgent wrapper handles routing actions to the correct robot's controllers. Sensor configurations from all sub-agents are aggregated with prefixed UIDs to avoid naming collisions.
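The delegation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not ManiSkill's actual source: `StubAgent` and `MultiAgentSketch` are hypothetical stand-ins for `BaseAgent` and `MultiAgent`, and the index in each key is assumed to be the sub-agent's position in the list.

```python
from typing import Any, Dict, List


class StubAgent:
    """Hypothetical single-robot agent; a stand-in, not ManiSkill's BaseAgent."""

    def __init__(self, uid: str, action_dim: int) -> None:
        self.uid = uid
        self.action_dim = action_dim
        self.last_action: List[float] = [0.0] * action_dim
        self.steps_prepared = 0

    def set_action(self, action: List[float]) -> None:
        if len(action) != self.action_dim:
            raise ValueError(f"{self.uid}: expected {self.action_dim} values")
        self.last_action = list(action)

    def reset(self) -> None:
        self.last_action = [0.0] * self.action_dim
        self.steps_prepared = 0

    def get_proprioception(self) -> Dict[str, Any]:
        return {"last_action": list(self.last_action)}

    def before_simulation_step(self) -> None:
        self.steps_prepared += 1


class MultiAgentSketch:
    """Presents several sub-agents through one agent-like interface."""

    def __init__(self, agents: List[StubAgent]) -> None:
        # Keep both views: an ordered list and a dict keyed by "{uid}-{index}".
        self.agents = agents
        self.agents_dict = {f"{a.uid}-{i}": a for i, a in enumerate(agents)}

    def set_action(self, action: Dict[str, List[float]]) -> None:
        # Route each entry of the dict action to the matching sub-agent.
        for key, agent in self.agents_dict.items():
            agent.set_action(action[key])

    def reset(self) -> None:
        for agent in self.agents:
            agent.reset()

    def get_proprioception(self) -> Dict[str, Dict[str, Any]]:
        return {k: a.get_proprioception() for k, a in self.agents_dict.items()}

    def before_simulation_step(self) -> None:
        for agent in self.agents:
            agent.before_simulation_step()


multi = MultiAgentSketch([StubAgent("panda", 2), StubAgent("panda", 2)])
multi.set_action({"panda-0": [0.1, 0.2], "panda-1": [0.3, 0.4]})
multi.before_simulation_step()
obs = multi.get_proprioception()
```

The caller only ever talks to `multi`; which robot receives which action vector is decided entirely by the dictionary keys.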

Usage

This principle applies whenever:

  • A task requires two or more robots cooperating (e.g., bimanual manipulation, handover tasks).
  • The environment API must remain consistent with single-robot environments (same step/reset/observe interface).
  • The robots in the multi-agent setup may differ in morphology, control mode, or sensor configuration.

Theoretical Basis

Wrapper Pattern: MultiAgent wraps a list of BaseAgent instances and presents the same interface as a single agent, so the environment drives it through the standard agent lifecycle instead of managing each robot separately.

Dict Action Space: The action space is a gymnasium.spaces.Dict where each key corresponds to one sub-agent. This allows policies to output separate action vectors for each robot.
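To make the shape of such an action concrete, the sketch below mimics the contract of a dictionary action space using only the standard library. Plain tuples stand in for `gymnasium.spaces.Dict` and its per-agent entries, so nothing here is the gymnasium API; the agent keys, dimensions, and bounds are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# Hypothetical per-agent spec: key -> (dimension, low, high).
ActionSpec = Dict[str, Tuple[int, float, float]]


def sample_action(spec: ActionSpec, rng: random.Random) -> Dict[str, List[float]]:
    """Draw one action vector per sub-agent, keyed like the spec."""
    return {
        key: [rng.uniform(low, high) for _ in range(dim)]
        for key, (dim, low, high) in spec.items()
    }


def contains(spec: ActionSpec, action: Dict[str, List[float]]) -> bool:
    """Check that the action has exactly one in-bounds vector per sub-agent."""
    if set(action) != set(spec):
        return False
    for key, (dim, low, high) in spec.items():
        values = action[key]
        if len(values) != dim:
            return False
        if any(not (low <= v <= high) for v in values):
            return False
    return True


# Two robots with different action dimensions (assumed values).
spec: ActionSpec = {"panda-0": (7, -1.0, 1.0), "xarm-1": (6, -1.0, 1.0)}
action = sample_action(spec, random.Random(0))
```

Because each key carries its own dimension and bounds, robots with different controllers can share one action dictionary without padding or concatenating their vectors.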

UID-Based Routing: Each sub-agent is assigned a unique key {uid}-{index}. Actions, observations, and state are routed to/from the correct sub-agent using these keys.
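The key scheme matters most when two robots share a UID, since the index is what keeps them apart. A small sketch, assuming the index is the sub-agent's position in the agent list; `make_agent_keys` and `split_key` are hypothetical helper names, not ManiSkill functions.

```python
from typing import List, Tuple


def make_agent_keys(uids: List[str]) -> List[str]:
    """Build one "{uid}-{index}" key per sub-agent, in list order."""
    return [f"{uid}-{index}" for index, uid in enumerate(uids)]


def split_key(key: str) -> Tuple[str, int]:
    """Recover (uid, index); splits on the last hyphen so UIDs may contain hyphens."""
    uid, _, index = key.rpartition("-")
    return uid, int(index)


keys = make_agent_keys(["panda", "panda", "fetch"])
# keys == ["panda-0", "panda-1", "fetch-2"]
```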

Sensor Aggregation: Camera and tactile sensor configs from all sub-agents are collected with prefixed names (e.g., panda-0/wrist_cam) to avoid collisions.
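The prefixing step can be illustrated with plain strings. This is a sketch of the naming idea only: `aggregate_sensor_names` is a hypothetical helper, sensor configs are reduced to their names, and the `/` separator is taken from the example above.

```python
from typing import Dict, List


def aggregate_sensor_names(sensors_by_agent: Dict[str, List[str]]) -> List[str]:
    """Merge per-agent sensor names into one flat list of prefixed names."""
    merged: List[str] = []
    seen = set()
    for agent_key, names in sensors_by_agent.items():
        for name in names:
            prefixed = f"{agent_key}/{name}"
            if prefixed in seen:
                # Only possible if one agent lists the same sensor twice.
                raise ValueError(f"duplicate sensor name: {prefixed}")
            seen.add(prefixed)
            merged.append(prefixed)
    return merged


# Both robots carry a sensor called "wrist_cam"; the prefix keeps them distinct.
merged = aggregate_sensor_names(
    {"panda-0": ["wrist_cam"], "panda-1": ["wrist_cam", "base_cam"]}
)
# merged == ["panda-0/wrist_cam", "panda-1/wrist_cam", "panda-1/base_cam"]
```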
