Principle:Haosulab ManiSkill Agent Controller Architecture

From Leeroopedia
Knowledge Sources
Domains: Robotics, Simulation, Control_Theory
Last Updated: 2026-02-15 08:00 GMT

Overview

Every robot agent manages a hierarchy of controllers that translate high-level action vectors into joint-level drive targets, enabling uniform control interfaces across diverse robot morphologies and control modes.

Description

The Agent Controller Architecture principle defines the structural relationship between a robot agent and its controllers. Each agent declares a dictionary that maps control mode names (e.g., "pd_joint_delta_pos", "pd_ee_delta_pose") to controller configurations, each pairing a controller class with its parameters. When an environment is created with a specific control mode, the agent instantiates the controller hierarchy for that mode. Instantiation is lazy: a controller is built only when its control mode is first activated, so unused modes incur no setup cost.

The architecture handles several concerns: (1) translating between the policy's action space and the physics engine's joint-level actuators, (2) normalizing actions to bounded ranges regardless of underlying joint limits, (3) composing multiple controllers for robots with heterogeneous joint groups (arm + gripper + base), and (4) supporting both joint-space and task-space control through a common interface.
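Concern (2) can be illustrated with a short sketch. It assumes the policy emits actions in [-1, 1] and that each joint has its own target range; the function name denormalize_action and the clipping step are illustrative assumptions, not ManiSkill's actual API.

```python
from typing import List, Sequence


def denormalize_action(
    action: Sequence[float],
    low: Sequence[float],
    high: Sequence[float],
) -> List[float]:
    """Map a normalized action in [-1, 1] onto per-joint ranges [low, high].

    Hypothetical helper for illustration only.
    """
    if not (len(action) == len(low) == len(high)):
        raise ValueError("action, low, and high must have the same length")
    targets = []
    for a, lo, hi in zip(action, low, high):
        a = max(-1.0, min(1.0, a))  # clip out-of-range policy outputs
        # Affine rescale: -1 maps to lo, +1 maps to hi, 0 maps to the midpoint.
        targets.append(0.5 * (hi + lo) + 0.5 * (hi - lo) * a)
    return targets


# Two joints with different ranges share the same normalized action range.
targets = denormalize_action([0.0, 0.5], low=[0.0, -2.0], high=[10.0, 2.0])
clipped = denormalize_action([2.0], low=[-0.1], high=[0.1])
```

Because the policy only ever sees the [-1, 1] range, the same policy output format can drive joints whose physical ranges differ.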

Usage

This principle applies whenever:

  • A robot implementation must declare which control modes it supports and how each mode maps to controller classes and joint subsets.
  • Multiple controllers must be composed into a single unified action space for heterogeneous robots (e.g., arm joints under PD position control, gripper under mimic control, base under velocity control).
  • The same robot must support both joint-level and end-effector-level control, with the mode selected at environment creation time.

Theoretical Basis

Controller Configuration Map: Each agent subclass defines a _controller_configs property returning a dictionary from control mode names to ControllerConfig objects. A CombinedController flattens multiple sub-controllers into a single continuous action space.
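A minimal sketch of such a configuration map follows. Only the _controller_configs property name and the control mode strings come from the text above; the ControllerConfig fields, the ToyArmAgent class, the joint names, the numeric bounds, and the per-group nesting ("arm", "gripper") are assumptions made for illustration and may differ from the real classes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ControllerConfig:
    """Hypothetical stand-in: the joints a controller drives and its bounds."""
    joint_names: Tuple[str, ...]
    lower: float
    upper: float
    use_delta: bool = False  # True: actions are offsets from the current target


class ToyArmAgent:
    """Hypothetical agent declaring its supported control modes."""
    ARM = ("joint1", "joint2", "joint3")
    GRIPPER = ("finger_left", "finger_right")

    @property
    def _controller_configs(self) -> Dict[str, Dict[str, ControllerConfig]]:
        # The gripper config is shared by every mode; only the arm config varies.
        gripper = ControllerConfig(self.GRIPPER, lower=0.0, upper=0.04)
        return {
            "pd_joint_pos": {
                "arm": ControllerConfig(self.ARM, lower=-3.14, upper=3.14),
                "gripper": gripper,
            },
            "pd_joint_delta_pos": {
                "arm": ControllerConfig(self.ARM, -0.1, 0.1, use_delta=True),
                "gripper": gripper,
            },
        }

    @property
    def supported_control_modes(self) -> List[str]:
        return list(self._controller_configs)


agent = ToyArmAgent()
modes = agent.supported_control_modes
```

Keeping the map declarative means a mode can be inspected or validated without building any controller.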

Lazy Instantiation: Controllers are created only when their control mode is selected. This avoids loading IK solvers, building action spaces, or allocating buffers for unused modes.
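The lazy pattern can be sketched as a registry of factories that builds a controller on first request and caches it afterwards. LazyControllerRegistry, the factory functions, and the build_log list are hypothetical names used only to make the behavior observable.

```python
from typing import Callable, Dict, List


class LazyControllerRegistry:
    """Hypothetical registry: builds a controller on first use, then caches it."""

    def __init__(self, factories: Dict[str, Callable[[], object]]):
        self._factories = factories
        self._instances: Dict[str, object] = {}

    def get(self, mode: str) -> object:
        if mode not in self._factories:
            raise KeyError(f"unsupported control mode: {mode!r}")
        if mode not in self._instances:
            # First activation of this mode: pay the construction cost now.
            self._instances[mode] = self._factories[mode]()
        return self._instances[mode]

    @property
    def instantiated_modes(self) -> List[str]:
        return sorted(self._instances)


build_log: List[str] = []  # records which factories actually ran


def make_joint_controller() -> dict:
    build_log.append("pd_joint_delta_pos")
    return {"kind": "joint"}


def make_ee_controller() -> dict:
    # Stands in for expensive setup such as loading an IK solver.
    build_log.append("pd_ee_delta_pose")
    return {"kind": "ee"}


registry = LazyControllerRegistry({
    "pd_joint_delta_pos": make_joint_controller,
    "pd_ee_delta_pose": make_ee_controller,
})
first = registry.get("pd_joint_delta_pos")
second = registry.get("pd_joint_delta_pos")  # served from the cache
```

In this sketch the end-effector factory never runs, because its mode was never requested.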

Action Delegation: The agent's set_action() method delegates to the active controller, which decomposes the flat action vector into per-joint targets and applies them via the physics engine's drive interface.
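The delegation chain might look like the sketch below. Only the set_action() name comes from the text; ToyAgent, ToyJointController, and the plain dict standing in for the physics engine's drive interface are assumptions for illustration.

```python
from typing import Dict, Sequence


class ToyJointController:
    """Hypothetical controller that writes one drive target per joint."""

    def __init__(self, joint_names: Sequence[str], drive_targets: Dict[str, float]):
        self.joint_names = list(joint_names)
        # A plain dict stands in for the physics engine's drive interface.
        self._drive_targets = drive_targets

    def set_action(self, action: Sequence[float]) -> None:
        if len(action) != len(self.joint_names):
            raise ValueError("action length must match the number of joints")
        # Decompose the flat vector into per-joint targets.
        for name, target in zip(self.joint_names, action):
            self._drive_targets[name] = float(target)


class ToyAgent:
    """Hypothetical agent that forwards actions to its active controller."""

    def __init__(self, controllers: Dict[str, ToyJointController], control_mode: str):
        self._controllers = controllers
        self.control_mode = control_mode

    @property
    def controller(self) -> ToyJointController:
        return self._controllers[self.control_mode]

    def set_action(self, action: Sequence[float]) -> None:
        # The agent does no decoding itself; the active controller does.
        self.controller.set_action(action)


drive_targets: Dict[str, float] = {}
agent = ToyAgent(
    {"pd_joint_pos": ToyJointController(["joint1", "joint2"], drive_targets)},
    control_mode="pd_joint_pos",
)
agent.set_action([0.25, -0.5])
```

Since the agent only forwards the action, changing the control mode changes how an action is interpreted without changing the agent's interface.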

Composability: A DictController manages named sub-controllers, while a CombinedController concatenates their action spaces into a single Box space. This enables control modes like "pd_ee_delta_pose" to compose an EE pose controller for the arm with a mimic position controller for the gripper.
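The concatenation step can be sketched as follows. ToyCombinedController and RecordingController are hypothetical stand-ins, and the 6-dimensional arm action (3 translation + 3 rotation) and 1-dimensional gripper action are assumptions chosen for the example rather than values taken from the source.

```python
from typing import Dict, List, Optional, Sequence, Tuple


class RecordingController:
    """Hypothetical sub-controller that just remembers its last action slice."""

    def __init__(self, action_dim: int):
        self.action_dim = action_dim
        self.last_action: Optional[List[float]] = None

    def set_action(self, action: Sequence[float]) -> None:
        self.last_action = list(action)


class ToyCombinedController:
    """Hypothetical combiner: concatenates sub-controller action dimensions."""

    def __init__(self, controllers: Dict[str, RecordingController]):
        self.controllers = controllers
        self.slices: Dict[str, Tuple[int, int]] = {}
        start = 0
        # Dict insertion order fixes the layout of the flat action vector.
        for name, controller in controllers.items():
            self.slices[name] = (start, start + controller.action_dim)
            start += controller.action_dim
        self.action_dim = start

    def set_action(self, action: Sequence[float]) -> None:
        if len(action) != self.action_dim:
            raise ValueError(f"expected {self.action_dim} values, got {len(action)}")
        # Route each contiguous slice to the sub-controller that owns it.
        for name, (lo, hi) in self.slices.items():
            self.controllers[name].set_action(action[lo:hi])


arm = RecordingController(action_dim=6)      # assumed EE delta pose size
gripper = RecordingController(action_dim=1)  # assumed single gripper command
combined = ToyCombinedController({"arm": arm, "gripper": gripper})
combined.set_action([0.1, 0.0, 0.0, 0.0, 0.0, 0.2, 1.0])
```

The policy sees one flat 7-dimensional action, while each sub-controller receives only the slice it owns.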
