Principle:Haosulab ManiSkill Agent Controller Architecture
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Simulation, Control_Theory |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
Every robot agent manages a hierarchy of controllers that translate high-level action vectors into joint-level drive targets, enabling uniform control interfaces across diverse robot morphologies and control modes.
Description
The Agent Controller Architecture principle defines the structural relationship between robot agents and their controllers. Each agent declares a dictionary that maps control mode names (e.g., "pd_joint_delta_pos", "pd_ee_delta_pose") to controller configurations, each pairing a controller class with its parameters. When an environment is created with a specific control mode, the agent instantiates the controller hierarchy for that mode. Instantiation is lazy: a controller is built only when its control mode is first activated, so unused modes incur no setup cost.
The architecture handles several concerns: (1) translating between the policy's action space and the physics engine's joint-level actuators, (2) normalizing actions to bounded ranges regardless of underlying joint limits, (3) composing multiple controllers for robots with heterogeneous joint groups (arm + gripper + base), and (4) supporting both joint-space and task-space control through a common interface.
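Concern (2) can be illustrated with a minimal sketch. It assumes the policy emits actions in [-1, 1] and that each joint has its own target range; the function name `scale_normalized_action` and the example ranges are hypothetical and are not taken from ManiSkill.

```python
from typing import List, Sequence


def scale_normalized_action(
    action: Sequence[float], low: Sequence[float], high: Sequence[float]
) -> List[float]:
    """Clip each component to [-1, 1], then map it linearly onto [low, high]."""
    if not (len(action) == len(low) == len(high)):
        raise ValueError("action, low, and high must have the same length")
    scaled = []
    for a, lo, hi in zip(action, low, high):
        a = max(-1.0, min(1.0, a))  # clip to the normalized range
        # -1 maps to lo, +1 maps to hi, 0 maps to the midpoint
        scaled.append(0.5 * (hi + lo) + 0.5 * (hi - lo) * a)
    return scaled


# Three joints with different (hypothetical) ranges share one normalized interface.
targets = scale_normalized_action(
    [1.0, -0.5, 3.0],
    low=[-0.1, -2.0, 0.0],
    high=[0.1, 2.0, 0.04],
)
```

With this mapping the policy never needs to know the physical range of any joint; out-of-range inputs such as the `3.0` above are clipped before scaling.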
Usage
This principle applies whenever:
- A robot implementation must declare which control modes it supports and how each mode maps to controller classes and joint subsets.
- Multiple controllers must be composed into a single unified action space for heterogeneous robots (e.g., arm joints under PD position control, gripper under mimic control, base under velocity control).
- The same robot must support both joint-level and end-effector-level control, with the mode selected at environment creation time.
Theoretical Basis
Controller Configuration Map: Each agent subclass defines a _controller_configs property returning a dictionary from control mode names to ControllerConfig objects. When a mode needs more than one controller, a CombinedController flattens the sub-controllers into a single continuous action space.
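A configuration map of this kind might look like the following sketch. `JointPosConfig` and `ToyArmAgent` are hypothetical stand-ins, not ManiSkill classes, and the per-group dictionaries ("arm", "gripper") are an assumption about how one mode could describe several joint groups.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class JointPosConfig:
    """Hypothetical stand-in for a ControllerConfig: pure data, no controller logic."""
    joint_names: Tuple[str, ...]
    lower: float
    upper: float
    use_delta: bool = False


# Assumption: a mode maps to one config or to a dict of per-group configs.
ModeConfig = Union[JointPosConfig, Dict[str, JointPosConfig]]


class ToyArmAgent:
    arm_joints = ("joint1", "joint2", "joint3")
    gripper_joints = ("finger1", "finger2")

    @property
    def _controller_configs(self) -> Dict[str, ModeConfig]:
        arm_pos = JointPosConfig(self.arm_joints, -3.14, 3.14)
        arm_delta = JointPosConfig(self.arm_joints, -0.1, 0.1, use_delta=True)
        gripper = JointPosConfig(self.gripper_joints, 0.0, 0.04)
        return {
            "pd_joint_pos": {"arm": arm_pos, "gripper": gripper},
            "pd_joint_delta_pos": {"arm": arm_delta, "gripper": gripper},
        }

    @property
    def supported_control_modes(self) -> List[str]:
        # The keys of the map are exactly the modes the robot declares.
        return list(self._controller_configs)


agent = ToyArmAgent()
```

Keeping the map as plain data means the supported modes can be listed without constructing any controller.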
Lazy Instantiation: Controllers are created only when their control mode is selected. This avoids loading IK solvers, building action spaces, or allocating buffers for unused modes.
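The lazy pattern can be sketched as a cache of controller instances filled on first use. `LazyControllerRegistry`, `ToyController`, and the factory helper are hypothetical names chosen for illustration; the real agent class may organize this differently.

```python
from typing import Callable, Dict, List, Optional


class ToyController:
    """Hypothetical controller; a real one might load an IK solver here."""

    def __init__(self, name: str) -> None:
        self.name = name


class LazyControllerRegistry:
    def __init__(self, factories: Dict[str, Callable[[], ToyController]]) -> None:
        self._factories = factories
        self._instances: Dict[str, ToyController] = {}
        self.active_mode: Optional[str] = None

    def set_control_mode(self, mode: str) -> ToyController:
        if mode not in self._factories:
            raise KeyError(f"unsupported control mode: {mode!r}")
        if mode not in self._instances:
            # Build the controller only the first time its mode is selected.
            self._instances[mode] = self._factories[mode]()
        self.active_mode = mode
        return self._instances[mode]


built: List[str] = []  # records which controllers were actually constructed


def make_factory(name: str) -> Callable[[], ToyController]:
    def factory() -> ToyController:
        built.append(name)
        return ToyController(name)
    return factory


registry = LazyControllerRegistry({
    "pd_joint_delta_pos": make_factory("pd_joint_delta_pos"),
    "pd_ee_delta_pose": make_factory("pd_ee_delta_pose"),
})
first = registry.set_control_mode("pd_joint_delta_pos")
again = registry.set_control_mode("pd_joint_delta_pos")
```

In this sketch the "pd_ee_delta_pose" factory is never called, so any expensive setup it stands for is skipped entirely.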
Action Delegation: The agent's set_action() method delegates to the active controller, which decomposes the flat action vector into per-joint targets and applies them via the physics engine's drive interface.
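The delegation chain can be sketched as below. All classes are toy stand-ins, and the sketch assumes a delta-position controller that adds each action component to the joint's current position; a real controller may instead offset the previous target and would also normalize and clip the action.

```python
from typing import Dict, List, Sequence


class ToyJoint:
    """Hypothetical joint exposing a drive-target setter, like a physics engine handle."""

    def __init__(self, name: str, position: float = 0.0) -> None:
        self.name = name
        self.position = position
        self.drive_target = position

    def set_drive_target(self, target: float) -> None:
        self.drive_target = target


class ToyDeltaPosController:
    def __init__(self, joints: List[ToyJoint]) -> None:
        self.joints = joints

    def set_action(self, action: Sequence[float]) -> None:
        if len(action) != len(self.joints):
            raise ValueError("action length must match the number of joints")
        # Decompose the flat vector into one target per joint.
        for joint, delta in zip(self.joints, action):
            joint.set_drive_target(joint.position + delta)


class ToyAgent:
    def __init__(self, controllers: Dict[str, ToyDeltaPosController], control_mode: str) -> None:
        self.controllers = controllers
        self.control_mode = control_mode

    @property
    def controller(self) -> ToyDeltaPosController:
        return self.controllers[self.control_mode]

    def set_action(self, action: Sequence[float]) -> None:
        # The agent does no control math itself; it forwards to the active controller.
        self.controller.set_action(action)


joints = [ToyJoint("joint1", 0.5), ToyJoint("joint2", -0.25)]
agent = ToyAgent({"pd_joint_delta_pos": ToyDeltaPosController(joints)}, "pd_joint_delta_pos")
agent.set_action([0.125, 0.25])
```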
Composability: A DictController manages named sub-controllers, while a CombinedController concatenates their action spaces into a single Box space. This enables control modes like "pd_ee_delta_pose" to compose an EE pose controller for the arm with a mimic position controller for the gripper.
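A minimal sketch of this composition, using plain lists in place of a Box space, is shown here. `ToySubController` and `ToyCombinedController` are hypothetical, and the 6-dimensional arm plus 1-dimensional gripper layout is an assumption made for illustration.

```python
from typing import Dict, List, Optional, Sequence, Tuple


class ToySubController:
    """Hypothetical sub-controller that owns a slice of the action vector."""

    def __init__(self, low: Sequence[float], high: Sequence[float]) -> None:
        self.low = list(low)
        self.high = list(high)
        self.last_action: Optional[List[float]] = None

    @property
    def action_dim(self) -> int:
        return len(self.low)

    def set_action(self, action: Sequence[float]) -> None:
        self.last_action = list(action)


class ToyCombinedController:
    def __init__(self, controllers: Dict[str, ToySubController]) -> None:
        self.controllers = controllers
        self.low: List[float] = []
        self.high: List[float] = []
        self.slices: Dict[str, Tuple[int, int]] = {}
        start = 0
        for name, ctrl in controllers.items():
            end = start + ctrl.action_dim
            self.slices[name] = (start, end)  # remember where this group lives
            self.low += ctrl.low              # concatenate bounds in insertion order
            self.high += ctrl.high
            start = end

    def set_action(self, action: Sequence[float]) -> None:
        if len(action) != len(self.low):
            raise ValueError("action length must match the combined action dimension")
        for name, (start, end) in self.slices.items():
            self.controllers[name].set_action(action[start:end])


arm = ToySubController([-1.0] * 6, [1.0] * 6)   # e.g. an end-effector pose delta
gripper = ToySubController([-1.0], [1.0])       # e.g. a single mimic-joint command
combined = ToyCombinedController({"arm": arm, "gripper": gripper})
combined.set_action([0.1, 0.0, -0.1, 0.0, 0.0, 0.2, 1.0])
```

The policy sees one 7-dimensional bounded vector, while each sub-controller receives only its own slice.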
Related Pages
- Implementation:Haosulab_ManiSkill_BaseAgent -- Base agent class managing controller lifecycle.
- Implementation:Haosulab_ManiSkill_BaseController -- Abstract controller base class and composite controllers.
- Implementation:Haosulab_ManiSkill_PDJointPosController -- PD joint position controller.
- Implementation:Haosulab_ManiSkill_PDJointVelController -- PD joint velocity controller.
- Implementation:Haosulab_ManiSkill_PDJointPosVelController -- Combined PD joint position and velocity controller.
- Implementation:Haosulab_ManiSkill_PDEEPoseController -- End-effector pose controller using IK.
- Implementation:Haosulab_ManiSkill_PDBaseVelController -- Ego-centric base velocity controller.
- Implementation:Haosulab_ManiSkill_PassiveController -- No-op controller for unactuated joints.
- Implementation:Haosulab_ManiSkill_MultiAgent -- Multi-agent wrapper composing multiple agent-controller pairs.
- Implementation:Haosulab_ManiSkill_Kinematics -- IK solver used by end-effector controllers.