
Principle:Haosulab ManiSkill Controller Architecture

From Leeroopedia
Knowledge Sources
Domains: Robotics, Simulation, Control_Theory
Last Updated: 2026-02-15 08:00 GMT

Overview

A controller hierarchy translates high-level action commands (target positions, velocities, or end-effector poses) into low-level joint drive targets using proportional-derivative (PD) control laws, providing a uniform action interface across diverse robot morphologies.

Description

Robotics simulation requires a layer between the learning agent's action space and the physics engine's joint-level actuator interface. The Controller Architecture principle defines this layer as a hierarchy of controller classes. At the top sits a base controller abstraction that manages action space normalization, clipping, and the mapping between a flat action vector and the subset of joints each controller governs. Below it, specialized PD controllers implement specific control laws: joint position control (setting position drive targets), joint velocity control (setting velocity drive targets), combined position-velocity control, end-effector pose control (using inverse kinematics to convert Cartesian targets to joint targets), and base velocity control (for mobile robots). A passive controller variant applies no active control, allowing joints to be driven only by external forces or gravity.

This architecture solves several problems simultaneously. First, it enables the same robot to be operated under many different control modes simply by switching which controller configuration is active. Second, it normalizes action spaces so that policies can work with [-1, 1] bounded actions regardless of the underlying joint limits or force scales. Third, it handles the kinematic chain from Cartesian space to joint space when end-effector pose controllers are used, supporting both CPU (analytical) and GPU (batched) inverse kinematics backends. Fourth, it cleanly composes multiple controllers for robots with heterogeneous joint groups (e.g., an arm controller plus a gripper controller plus a base controller).

Usage

This principle applies whenever:

  • A robot must support multiple control modes (position, velocity, end-effector pose) and the user should be able to switch between them at environment creation time.
  • Action normalization is required so that reinforcement learning or imitation learning algorithms receive a bounded, well-scaled action space.
  • Inverse kinematics is needed to convert Cartesian end-effector commands into joint-level targets, supporting both CPU and GPU simulation backends.
  • Certain joints (e.g., passive wheels, unactuated fingers) must remain uncontrolled while other joints are actively driven.

Theoretical Basis

PD Control Law: The fundamental mechanism is the proportional-derivative control law applied at each actuated joint:

torque = Kp * (q_target - q_current) - Kd * q_dot_current

where Kp is the proportional (stiffness) gain, Kd is the derivative (damping) gain, q_target is the desired joint position, q_current is the current joint position, and q_dot_current is the current joint velocity. The physics engine (PhysX/SAPIEN) implements this as a drive: the controller sets the drive target and the engine computes and applies the resulting forces each simulation substep.
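The PD law above can be sketched for a single joint as follows. This is a minimal illustration, not the engine's implementation: the unit inertia, the time step, the gain values, and the function names `pd_torque` and `simulate_joint` are all assumptions chosen for the example.

```python
def pd_torque(kp, kd, q_target, q, q_dot):
    """PD law from the text: stiffness pulls toward the target, damping opposes motion."""
    return kp * (q_target - q) - kd * q_dot


def simulate_joint(kp, kd, q_target, steps=2000, dt=0.001, inertia=1.0):
    """Integrate one frictionless joint (hypothetical unit inertia) under the PD law."""
    q, q_dot = 0.0, 0.0
    for _ in range(steps):
        tau = pd_torque(kp, kd, q_target, q, q_dot)
        q_dot += (tau / inertia) * dt  # semi-implicit Euler: update velocity first
        q += q_dot * dt                # then position, using the new velocity
    return q, q_dot


# Gains chosen so the joint is critically damped (kd = 2 * sqrt(kp * inertia)).
q_final, q_dot_final = simulate_joint(kp=100.0, kd=20.0, q_target=0.5)
print(q_final, q_dot_final)
```

With these gains the joint settles at the target without overshoot. Lowering `kd` relative to `kp` makes the response oscillate before settling.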

Position Control: Sets the position drive target directly. The action can be an absolute joint position or a delta relative to the current position. Delta mode is common in RL because the action space is centered around zero.
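The difference between absolute and delta actions can be sketched as below. The function name `position_target` and the `use_delta` flag are illustrative assumptions, not a confirmed API.

```python
def position_target(action, q_current, use_delta):
    """Return the joint position drive target for one control step.

    use_delta=True  -> the action is an offset added to the current joint positions.
    use_delta=False -> the action is itself the absolute target.
    """
    if use_delta:
        return [q + a for q, a in zip(q_current, action)]
    return list(action)


q_now = [1.0, 2.0]
delta_target = position_target([0.1, -0.2], q_now, use_delta=True)
absolute_target = position_target([0.1, -0.2], q_now, use_delta=False)
print(delta_target, absolute_target)
```

In delta mode a zero action means "hold the current position", which is why the action space is centered on zero.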

Velocity Control: Sets the velocity drive target. The position target is unused (or set to the current position). This is useful for tasks requiring continuous motion like drawing or wiping.

Combined Position-Velocity Control: Sets both position and velocity drive targets simultaneously, giving the policy control over both the desired configuration and the desired speed of approach.
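Velocity and combined control can both be read as a drive with a position target and a velocity target. The sketch below assumes the common drive form in which the damping term acts on the velocity error; with a zero velocity target it reduces to the PD law given earlier. The function name `drive_torque` and the numeric gains are hypothetical.

```python
def drive_torque(kp, kd, q_target, q, qd_target, qd):
    """Drive with both targets: stiffness on position error, damping on velocity error."""
    return kp * (q_target - q) + kd * (qd_target - qd)


# Position control: velocity target is zero, so this matches the PD law in the text.
tau_pos = drive_torque(100.0, 20.0, q_target=0.5, q=0.0, qd_target=0.0, qd=1.0)

# Velocity control: zero stiffness (an assumption) leaves only the velocity term.
tau_vel = drive_torque(0.0, 20.0, q_target=0.0, q=0.3, qd_target=2.0, qd=1.0)

# Combined control: both errors contribute.
tau_both = drive_torque(100.0, 20.0, q_target=0.5, q=0.0, qd_target=2.0, qd=1.0)
print(tau_pos, tau_vel, tau_both)
```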

End-Effector Pose Control: Accepts a Cartesian pose (position + orientation) for the end-effector and uses inverse kinematics (IK) to compute the corresponding joint positions. The target can be given in absolute or delta form, and both position-only and full 6-DOF pose targets are supported. On GPU, a batched Jacobian-based IK solver runs in parallel across all environments.
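To make the Jacobian-based IK step concrete, here is a minimal position-only sketch for a planar two-link arm using damped least squares. It is a simplified stand-in, not the solver used by the library: the link lengths, damping, step gain, and all function names are assumptions, and a real solver would handle orientation and run batched across environments.

```python
import math

LINK1, LINK2 = 1.0, 1.0  # hypothetical link lengths


def forward_kinematics(q):
    """End-effector (x, y) of a planar two-link arm."""
    x = LINK1 * math.cos(q[0]) + LINK2 * math.cos(q[0] + q[1])
    y = LINK1 * math.sin(q[0]) + LINK2 * math.sin(q[0] + q[1])
    return x, y


def jacobian(q):
    """2x2 Jacobian of the end-effector position with respect to the joint angles."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-LINK1 * s1 - LINK2 * s12, -LINK2 * s12],
            [LINK1 * c1 + LINK2 * c12, LINK2 * c12]]


def dls_step(q, target, damping=0.05, gain=0.5):
    """One damped-least-squares update: dq = J^T (J J^T + damping^2 I)^-1 e."""
    x, y = forward_kinematics(q)
    ex, ey = target[0] - x, target[1] - y
    J = jacobian(q)
    # Entries of the symmetric 2x2 matrix A = J J^T + damping^2 I.
    a = J[0][0] ** 2 + J[0][1] ** 2 + damping ** 2
    b = J[0][0] * J[1][0] + J[0][1] * J[1][1]
    d = J[1][0] ** 2 + J[1][1] ** 2 + damping ** 2
    det = a * d - b * b
    # Solve A w = e in closed form, then map back to joint space with J^T.
    w0 = (d * ex - b * ey) / det
    w1 = (-b * ex + a * ey) / det
    dq0 = J[0][0] * w0 + J[1][0] * w1
    dq1 = J[0][1] * w0 + J[1][1] * w1
    return [q[0] + gain * dq0, q[1] + gain * dq1]


def solve_ik(q_init, target, iterations=100):
    q = list(q_init)
    for _ in range(iterations):
        q = dls_step(q, target)
    return q


target_xy = (1.2, 0.8)
q_solution = solve_ik([0.3, 0.5], target_xy)
print(q_solution, forward_kinematics(q_solution))
```

The damping term keeps the update bounded near singular configurations, at the cost of slightly slower convergence. In delta mode the target would be the current end-effector position plus the commanded offset.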

Base Velocity Control: For mobile robots, converts ego-centric linear and angular velocity commands into joint-level drive targets for the base wheels/actuators, handling the transformation from the robot's local frame to the world frame.
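The local-to-world transformation for a planar base amounts to rotating the commanded linear velocity by the base yaw. The sketch below assumes a base that moves in the plane and rotates about the vertical axis; the function name and argument order are hypothetical.

```python
import math


def ego_to_world_velocity(v_forward, v_lateral, yaw):
    """Rotate an egocentric planar velocity into the world frame.

    yaw is the base heading in radians, measured from the world x-axis.
    """
    c, s = math.cos(yaw), math.sin(yaw)
    vx_world = c * v_forward - s * v_lateral
    vy_world = s * v_forward + c * v_lateral
    return vx_world, vy_world


# A base facing along world +y (yaw = 90 degrees) driving "forward" moves along +y.
vx, vy = ego_to_world_velocity(1.0, 0.0, math.pi / 2)
print(vx, vy)
```

The angular velocity about the vertical axis is the same in both frames, so it passes through unchanged.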

Action Normalization: Controllers can rescale normalized actions in [-1, 1] to the actual joint limits or a configured action range, and clip actions to valid bounds before applying them. This gives the learning algorithm a bounded, consistently scaled action space.
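A minimal sketch of the clip-then-rescale step, assuming a linear map that sends -1 to the lower bound and +1 to the upper bound. The function name `denormalize_action` and the example bounds are illustrative.

```python
def denormalize_action(action, low, high):
    """Clip each component to [-1, 1], then map it linearly onto [low, high]."""
    scaled = []
    for a, lo, hi in zip(action, low, high):
        a = max(-1.0, min(1.0, a))  # clip out-of-range policy outputs
        scaled.append(0.5 * (hi + lo) + 0.5 * (hi - lo) * a)
    return scaled


# Hypothetical per-joint range of [-0.1, 0.3]; the last action is out of range.
low = [-0.1, -0.1, -0.1, -0.1]
high = [0.3, 0.3, 0.3, 0.3]
targets = denormalize_action([-1.0, 0.0, 1.0, 2.0], low, high)
print(targets)
```

An action of zero lands on the midpoint of the range, and anything beyond the normalized bounds saturates at the nearest limit.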

Controller Composition: Multiple controllers can be composed for different joint groups on the same robot. A "control mode" is a named combination of controllers (e.g., "pd_ee_delta_pose" might compose an EE pose controller for the arm with a mimic PD position controller for the gripper and a passive controller for unactuated joints).
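Composition can be pictured as slicing one flat action vector into per-controller segments. The sketch below is an assumption-laden illustration: the `SubController` class, the joint names, and the action dimensions are hypothetical and do not mirror the library's classes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SubController:
    name: str
    joint_names: List[str]
    action_dim: int  # number of entries this controller consumes from the flat action


def split_action(flat_action: List[float],
                 controllers: List[SubController]) -> Dict[str, List[float]]:
    """Slice a flat action vector into consecutive per-controller segments."""
    expected = sum(c.action_dim for c in controllers)
    if len(flat_action) != expected:
        raise ValueError(f"expected {expected} action values, got {len(flat_action)}")
    parts, start = {}, 0
    for c in controllers:
        parts[c.name] = flat_action[start:start + c.action_dim]
        start += c.action_dim
    return parts


# Hypothetical control mode: 6-D end-effector delta pose for a 7-joint arm,
# one shared command for two mimic gripper fingers, and a passive joint with no action.
control_mode = [
    SubController("arm", [f"joint{i}" for i in range(1, 8)], action_dim=6),
    SubController("gripper", ["finger_left", "finger_right"], action_dim=1),
    SubController("passive", ["caster_wheel"], action_dim=0),
]

parts = split_action([0.1] * 6 + [1.0], control_mode)
print(parts)
```

Note that a controller's action dimension need not equal its joint count: the end-effector controller takes six values for seven joints, and the passive controller takes none.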
