Principle:Haosulab ManiSkill Robot Agent Definition
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Simulation, Agent_Architecture |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
A robot agent definition is a declarative specification that binds a physical robot model to its control interface, default poses, and sensor configuration so that a simulation environment can instantiate and operate the robot without knowledge of its mechanical details.
Description
In robotics simulation, the environment must interact with many different robots -- manipulators, mobile bases, humanoids, quadrupeds, dexterous hands -- through a uniform interface. The Robot Agent Definition principle addresses this by establishing a base class contract that every robot must fulfill. Each agent declares: (1) the path to its kinematic and dynamic model (URDF or MJCF), (2) the set of available controller configurations mapping human-readable control mode names to controller class/parameter pairs, (3) one or more keyframes representing canonical poses (e.g., "rest" or "home" configurations with joint positions and base placement), and (4) optional sensor configurations for cameras or tactile sensors mounted on the robot body.
This principle solves the problem of robot interchangeability. A task environment written for "a gripper robot" can accept any agent that provides the expected interface -- a Panda, a Fetch, an xArm, or a WidowX -- without modification. The agent definition layer also handles practical concerns such as automatic asset downloading when a model file is not found locally, gravity compensation on robot links, mimic joint controllers for coupled finger joints, and proprioceptive state extraction (joint positions, joint velocities, and the pose of the tool center point, or TCP).
The design follows the Template Method pattern: the base class defines the lifecycle (load, reset, step, and observation through get_proprioception), while each concrete robot subclass fills in the declarative attributes and optionally overrides helper methods such as is_grasping or is_static.
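The split between a lifecycle-owning base class and declarative subclasses can be sketched as follows. This is a minimal, standard-library-only illustration, not ManiSkill's actual BaseAgent: the class names SketchBaseAgent and SketchTwoFingerArm, the attribute names, the placeholder model path, and the toy delta-position step are all assumptions made for the example.

```python
from typing import Dict, List


class SketchBaseAgent:
    """Hypothetical base class: it owns the lifecycle, not the robot details."""

    # Declarative attributes that each concrete robot is expected to fill in.
    uid: str = ""
    model_path: str = ""  # would point at a URDF or MJCF file
    rest_qpos: List[float] = []

    def __init__(self) -> None:
        self.qpos: List[float] = []
        self.qvel: List[float] = []
        self.load()

    def load(self) -> None:
        # A real loader would parse model_path; this sketch only sizes the state.
        self.qpos = [0.0] * len(self.rest_qpos)
        self.qvel = [0.0] * len(self.rest_qpos)

    def reset(self) -> None:
        # Return to the declared rest pose with zero joint velocities.
        self.qpos = list(self.rest_qpos)
        self.qvel = [0.0] * len(self.rest_qpos)

    def step(self, action: List[float]) -> None:
        # Toy controller (assumption): the action is a joint position delta.
        if len(action) != len(self.qpos):
            raise ValueError("action length must match the number of joints")
        self.qpos = [q + a for q, a in zip(self.qpos, action)]

    def get_proprioception(self) -> Dict[str, List[float]]:
        return {"qpos": list(self.qpos), "qvel": list(self.qvel)}

    # Optional helpers with defaults that subclasses may override.
    def is_static(self, threshold: float = 0.2) -> bool:
        return all(abs(v) <= threshold for v in self.qvel)

    def is_grasping(self) -> bool:
        return False


class SketchTwoFingerArm(SketchBaseAgent):
    """Hypothetical concrete robot: only declarations plus one override."""

    uid = "sketch_two_finger_arm"
    model_path = "path/to/robot.urdf"  # placeholder, not a real asset
    rest_qpos = [0.0, 0.5, -0.5, 0.04, 0.04]  # three arm joints, two fingers

    def is_grasping(self) -> bool:
        # Assumption: the last two joints are fingers, closed when near zero.
        return all(q < 0.01 for q in self.qpos[-2:])
```

The subclass contains no lifecycle code at all, which is the point of the pattern: a caller can construct it, call reset, step, and get_proprioception, and never see the robot-specific details.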
Usage
This principle applies whenever a new physical robot needs to be introduced into the simulation. It is the correct design choice when:
- A new manipulator, hand, humanoid, or quadruped must be made available to existing task environments without modifying those environments.
- Multiple control modes (joint position, joint velocity, end-effector pose, base velocity) need to be offered for the same robot through a single configuration dictionary.
- Predefined keyframe poses are needed for deterministic episode resets or motion planning start configurations.
- On-body sensors (wrist cameras, tactile arrays) must be declared as part of the robot specification rather than the environment specification.
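To make the "single configuration dictionary" idea concrete, the sketch below maps control mode names to per-joint-group controller configurations. It is an illustration under stated assumptions rather than ManiSkill's controller API: SketchControllerConfig, the joint names, the mode names, and the gain values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SketchControllerConfig:
    """Hypothetical controller description for one joint group."""

    controller: str  # controller kind, e.g. "pd_joint_pos"
    joint_names: Tuple[str, ...]
    stiffness: float
    damping: float
    force_limit: float


# Hypothetical joint names for a 7-joint arm with a two-finger gripper.
ARM_JOINTS = tuple(f"arm_joint{i}" for i in range(1, 8))
GRIPPER_JOINTS = ("finger_joint1", "finger_joint2")

# Gains are illustrative placeholders, not tuned values for any real robot.
ARM_POS = SketchControllerConfig("pd_joint_pos", ARM_JOINTS, 1000.0, 100.0, 100.0)
ARM_VEL = SketchControllerConfig("pd_joint_vel", ARM_JOINTS, 0.0, 100.0, 100.0)
GRIPPER_POS = SketchControllerConfig(
    "pd_joint_pos", GRIPPER_JOINTS, 1000.0, 100.0, 100.0
)

# One dictionary offers every control mode; each mode composes joint groups.
CONTROL_MODES: Dict[str, Dict[str, SketchControllerConfig]] = {
    "pd_joint_pos": {"arm": ARM_POS, "gripper": GRIPPER_POS},
    "pd_joint_vel": {"arm": ARM_VEL, "gripper": GRIPPER_POS},
}


def action_dim(mode: str) -> int:
    """Count one action entry per controlled joint (a simplification)."""
    return sum(len(cfg.joint_names) for cfg in CONTROL_MODES[mode].values())
```

An environment would then select a mode by name, for example action_dim("pd_joint_pos"), without knowing which joints or gains sit behind it. A real gripper with mimic joints would typically expose fewer action entries than joints; the sketch ignores that.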
Theoretical Basis
The Robot Agent Definition combines a declarative specification with a lifecycle protocol:
1. Model Declaration: Each agent specifies a file path to a URDF (Unified Robot Description Format) or MJCF (MuJoCo XML Format) file. This file encodes the kinematic tree (links, joints, joint limits), dynamic parameters (masses, inertias), collision geometry, and visual meshes. The simulation engine parses this file to instantiate a fully articulated rigid-body system.
2. Controller Configuration Map: The agent provides a dictionary mapping string names to controller configurations. Each configuration specifies the controller class (e.g., PD joint position, PD end-effector pose, PD base velocity), the joints it controls, PD gains (stiffness and damping), force limits, action scaling, and normalization bounds. Multiple configurations can be composed for different joint groups (arm, gripper, base) and combined into named control modes.
3. Keyframe Specification: A keyframe is a tuple of (base_pose, joint_positions, joint_velocities) representing a known-good robot state. Keyframes serve as reset targets and motion planning start states. They decouple the notion of a "home position" from the task environment.
4. Sensor Declaration: Agents can declare sensor configurations (camera intrinsics, mount links, depth parameters) that travel with the robot. This ensures that robot-mounted sensors are consistently configured regardless of which task environment the robot is placed in.
5. Lifecycle Protocol:
- Load: Parse the model file, build the articulation, instantiate controllers, apply material properties.
- Reset: Set joints to a keyframe configuration, zero velocities, reset controller state.
- Step: Accept an action vector, delegate to the active controller, apply drive targets.
- Observe: Return proprioceptive state (joint positions, velocities, end-effector pose).
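The keyframe and lifecycle items above can be sketched together. The example assumes a delta-position controller with normalized actions in [-1, 1]; the names SketchKeyframe, SketchLifecycleAgent, and denormalize, the base pose layout, and the delta bounds are hypothetical and are not taken from ManiSkill.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SketchKeyframe:
    """Hypothetical keyframe: a known-good robot state."""

    base_pose: Tuple[float, ...]  # assumed layout: x, y, z, qw, qx, qy, qz
    qpos: Tuple[float, ...]
    qvel: Tuple[float, ...]


def denormalize(action: float, low: float, high: float) -> float:
    """Clip a normalized action to [-1, 1], then map it linearly to [low, high]."""
    clipped = max(-1.0, min(1.0, action))
    return low + (clipped + 1.0) * 0.5 * (high - low)


class SketchLifecycleAgent:
    """Toy agent that walks through load, reset, step, and observe."""

    def __init__(
        self,
        keyframes: Dict[str, SketchKeyframe],
        delta_low: float,
        delta_high: float,
    ) -> None:
        # Load: a real agent would also build the articulation here.
        self.keyframes = keyframes
        self.delta_low = delta_low
        self.delta_high = delta_high
        self.reset(next(iter(keyframes)))  # start from the first keyframe

    def reset(self, name: str) -> None:
        # Reset: copy the keyframe state and clear the controller targets.
        keyframe = self.keyframes[name]
        self.base_pose = keyframe.base_pose
        self.qpos = list(keyframe.qpos)
        self.qvel = list(keyframe.qvel)
        self.targets = list(keyframe.qpos)

    def step(self, action: List[float]) -> None:
        # Step: each normalized entry becomes a bounded offset on its target.
        if len(action) != len(self.targets):
            raise ValueError("action length must match the number of joints")
        self.targets = [
            target + denormalize(a, self.delta_low, self.delta_high)
            for target, a in zip(self.targets, action)
        ]

    def observe(self) -> Dict[str, List[float]]:
        # Observe: return copies so callers cannot mutate internal state.
        return {
            "qpos": list(self.qpos),
            "qvel": list(self.qvel),
            "targets": list(self.targets),
        }
```

Only drive targets change in step; in a real simulator the physics engine would then move qpos toward those targets over the following simulation steps.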
This design keeps the cost of integrating a new robot independent of the number of task environments: define the class attributes once, and the robot is immediately usable in any compatible environment without changes to that environment.
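One way to picture this is a UID registry that environments consult by name. The sketch below is a self-contained illustration and not ManiSkill's registration API: register_sketch_agent, AGENT_REGISTRY, the agent classes, and SketchReachEnv are hypothetical names.

```python
from typing import Dict, List, Type

# Hypothetical global registry mapping agent UIDs to agent classes.
AGENT_REGISTRY: Dict[str, Type["SketchAgent"]] = {}


class SketchAgent:
    """Hypothetical minimal agent: declarative attributes plus joint state."""

    uid: str = ""
    rest_qpos: List[float] = []

    def __init__(self) -> None:
        self.qpos = list(self.rest_qpos)


def register_sketch_agent(cls: Type[SketchAgent]) -> Type[SketchAgent]:
    """Class decorator that records an agent class under its declared uid."""
    if not cls.uid:
        raise ValueError("agent classes must declare a non-empty uid")
    if cls.uid in AGENT_REGISTRY:
        raise KeyError(f"agent uid already registered: {cls.uid}")
    AGENT_REGISTRY[cls.uid] = cls
    return cls


@register_sketch_agent
class SketchArmA(SketchAgent):
    uid = "sketch_arm_a"
    rest_qpos = [0.0] * 7


@register_sketch_agent
class SketchArmB(SketchAgent):
    uid = "sketch_arm_b"
    rest_qpos = [0.0] * 6


class SketchReachEnv:
    """Hypothetical environment that knows only the agent interface."""

    def __init__(self, robot_uid: str) -> None:
        # Look the robot up by name; no robot-specific code lives here.
        self.agent = AGENT_REGISTRY[robot_uid]()

    def reset(self) -> List[float]:
        self.agent.qpos = list(self.agent.rest_qpos)
        return list(self.agent.qpos)
```

Adding SketchArmB required no change to SketchReachEnv; constructing SketchReachEnv("sketch_arm_b") works as soon as the class is defined and registered.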
Related Pages
- Implementation:Haosulab_ManiSkill_BaseAgent -- The abstract base class all robot agents inherit from.
- Implementation:Haosulab_ManiSkill_Panda -- Franka Emika Panda 7-DOF arm with parallel-jaw gripper.
- Implementation:Haosulab_ManiSkill_PandaStick -- Panda variant with a stick end-effector instead of gripper.
- Implementation:Haosulab_ManiSkill_PandaWristCam -- Panda variant with wrist-mounted RealSense camera.
- Implementation:Haosulab_ManiSkill_Fetch -- Fetch mobile manipulator with base, torso, arm, and gripper.
- Implementation:Haosulab_ManiSkill_AllegroHand -- Allegro 16-DOF dexterous hand (left and right variants).
- Implementation:Haosulab_ManiSkill_AllegroHandTouch -- Allegro hand variant with tactile touch sensors.
- Implementation:Haosulab_ManiSkill_FloatingPandaGripper -- Floating Panda gripper with 6-DOF base control.
- Implementation:Haosulab_ManiSkill_FloatingRobotiq2F85Gripper -- Floating Robotiq 2F-85 gripper with 6-DOF base control.
- Implementation:Haosulab_ManiSkill_FloatingAbilityHand -- Floating PSYONIC Ability Hand with 6-DOF base control.
- Implementation:Haosulab_ManiSkill_FloatingInspireHand -- Floating Inspire Hand with 6-DOF base control.
- Implementation:Haosulab_ManiSkill_FixedInspireHand -- Fixed-base Inspire Hand (left and right variants).
- Implementation:Haosulab_ManiSkill_GoogleRobot -- Google Robot mobile manipulator.
- Implementation:Haosulab_ManiSkill_DClaw -- D'Claw tri-finger gripper robot.
- Implementation:Haosulab_ManiSkill_TriFingerPro -- TriFinger Pro 9-DOF tri-finger robot.
- Implementation:Haosulab_ManiSkill_ANYmalC -- ANYmal C quadruped robot.
- Implementation:Haosulab_ManiSkill_UnitreeGo2 -- Unitree Go2 quadruped robot.
- Implementation:Haosulab_ManiSkill_UnitreeH1 -- Unitree H1 humanoid robot.
- Implementation:Haosulab_ManiSkill_UnitreeH1WithHands -- Unitree H1 with dexterous hands (upper body).
- Implementation:Haosulab_ManiSkill_UnitreeG1 -- Unitree G1 humanoid robot.
- Implementation:Haosulab_ManiSkill_UnitreeG1UpperBody -- Unitree G1 upper body variant.
- Implementation:Haosulab_ManiSkill_Humanoid -- Classic MuJoCo humanoid robot.
- Implementation:Haosulab_ManiSkill_Stompy -- K-Scale Labs Stompy humanoid robot.
- Implementation:Haosulab_ManiSkill_Koch -- Koch v1.1 low-cost 6-DOF robot arm.
- Implementation:Haosulab_ManiSkill_SO100 -- SO-100 low-cost 6-DOF robot arm.
- Implementation:Haosulab_ManiSkill_WidowX250S -- Interbotix WidowX 250S robot arm.
- Implementation:Haosulab_ManiSkill_WidowXAI -- WidowX AI robot arm.
- Implementation:Haosulab_ManiSkill_WidowXAIWristCam -- WidowX AI with wrist camera.
- Implementation:Haosulab_ManiSkill_UR10e -- Universal Robots UR10e industrial arm.
- Implementation:Haosulab_ManiSkill_XArm6NoGripper -- xArm6 without gripper (and wrist camera variant).
- Implementation:Haosulab_ManiSkill_XArm6Robotiq -- xArm6 with Robotiq 2F-85 gripper.
- Implementation:Haosulab_ManiSkill_XArm7Ability -- xArm7 with PSYONIC Ability Hand.
- Implementation:Haosulab_ManiSkill_XLeRobot -- XLeRobot universal LeRobot-compatible agent.
- Implementation:Haosulab_ManiSkill_MultiAgent -- Multi-agent wrapper managing multiple BaseAgent instances.
- Implementation:Haosulab_ManiSkill_AgentRegistration -- Agent UID registration system.
- Implementation:Haosulab_ManiSkill_BaseRealAgent_LeRobotAgent -- Base class for real-world robot agents.