Principle:Google deepmind Dm control Robot Configuration

Metadata
Knowledge Sources	dm_control
Domains	Reinforcement Learning, Robotics Simulation, Robot Modelling
Last Updated	2026-02-15 00:00 GMT

Overview

Robot configuration is the principle of assembling a simulated robotic manipulator from composable arm and hand entities whose joints, actuators, sensors, and observables are declaratively defined and can be customised through observation settings.

Description

A manipulation task requires a robot: a multi-joint arm that positions the end effector and a hand (gripper) that interacts with objects. Rather than hard-coding the entire robot in a monolithic model, the composer entity pattern splits the robot into discrete, reusable components:

Arm entity -- encapsulates the kinematic chain from the base to the wrist, including joints, velocity actuators, torque sensors, and proprioceptive observables (joint positions, velocities, torques).
Hand entity -- encapsulates the gripper mechanism attached at the wrist site, including finger joints, finger actuators, grasp control, and hand-specific observables (finger positions, pinch-site pose).

Each entity is backed by an MJCF (MuJoCo XML) model file that describes the physical geometry, inertia, and constraints. At construction time, velocity actuators are programmatically added to each joint with physically-motivated torque and velocity limits derived from manufacturer datasheets.

The two entities are attached at the arm's wrist site, and the combined robot is placed in an arena at a configurable offset. Observation settings determine which observables are enabled and how they are processed (e.g. buffering, delay, corruptor functions).

A factory-function layer provides manipulation-specific defaults so that task modules can construct a robot in a single call without specifying every parameter.

Usage

Robot configuration is performed once per task instantiation. The factory functions are called inside each task's constructor (e.g. reach, lift, place) and produce arm/hand entities that are then wired into the arena and physics engine.

Theoretical Basis

The robot can be modelled as a serial kinematic chain:

Base -> Joint_1 -> Joint_2 -> ... -> Joint_6 -> Wrist_Site -> Hand -> Finger_1..3

For each joint j:
    actuator(j) = velocity_actuator(
        gain     = kv,
        ctrl     in [-max_velocity, +max_velocity],
        force    in [-max_torque, +max_torque]
    )

Observation vector:
    arm_obs  = [sin(q), cos(q)]  for each arm joint q  (bounded representation)
    hand_obs = [finger_positions, finger_velocities, pinch_site_pos, pinch_site_rmat]

Using sine/cosine pairs for unlimited rotary joints ensures that observations remain bounded, which is beneficial for neural network function approximators. Torque observations are projected onto the joint axis via an einsum operation to collapse the 3-axis MuJoCo torque sensor into a scalar per joint.

Related Pages

Implementation:Google_deepmind_Dm_control_Jaco_Arm_Hand

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment