Implementation:Google deepmind Dm control Suite Manipulator
| Metadata | Value |
|---|---|
| Implementation | Suite Manipulator |
| Domain | Reinforcement_Learning, Control |
| Source | Google_deepmind_Dm_control |
| Last Updated | 2026-02-15 04:00 GMT |
Overview
Concrete tool, provided by the dm_control Control Suite, for controlling a planar robotic arm that must bring an object to a target or insert it into a receptacle.
Description
The Manipulator domain models a planar robotic arm with 8 joints (root, shoulder, elbow, wrist, finger, fingertip, thumb, and thumbtip) that must grasp and manipulate objects. The domain supports two prop types (ball and peg) and two task modes (bring and insert). The make_model function dynamically generates the MJCF XML by selectively removing unused props and receptacles from the base model. For bring tasks, only the prop and its target are kept; for insert tasks, the corresponding receptacle (cup for ball, slot for peg) is also included.
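The prop selection described above can be sketched in plain Python. This is an illustration only, not the library's code: the helper names required_props and unused_props and the constant ALL_PROPS are hypothetical, and the prop identifiers are assumed from the description rather than read from the MJCF file.

```python
# Hypothetical prop identifiers, assumed from the description above.
ALL_PROPS = frozenset(
    ['ball', 'target_ball', 'cup', 'peg', 'target_peg', 'slot'])


def required_props(use_peg, insert):
    """Returns the set of props that a given task variant keeps."""
    if use_peg:
        props = {'peg', 'target_peg'}
        if insert:
            props.add('slot')  # receptacle for the peg
    else:
        props = {'ball', 'target_ball'}
        if insert:
            props.add('cup')  # receptacle for the ball
    return props


def unused_props(use_peg, insert):
    """Returns the props that would be removed from the base model."""
    return ALL_PROPS - required_props(use_peg, insert)


# Example: the insert_peg variant keeps the peg, its target, and the slot.
print(sorted(required_props(use_peg=True, insert=True)))
print(sorted(unused_props(use_peg=True, insert=True)))
```

In the real make_model, the equivalent of unused_props would drive the removal of the matching elements from the XML tree before the model is serialized.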
The Physics subclass provides methods for reading bounded joint positions (as sin/cos pairs), joint velocities, 2D body poses (position and optional orientation), logarithmically scaled touch sensor signals from five contact sensors, and the Euclidean distance between named sites. The Bring task class handles all four task variants, parameterized by use_peg, insert, and fully_observable flags.
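The three encodings mentioned above can be illustrated with the standard library alone. This is a sketch under stated assumptions: the functions below take plain Python lists instead of reading MuJoCo data, and the log(1 + x) form of the touch scaling is an assumption about what "logarithmically scaled" means here.

```python
import math


def bounded_joint_pos(angles):
    """Encodes each joint angle (radians) as a (sin, cos) pair."""
    return [(math.sin(a), math.cos(a)) for a in angles]


def log_scaled_touch(raw_readings):
    """Compresses non-negative sensor readings with log(1 + x) (assumed form)."""
    return [math.log1p(r) for r in raw_readings]


def site_distance(p1, p2):
    """Euclidean distance between two points of equal dimension."""
    return math.dist(p1, p2)


# An angle and the same angle plus a full turn map to (nearly) the same pair,
# which is why the (sin, cos) encoding is bounded and free of wrap-around.
print(bounded_joint_pos([0.5, 0.5 + 2 * math.pi]))
print(log_scaled_touch([0.0, 1.0, 100.0]))
print(site_distance((0.0, 0.0), (3.0, 4.0)))
```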
Four tasks are registered, all tagged as hard: bring_ball (additionally tagged as benchmarking), bring_peg, insert_ball, and insert_peg. Episode initialization randomizes the arm joint angles, the target location, and the object location. The object starts in the hand (10% probability), at the target (10% probability), or at a random location (80% probability). The peg reward combines grasping and bringing sub-rewards; the ball reward measures proximity to the target. All tasks use a control timestep of 0.01 seconds and a default time limit of 10 seconds.
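The initialization probabilities and timing figures above can be made concrete with a short sketch. The constant and function names are hypothetical, and the standard library random module stands in for the task's own random state; only the numeric values come from the text.

```python
import random

P_IN_HAND = 0.1          # object starts in the hand
P_IN_TARGET = 0.1        # object starts at the target
CONTROL_TIMESTEP = 0.01  # seconds per control step
TIME_LIMIT = 10          # seconds per episode (default)


def sample_start_mode(rng):
    """Draws one of the three object placement modes."""
    modes = ['in_hand', 'in_target', 'uniform']
    weights = [P_IN_HAND, P_IN_TARGET, 1 - P_IN_HAND - P_IN_TARGET]
    return rng.choices(modes, weights=weights, k=1)[0]


def steps_per_episode():
    """Number of control steps that fit in the default time limit."""
    return round(TIME_LIMIT / CONTROL_TIMESTEP)


rng = random.Random(0)  # seeded for repeatability
print(sample_start_mode(rng))
print(steps_per_episode())
```

With these defaults an episode spans 1000 control steps, and roughly four out of five episodes begin with the object at a random location.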
Usage
Use this implementation for challenging manipulation benchmarks involving grasping and placement. Load via suite.load(domain_name='manipulator', task_name='bring_ball') or any of the other registered task names.
Code Reference
Source Location
- Repository: Google_deepmind_Dm_control
- File: dm_control/suite/manipulator.py
- Lines: 1-288
Signature
# Task factory functions
def bring_ball(fully_observable=True, time_limit=10, random=None,
               environment_kwargs=None)
def bring_peg(fully_observable=True, time_limit=10, random=None,
              environment_kwargs=None)
def insert_ball(fully_observable=True, time_limit=10, random=None,
                environment_kwargs=None)
def insert_peg(fully_observable=True, time_limit=10, random=None,
               environment_kwargs=None)

# Model generation
def make_model(use_peg, insert)

# Physics subclass
class Physics(mujoco.Physics):
    def bounded_joint_pos(self, joint_names)  # joint positions as (sin, cos)
    def joint_vel(self, joint_names)  # joint velocities
    def body_2d_pose(self, body_names, orientation=True)  # 2D pose
    def touch(self)  # log-scaled touch sensors
    def site_distance(self, site1, site2)  # Euclidean distance between sites

# Task class
class Bring(base.Task):
    def __init__(self, use_peg, insert, fully_observable, random=None)
    def initialize_episode(self, physics)
    def get_observation(self, physics)
    def get_reward(self, physics)
Import
from dm_control import suite
env = suite.load(domain_name='manipulator', task_name='bring_ball')
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| fully_observable | bool | No | Whether observations include object and target state (default True). |
| time_limit | float | No | Maximum episode duration in seconds (default 10). |
| random | int, numpy.random.RandomState, or None | No | Random seed or RNG instance for reproducibility. |
| environment_kwargs | dict or None | No | Additional keyword arguments forwarded to the Environment constructor. |
Outputs
| Name | Type | Description |
|---|---|---|
| environment | dm_control.rl.control.Environment | A fully initialised environment conforming to the dm_env.Environment interface. |
Observations
| Key | Type | Description |
|---|---|---|
| arm_pos | numpy array (8, 2) | Arm joint positions as (sin, cos) pairs. |
| arm_vel | numpy array (8,) | Arm joint velocities. |
| touch | numpy array (5,) | Log-scaled signals from palm, finger, thumb, fingertip, and thumbtip sensors. |
| hand_pos | numpy array (4,) | Hand 2D pose with orientation (fully observable mode only). |
| object_pos | numpy array (4,) | Object 2D pose with orientation (fully observable mode only). |
| object_vel | numpy array (3,) | Object joint velocities (fully observable mode only). |
| target_pos | numpy array (4,) | Target 2D pose with orientation (fully observable mode only). |
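When an agent consumes the observation as a single flat vector, its length follows directly from the shapes listed in the table. The sketch below only does that arithmetic; the dictionary and function names are hypothetical, and the shapes are copied from the table rather than queried from a live environment.

```python
import math

# Shapes copied from the Observations table.
SENSOR_SHAPES = {'arm_pos': (8, 2), 'arm_vel': (8,), 'touch': (5,)}
STATE_SHAPES = {
    'hand_pos': (4,),
    'object_pos': (4,),
    'object_vel': (3,),
    'target_pos': (4,),
}


def flat_size(fully_observable=True):
    """Total number of scalars across all observation entries."""
    shapes = dict(SENSOR_SHAPES)
    if fully_observable:
        shapes.update(STATE_SHAPES)  # only present in fully observable mode
    return sum(math.prod(shape) for shape in shapes.values())


print(flat_size(fully_observable=True))   # 16 + 8 + 5 + 4 + 4 + 3 + 4 = 44
print(flat_size(fully_observable=False))  # 16 + 8 + 5 = 29
```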
Usage Examples
from dm_control import suite
# Load the bring ball task
env = suite.load(domain_name='manipulator', task_name='bring_ball')
# Run an episode
time_step = env.reset()
while not time_step.last():
    action = env.action_spec().generate_value()
    time_step = env.step(action)
# Load the insert peg task (hard)
env_insert = suite.load(domain_name='manipulator', task_name='insert_peg')
# Load with sensor-only observations (no object/target state)
env_partial = suite.load(
    domain_name='manipulator',
    task_name='bring_ball',
    task_kwargs={'fully_observable': False}
)