Implementation:Google deepmind Dm control Suite Manipulator

Metadata	Value
Implementation	Suite Manipulator
Domain	Reinforcement_Learning, Control
Source	Google_deepmind_Dm_control
Last Updated	2026-02-15 04:00 GMT

Overview

Concrete tool, provided by the dm_control Control Suite, for controlling a planar robotic arm that must bring objects to a target or insert them into a receptacle.

Description

The Manipulator domain models a planar robotic arm with 8 joints (root, shoulder, elbow, wrist, finger, fingertip, thumb, and thumbtip) that must grasp and manipulate objects. The domain supports two prop types (ball and peg) and two task modes (bring and insert). The make_model function dynamically generates the MJCF XML by selectively removing unused props and receptacles from the base model. For bring tasks, only the prop and its target are kept; for insert tasks, the corresponding receptacle (cup for ball, slot for peg) is also included.
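The prop selection described above can be sketched as follows. This is a minimal illustration that uses the standard library's xml.etree.ElementTree on a toy XML string. The body names (ball, target_ball, cup, peg, target_peg, slot), the helper names required_props, prune_props, and body_names, and the toy model itself are assumptions made for illustration; they are not the contents of the real model file or the library's make_model.

```python
import xml.etree.ElementTree as ET

# Assumed body names for illustration; the real model file may differ.
ALL_PROPS = frozenset(
    ['ball', 'target_ball', 'cup', 'peg', 'target_peg', 'slot'])

# Toy stand-in for the base model (hypothetical, not the real MJCF).
TOY_MJCF = """<mujoco>
  <worldbody>
    <body name="arm"/>
    <body name="ball"/>
    <body name="target_ball"/>
    <body name="cup"/>
    <body name="peg"/>
    <body name="target_peg"/>
    <body name="slot"/>
  </worldbody>
</mujoco>"""


def required_props(use_peg, insert):
    """Returns the props kept for a given task variant."""
    if use_peg:
        props = ['peg', 'target_peg']
        receptacle = 'slot'
    else:
        props = ['ball', 'target_ball']
        receptacle = 'cup'
    if insert:
        # Insert tasks additionally keep the matching receptacle.
        props.append(receptacle)
    return props


def prune_props(xml_string, use_peg, insert):
    """Removes every prop body that the task variant does not need."""
    root = ET.fromstring(xml_string)
    unused = ALL_PROPS.difference(required_props(use_peg, insert))
    # ElementTree has no parent pointers, so scan each parent's children.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == 'body' and child.get('name') in unused:
                parent.remove(child)
    return ET.tostring(root, encoding='unicode')


def body_names(xml_string):
    """Lists the names of all bodies left in the model, in document order."""
    return [b.get('name') for b in ET.fromstring(xml_string).iter('body')]
```

Calling prune_props(TOY_MJCF, use_peg=True, insert=True) on the toy model leaves the arm, the peg, its target, and the slot, while the ball, its target, and the cup are dropped.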

The Physics subclass provides methods for reading bounded joint positions (as sin/cos pairs), joint velocities, 2D body poses (position and optional orientation), logarithmically scaled touch sensor signals from five contact sensors, and the Euclidean distance between named sites. The Bring task class handles all four task variants, parameterized by use_peg, insert, and fully_observable flags.
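Three of the transformations named above are easy to illustrate in isolation. The following is a minimal pure-Python sketch on plain lists and tuples, not the library's code, which operates on NumPy arrays read from the simulation state. The function log_scaled_touch is a hypothetical name, and the use of log(1 + x) as the logarithmic scaling is an assumption, chosen here because it maps a zero reading to zero.

```python
import math


def bounded_joint_pos(joint_angles):
    """Encodes each angle (radians) as a (sin, cos) pair in [-1, 1]."""
    return [(math.sin(q), math.cos(q)) for q in joint_angles]


def log_scaled_touch(sensor_values):
    """Compresses non-negative sensor readings with log(1 + x) (assumed)."""
    return [math.log1p(v) for v in sensor_values]


def site_distance(pos1, pos2):
    """Euclidean distance between two positions of equal dimension."""
    return math.dist(pos1, pos2)
```

The (sin, cos) encoding keeps the observation bounded and continuous when a joint angle wraps around a full turn, which a raw angle would not be.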

Four tasks are registered: bring_ball, bring_peg, insert_ball, and insert_peg. All four are tagged as hard, and bring_ball is additionally tagged as benchmarking. Episode initialization randomizes the arm joint angles, the target location, and the object location. The object starts in the hand (10% probability), at the target (10% probability), or at a random location (80% probability). The peg reward combines grasping and bringing sub-rewards; the ball reward measures proximity to the target. All tasks use a control timestep of 0.01 seconds and a default time limit of 10 seconds.
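The start-state probabilities and the episode length implied by the timing figures above can be sketched with the standard library. This illustrates the stated 10% / 10% / 80% split only; the mode labels (in_hand, in_target, uniform), the constant names, and the helper functions are hypothetical, and the real task draws from its own random state rather than from random.Random.

```python
import random

P_IN_HAND = 0.1    # probability that the object starts in the hand
P_IN_TARGET = 0.1  # probability that the object starts at the target
P_UNIFORM = 1.0 - P_IN_HAND - P_IN_TARGET  # remainder: random location


def sample_initial_state(rng):
    """Draws one of the three object start modes with the stated odds."""
    return rng.choices(
        ['in_hand', 'in_target', 'uniform'],
        weights=[P_IN_HAND, P_IN_TARGET, P_UNIFORM],
    )[0]


def empirical_frequencies(num_episodes, seed=0):
    """Estimates how often each start mode occurs over many resets."""
    rng = random.Random(seed)
    counts = {'in_hand': 0, 'in_target': 0, 'uniform': 0}
    for _ in range(num_episodes):
        counts[sample_initial_state(rng)] += 1
    return {mode: n / num_episodes for mode, n in counts.items()}


# Control steps per episode: 10 s time limit / 0.01 s control timestep.
STEPS_PER_EPISODE = round(10 / 0.01)
```

Because only about one reset in ten starts with the object already in the hand or at the target, most episodes begin with the object at a random location, well away from either.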

Usage

Use this implementation for challenging manipulation benchmarks involving grasping and placement. Load via suite.load(domain_name='manipulator', task_name='bring_ball') or any of the other registered task names.

Code Reference

Source Location

Repository: Google_deepmind_Dm_control
File: dm_control/suite/manipulator.py
Lines: 1-288

Signature

# Task factory functions
def bring_ball(fully_observable=True, time_limit=10, random=None,
               environment_kwargs=None)
def bring_peg(fully_observable=True, time_limit=10, random=None,
              environment_kwargs=None)
def insert_ball(fully_observable=True, time_limit=10, random=None,
                environment_kwargs=None)
def insert_peg(fully_observable=True, time_limit=10, random=None,
               environment_kwargs=None)

# Model generation
def make_model(use_peg, insert)

# Physics subclass
class Physics(mujoco.Physics):
    def bounded_joint_pos(self, joint_names)  # joint positions as (sin, cos)
    def joint_vel(self, joint_names)          # joint velocities
    def body_2d_pose(self, body_names, orientation=True)  # 2D pose
    def touch(self)                            # log-scaled touch sensors
    def site_distance(self, site1, site2)      # Euclidean distance between sites

# Task class
class Bring(base.Task):
    def __init__(self, use_peg, insert, fully_observable, random=None)
    def initialize_episode(self, physics)
    def get_observation(self, physics)
    def get_reward(self, physics)

Import

from dm_control import suite

env = suite.load(domain_name='manipulator', task_name='bring_ball')

I/O Contract

Inputs

Name	Type	Required	Description
`fully_observable`	bool	No	Whether observations include hand, object, and target state in addition to the arm and touch sensors (default True).
`time_limit`	float	No	Maximum episode duration in seconds (default 10).
`random`	int, numpy.random.RandomState, or None	No	Random seed or RNG instance for reproducibility.
`environment_kwargs`	dict or None	No	Additional keyword arguments forwarded to the `Environment` constructor.

Outputs

Name	Type	Description
`environment`	`dm_control.rl.control.Environment`	A fully initialised environment conforming to the `dm_env.Environment` interface.

Observations

Key	Type	Description
`arm_pos`	numpy array (8, 2)	Arm joint positions as (sin, cos) pairs.
`arm_vel`	numpy array (8,)	Arm joint velocities.
`touch`	numpy array (5,)	Log-scaled signals from palm, finger, thumb, fingertip, and thumbtip sensors.
`hand_pos`	numpy array (4,)	Hand 2D pose with orientation (fully observable mode only).
`object_pos`	numpy array (4,)	Object 2D pose with orientation (fully observable mode only).
`object_vel`	numpy array (3,)	Object joint velocities (fully observable mode only).
`target_pos`	numpy array (4,)	Target 2D pose with orientation (fully observable mode only).
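For agents that consume a flat vector, the total observation size follows directly from the shapes in the table. The sketch below only does that arithmetic; the dictionary and function names are hypothetical, and the shapes are copied from the table rather than read from a live environment.

```python
import math

# Shapes copied from the observation table above.
SENSOR_SHAPES = {'arm_pos': (8, 2), 'arm_vel': (8,), 'touch': (5,)}
STATE_SHAPES = {
    'hand_pos': (4,),
    'object_pos': (4,),
    'object_vel': (3,),
    'target_pos': (4,),
}


def observation_shapes(fully_observable):
    """Returns the observation keys and shapes for the chosen mode."""
    shapes = dict(SENSOR_SHAPES)
    if fully_observable:
        shapes.update(STATE_SHAPES)
    return shapes


def flat_size(shapes):
    """Number of scalars after flattening and concatenating every entry."""
    return sum(math.prod(shape) for shape in shapes.values())
```

With these shapes, the sensor-only mode flattens to 16 + 8 + 5 = 29 values, and the fully observable mode adds 4 + 4 + 3 + 4 = 15 more for a total of 44.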

Usage Examples

from dm_control import suite

# Load the bring ball task
env = suite.load(domain_name='manipulator', task_name='bring_ball')

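# Run one episode with a constant placeholder action
action_spec = env.action_spec()
time_step = env.reset()
while not time_step.last():
    # generate_value() returns a fixed action that satisfies the spec;
    # replace it with a policy's output for real experiments.
    action = action_spec.generate_value()
    time_step = env.step(action)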

# Load the insert peg task (hard)
env_insert = suite.load(domain_name='manipulator', task_name='insert_peg')

# Load with sensor-only observations (no object/target state)
env_partial = suite.load(
    domain_name='manipulator',
    task_name='bring_ball',
    task_kwargs={'fully_observable': False}
)

Related Pages

Principle:Google_deepmind_Dm_control_Control_Suite_Environment_Loading
