Implementation:Google deepmind Dm control Suite Point Mass

Metadata	Value
Implementation	Suite Point Mass
Domain	Reinforcement_Learning, Control
Source	Google_deepmind_Dm_control
Last Updated	2026-02-15 04:00 GMT

Overview

A concrete dm_control Control Suite environment in which a 2D point mass must be moved to a target location.

Description

The Point Mass domain models a simple 2D point mass that must navigate to a fixed target. The Physics subclass provides methods for computing the vector from the mass to the target in global coordinates (mass_to_target) and the scalar distance to the target (mass_to_target_dist).
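The relationship between the two helper quantities can be sketched in plain Python. This is only an illustration of the geometry, not the library code: the function names vector_to_target and distance_to_target are hypothetical, and the positions are assumed to be given as coordinate tuples in the global (world) frame.

```python
import math


def vector_to_target(mass_pos, target_pos):
    """Component-wise difference target - mass, in global coordinates."""
    return tuple(t - m for m, t in zip(mass_pos, target_pos))


def distance_to_target(mass_pos, target_pos):
    """Euclidean norm of the mass-to-target vector."""
    vec = vector_to_target(mass_pos, target_pos)
    return math.sqrt(sum(c * c for c in vec))


# Hypothetical positions, chosen only for illustration.
mass = (1, 2, 0)
target = (4, 6, 0)
vec = vector_to_target(mass, target)
dist = distance_to_target(mass, target)
```

The scalar distance is simply the norm of the vector, so the two values always agree with each other.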

Two tasks are registered: easy, which carries the benchmarking and easy tags, and hard. Both use the PointMass task class, parameterized by randomize_gains. In the easy variant (randomize_gains=False), the two control dimensions correspond directly to forces along the x and y axes. In the hard variant (randomize_gains=True), the actuator gain directions are randomized at the start of each episode, so each control input actuates a random linear combination of the two joints and the agent must work out the control-to-motion mapping anew in every episode. The hard task ensures the two randomized directions are not too parallel: the second direction is resampled until the absolute value of its dot product with the first is at most 0.9.
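The gain randomization of the hard variant can be sketched as rejection sampling over unit vectors. This is a minimal pure-Python sketch under stated assumptions, not the dm_control code: the helper names, the use of Python's random module, and the seed are all hypothetical, and the real task writes the sampled directions into the MuJoCo model rather than computing forces directly.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a 2D direction by normalizing a standard Gaussian sample."""
    while True:
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        norm = math.hypot(x, y)
        if norm > 0.0:  # guard against an all-zero draw
            return (x / norm, y / norm)


def sample_gain_directions(rng, max_abs_dot=0.9):
    """Resample the second direction until |dir1 . dir2| <= max_abs_dot."""
    dir1 = random_unit_vector(rng)
    while True:
        dir2 = random_unit_vector(rng)
        dot = dir1[0] * dir2[0] + dir1[1] * dir2[1]
        if abs(dot) <= max_abs_dot:
            return dir1, dir2


def control_to_force(control, dir1, dir2):
    """Map a 2D control input to an (x, y) force as a linear combination."""
    u1, u2 = control
    return (u1 * dir1[0] + u2 * dir2[0], u1 * dir1[1] + u2 * dir2[1])


rng = random.Random(0)  # hypothetical seed, used only for reproducibility
dir1, dir2 = sample_gain_directions(rng)
force = control_to_force((1.0, 0.0), dir1, dir2)  # pushes along dir1
```

With dir1 = (1, 0) and dir2 = (0, 1) the mapping reduces to the easy variant, where each control dimension drives one axis. Rejecting nearly parallel pairs keeps the two controls from collapsing onto a single direction, which would leave part of the plane almost unreachable.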

The reward multiplies a proximity term (a tolerance function whose bounds are set by the target size) by a small-control term that mildly penalizes large actions. Observations consist of the position and velocity of the point mass. The default time limit is 20 seconds.
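The reward structure described above can be sketched in plain Python. This is a simplified illustration, not the library implementation: the Gaussian and quadratic shaping, the value_at_margin default, the control_weight of 0.2, and the target size used in the example are assumptions made for this sketch, and the actual shaping functions and constants in dm_control may differ.

```python
import math


def tolerance(x, lower, upper, margin, value_at_margin=0.1):
    """Return 1.0 inside [lower, upper], decaying as a Gaussian outside."""
    if lower <= x <= upper:
        return 1.0
    if margin == 0:
        return 0.0
    # Distance outside the bounds, measured in units of `margin`.
    d = (lower - x if x < lower else x - upper) / margin
    scale = math.sqrt(-2.0 * math.log(value_at_margin))
    return math.exp(-0.5 * (d * scale) ** 2)


def control_term(control, margin=1.0):
    """Mean quadratic score per control: 1 at zero, 0 once |u| >= margin."""
    scores = [max(0.0, 1.0 - (u / margin) ** 2) for u in control]
    return sum(scores) / len(scores)


def reward(dist_to_target, target_size, control, control_weight=0.2):
    """Proximity term multiplied by a mild small-control term."""
    near_target = tolerance(dist_to_target, 0.0, target_size,
                            margin=target_size)
    # Assumed weighting: control effort can remove at most `control_weight`.
    small_control = (1.0 - control_weight) + control_weight * control_term(control)
    return near_target * small_control


# Hypothetical target size of 0.05, chosen only for illustration.
at_target_idle = reward(0.0, 0.05, (0.0, 0.0))
at_target_full_effort = reward(0.0, 0.05, (1.0, -1.0))
```

Because the two terms are multiplied, the agent earns nothing for small controls alone: the control term only scales down a reward that proximity to the target has already earned.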

Usage

Use this implementation for simple 2D navigation benchmarks or as an introductory RL task. Load via suite.load(domain_name='point_mass', task_name='easy') or suite.load(domain_name='point_mass', task_name='hard').

Code Reference

Source Location

Repository: Google_deepmind_Dm_control
File: dm_control/suite/point_mass.py
Lines: 1-126

Signature

# Task factory functions
def easy(time_limit=20, random=None, environment_kwargs=None)
def hard(time_limit=20, random=None, environment_kwargs=None)

# Physics subclass
class Physics(mujoco.Physics):
    def mass_to_target(self)        # vector from mass to target
    def mass_to_target_dist(self)   # scalar distance from mass to target

# Task class
class PointMass(base.Task):
    def __init__(self, randomize_gains, random=None)
    def initialize_episode(self, physics)
    def get_observation(self, physics)
    def get_reward(self, physics)

Import

from dm_control import suite

env = suite.load(domain_name='point_mass', task_name='easy')

I/O Contract

Inputs

Name	Type	Required	Description
`time_limit`	float	No	Maximum episode duration in seconds (default 20).
`random`	int, numpy.random.RandomState, or None	No	Random seed or RNG instance for reproducibility.
`environment_kwargs`	dict or None	No	Additional keyword arguments forwarded to the `Environment` constructor.

Outputs

Name	Type	Description
environment	`dm_control.rl.control.Environment`	A fully initialised environment conforming to the `dm_env.Environment` interface.

Observations

Key	Type	Description
`position`	numpy array	Generalized positions of the point mass.
`velocity`	numpy array	Generalized velocities of the point mass.

Usage Examples

from dm_control import suite

# Load the easy point mass task
env = suite.load(domain_name='point_mass', task_name='easy')

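# Run one episode with a constant action. generate_value() returns a
# fixed value that conforms to the action spec, not a random sample.
action_spec = env.action_spec()
time_step = env.reset()
while not time_step.last():
    action = action_spec.generate_value()
    time_step = env.step(action)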

# Load the hard variant with randomized actuator gains
env_hard = suite.load(domain_name='point_mass', task_name='hard')

Related Pages

Principle:Google_deepmind_Dm_control_Control_Suite_Environment_Loading
