Implementation:Google deepmind Dm control Suite Ball In Cup

Metadata	Value
Implementation	Suite Ball In Cup
Domain	Reinforcement_Learning, Control
Source	Google_deepmind_Dm_control
Last Updated	2026-02-15 04:00 GMT

Overview

Concrete tool for the ball-in-cup catching task provided by the dm_control Control Suite.

Description

The Ball-in-Cup domain simulates an actuated planar cup with a ball attached to it by a string. The objective is to swing the ball up so that it lands inside the cup. The domain defines a custom Physics subclass with two helper methods: ball_to_target, which returns the 2D (x, z) vector from the ball to the target inside the cup, and in_target, which returns 1.0 if the ball is geometrically inside the target and 0.0 otherwise.
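
The containment test behind in_target can be illustrated without the simulator. The following is a minimal standalone sketch, assuming the target is an axis-aligned box in the x-z plane whose half-extents are shrunk by the ball radius; the function arguments and all numeric values are illustrative and are not read from the model file.

```python
def ball_to_target(ball_xz, target_xz):
    """Vector from the ball centre to the target centre in the x-z plane."""
    return (target_xz[0] - ball_xz[0], target_xz[1] - ball_xz[1])


def in_target(ball_xz, target_xz, target_half_size, ball_radius):
    """Return 1.0 if the whole ball lies inside the target box, else 0.0.

    Assumption: the target is an axis-aligned box with half-extents
    `target_half_size`; shrinking it by the ball radius means the ball
    must fit entirely inside, not merely have its centre inside.
    """
    dx, dz = ball_to_target(ball_xz, target_xz)
    margin_x = target_half_size[0] - ball_radius
    margin_z = target_half_size[1] - ball_radius
    return float(abs(dx) < margin_x and abs(dz) < margin_z)


# Illustrative numbers only; they are not taken from the model file.
target = (0.0, 0.3)
half_size = (0.05, 0.05)
radius = 0.02

print(in_target((0.01, 0.31), target, half_size, radius))  # ball near the centre
print(in_target((0.04, 0.30), target, half_size, radius))  # ball overlaps the edge
```

In the suite itself these quantities come from the MuJoCo model through the Physics object rather than from explicit arguments.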

A single benchmark task, catch, is registered with the tags 'benchmarking' and 'easy'. The BallInCup task class initializes each episode by placing the ball at a random collision-free position (x in [-0.2, 0.2], z in [0.2, 0.5]) and returns observations containing joint positions and velocities. The reward is sparse: it returns 1 when the ball is fully inside the target cup and 0 otherwise.
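
The initialization and reward described above amount to rejection sampling followed by a binary signal. The sketch below is a simplified illustration, assuming a hypothetical `is_penetrating` callback in place of the simulator's contact check; the `max_tries` guard and the toy obstacle are additions for this example only.

```python
import random


def sample_ball_position(rng, is_penetrating, max_tries=1000):
    """Draw (x, z) uniformly until `is_penetrating` reports no contact."""
    for _ in range(max_tries):
        x = rng.uniform(-0.2, 0.2)  # x range stated for the task
        z = rng.uniform(0.2, 0.5)   # z range stated for the task
        if not is_penetrating(x, z):
            return x, z
    raise RuntimeError("no collision-free position found")


def sparse_reward(ball_in_target):
    """Binary reward: 1.0 when the ball is in the target, 0.0 otherwise."""
    return 1.0 if ball_in_target else 0.0


def toy_contact_check(x, z):
    """Hypothetical stand-in for a contact query; blocks a small region."""
    return abs(x) < 0.05 and z < 0.3


rng = random.Random(0)  # fixed seed so the draw is reproducible
x, z = sample_ball_position(rng, toy_contact_check)
print(x, z, sparse_reward(False))
```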

The environment uses a default time limit of 20 seconds and a control timestep of 0.02 seconds.
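
These two defaults fix the episode length in control steps. As a quick check, assuming an episode ends once time_limit / control_timestep steps have elapsed and that the reward is at most 1 per step, the arithmetic is:

```python
TIME_LIMIT_S = 20.0        # default time limit in seconds
CONTROL_TIMESTEP_S = 0.02  # control timestep in seconds

# round() guards against floating-point error in the division.
steps_per_episode = round(TIME_LIMIT_S / CONTROL_TIMESTEP_S)

# With a reward of at most 1 per step, the return cannot exceed the step count.
max_episode_return = steps_per_episode * 1.0

print(steps_per_episode)   # 1000
print(max_episode_return)  # 1000.0
```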

Usage

Use this implementation for a classic sparse-reward manipulation benchmark. Load it via suite.load(domain_name='ball_in_cup', task_name='catch').

Code Reference

Source Location

Repository: Google_deepmind_Dm_control
File: dm_control/suite/ball_in_cup.py
Lines: 1-96

Signature

# Task factory function
def catch(time_limit=20, random=None, environment_kwargs=None)

# Physics subclass
class Physics(mujoco.Physics):
    def ball_to_target(self)   # vector from ball to target (x, z)
    def in_target(self)        # 1 if ball is inside target, 0 otherwise

# Task class
class BallInCup(base.Task):
    def initialize_episode(self, physics)
    def get_observation(self, physics)
    def get_reward(self, physics)

Import

from dm_control import suite

env = suite.load(domain_name='ball_in_cup', task_name='catch')

I/O Contract

Inputs

Name	Type	Required	Description
`time_limit`	float	No	Maximum episode duration in seconds (default 20).
`random`	int, numpy.random.RandomState, or None	No	Random seed or RNG instance for reproducibility.
`environment_kwargs`	dict or None	No	Additional keyword arguments forwarded to the `Environment` constructor.

Outputs

Name	Type	Description
environment	`dm_control.rl.control.Environment`	A fully initialised environment conforming to the `dm_env.Environment` interface.

Observations

Key	Type	Description
`position`	numpy array	Generalized positions of the model's joints (cup and ball).
`velocity`	numpy array	Generalized velocities of the model's joints (cup and ball).
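
Many learning algorithms expect a single state vector rather than a dictionary. The following is a minimal sketch of concatenating the two entries in the table above; `flatten_observation` here is a local helper written for illustration, and the array length of 4 used in the placeholder observation is an assumption, not a value taken from the environment's observation spec.

```python
import collections

import numpy as np


def flatten_observation(observation):
    """Concatenate all observation arrays into one 1-D vector, in key order."""
    return np.concatenate([np.ravel(value) for value in observation.values()])


# Placeholder observation with the same keys as the table above.
# The length of 4 per entry is assumed for illustration only.
fake_observation = collections.OrderedDict(
    position=np.zeros(4),
    velocity=np.ones(4),
)

state = flatten_observation(fake_observation)
print(state.shape)
```

With a real environment, the same helper would be applied to time_step.observation.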

Usage Examples

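from dm_control import suite

# Load the ball-in-cup catch task
env = suite.load(domain_name='ball_in_cup', task_name='catch')

# The action spec does not change during an episode, so query it once.
action_spec = env.action_spec()

# Run one episode until the time limit is reached
time_step = env.reset()
while not time_step.last():
    # generate_value() returns a fixed placeholder action that conforms
    # to the spec; replace it with a policy's output in real use.
    action = action_spec.generate_value()
    time_step = env.step(action)
    print(f"Reward: {time_step.reward}")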

Related Pages

Principle:Google_deepmind_Dm_control_Control_Suite_Environment_Loading
