Implementation:Google deepmind Dm control Suite Ball In Cup
| Metadata | Value |
|---|---|
| Implementation | Suite Ball In Cup |
| Domain | Reinforcement_Learning, Control |
| Source | Google_deepmind_Dm_control |
| Last Updated | 2026-02-15 04:00 GMT |
Overview
Concrete tool for the ball-in-cup catching task provided by the dm_control Control Suite.
Description
The Ball-in-Cup domain simulates an actuated planar cup that translates in the vertical (x-z) plane, with a ball attached to the cup by a string. The objective is to swing the ball upward so that it lands inside the cup. The domain defines a custom Physics subclass that provides two helper methods: ball_to_target, which computes the 2D (x, z) vector from the ball to the target location inside the cup, and in_target, which returns 1.0 if the ball is geometrically inside the target and 0.0 otherwise.
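The containment test behind in_target can be illustrated without a simulator. The following is a minimal sketch, assuming the target is an axis-aligned box described by its half-extents in x and z and the ball is a sphere of known radius. The function arguments and the numbers are illustrative and are not read from the model file.

```python
import numpy as np


def ball_to_target(target_xz, ball_xz):
    """Vector from the ball centre to the target centre in the x-z plane."""
    return np.asarray(target_xz, dtype=float) - np.asarray(ball_xz, dtype=float)


def in_target(target_xz, ball_xz, target_half_size_xz, ball_radius):
    """Return 1.0 if the whole ball lies inside the target box, else 0.0."""
    offset = np.abs(ball_to_target(target_xz, ball_xz))
    # Shrink the box by the ball radius so the entire ball must fit inside.
    margin = np.asarray(target_half_size_xz, dtype=float) - ball_radius
    return float(np.all(offset < margin))


# Illustrative numbers only (assumed, not taken from the model XML).
inside = in_target([0.0, 0.3], [0.01, 0.31], [0.05, 0.05], 0.025)
outside = in_target([0.0, 0.3], [0.04, 0.31], [0.05, 0.05], 0.025)
```

Subtracting the ball radius from the half-extents is what makes the check require the ball to be fully inside rather than merely overlapping the target.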
A single benchmark task, catch, is registered with the tags 'benchmarking' and 'easy'. The BallInCup task class initializes each episode by placing the ball at a random collision-free position (x in [-0.2, 0.2], z in [0.2, 0.5]) and returns observations containing joint positions and velocities. The reward is sparse: it returns 1 when the ball is fully inside the target cup and 0 otherwise.
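The collision-free initialization described above amounts to rejection sampling. The sketch below shows the idea under stated assumptions: is_colliding is a hypothetical predicate standing in for the physics engine's contact check, toy_collision_check is an invented placeholder, and max_tries is a safety cap added here for illustration.

```python
import numpy as np


def sample_ball_position(rng, is_colliding, max_tries=1000):
    """Draw (x, z) uniformly from the stated ranges until no collision is reported."""
    for _ in range(max_tries):
        x = rng.uniform(-0.2, 0.2)
        z = rng.uniform(0.2, 0.5)
        if not is_colliding(x, z):
            return x, z
    raise RuntimeError("No collision-free position found within max_tries.")


def toy_collision_check(x, z):
    # Hypothetical stand-in: treat a small box around (0, 0.3) as occupied.
    return abs(x) < 0.05 and abs(z - 0.3) < 0.05


rng = np.random.RandomState(0)
x, z = sample_ball_position(rng, toy_collision_check)
```

In the real environment the collision decision would come from the simulator after the candidate position has been written into the state, not from a geometric shortcut like the one used here.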
The environment uses a default time limit of 20 seconds and a control timestep of 0.02 seconds.
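These two numbers fix the episode length in agent steps. The short calculation below is only arithmetic on the stated defaults; the constant names are illustrative, and the return bound assumes the per-step reward never exceeds 1, as described for the sparse reward.

```python
TIME_LIMIT_S = 20.0        # default time limit stated above
CONTROL_TIMESTEP_S = 0.02  # default control timestep stated above

# Number of agent actions in one full-length episode.
steps_per_episode = round(TIME_LIMIT_S / CONTROL_TIMESTEP_S)

# Rate at which the agent is queried for actions.
control_rate_hz = round(1.0 / CONTROL_TIMESTEP_S)

# With a per-step reward of at most 1, the undiscounted return is bounded by this.
max_return = steps_per_episode * 1.0
```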
Usage
Use this implementation for a classic sparse-reward manipulation benchmark. Load it via suite.load(domain_name='ball_in_cup', task_name='catch').
Code Reference
Source Location
- Repository: Google_deepmind_Dm_control
- File: dm_control/suite/ball_in_cup.py
- Lines: 1-96
Signature
# Task factory function
def catch(time_limit=20, random=None, environment_kwargs=None)

# Physics subclass
class Physics(mujoco.Physics):
    def ball_to_target(self)  # vector from ball to target (x, z)
    def in_target(self)       # 1 if ball is inside target, 0 otherwise

# Task class
class BallInCup(base.Task):
    def initialize_episode(self, physics)
    def get_observation(self, physics)
    def get_reward(self, physics)
Import
from dm_control import suite
env = suite.load(domain_name='ball_in_cup', task_name='catch')
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| time_limit | float | No | Maximum episode duration in seconds (default 20). |
| random | int, numpy.random.RandomState, or None | No | Random seed or RNG instance for reproducibility. |
| environment_kwargs | dict or None | No | Additional keyword arguments forwarded to the Environment constructor. |
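When loading through suite.load, these inputs are typically supplied via the task_kwargs and environment_kwargs arguments rather than by calling catch directly. The sketch below wraps that call in a hypothetical helper, make_ball_in_cup; the chosen seed, the shortened time limit, and the flat_observation setting are illustrative. The import is deferred into the function body so the snippet can be imported on a machine without dm_control installed.

```python
def make_ball_in_cup(seed=0, time_limit=10.0, flat_observation=True):
    """Build the catch task with a fixed seed and a custom time limit."""
    # Imported lazily so that defining this helper does not require dm_control.
    from dm_control import suite

    return suite.load(
        domain_name='ball_in_cup',
        task_name='catch',
        # Forwarded to the task factory (time_limit, random).
        task_kwargs={'time_limit': time_limit, 'random': seed},
        # Forwarded to the Environment constructor.
        environment_kwargs={'flat_observation': flat_observation},
    )
```

Calling make_ball_in_cup() requires dm_control and a working MuJoCo installation.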
Outputs
| Name | Type | Description |
|---|---|---|
| environment | dm_control.rl.control.Environment | A fully initialised environment conforming to the dm_env.Environment interface. |
Observations
| Key | Type | Description |
|---|---|---|
| position | numpy array | Generalized positions of the system, covering both the cup and the ball. |
| velocity | numpy array | Generalized velocities of the system, covering both the cup and the ball. |
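Many agents expect a single flat vector rather than a dictionary of arrays. A minimal sketch of that conversion follows, using a mock observation in place of a real time step. The 4-element shapes are an assumption (two cup coordinates plus two ball coordinates per entry), and flatten_observation is an illustrative helper, not part of the library.

```python
import collections

import numpy as np


def flatten_observation(observation):
    """Concatenate all observation entries into one 1-D array, in key order."""
    return np.concatenate([np.ravel(value) for value in observation.values()])


# Mock observation with assumed shapes; a real one would come from time_step.observation.
mock_obs = collections.OrderedDict(position=np.zeros(4), velocity=np.ones(4))
flat = flatten_observation(mock_obs)
```

Keeping the dictionary ordered matters here, since the agent sees only positions in the flat vector and not the original keys.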
Usage Examples
from dm_control import suite
# Load the ball-in-cup catch task
env = suite.load(domain_name='ball_in_cup', task_name='catch')
# Run an episode
time_step = env.reset()
while not time_step.last():
    action = env.action_spec().generate_value()
    time_step = env.step(action)
    print(f"Reward: {time_step.reward}")
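The loop above feeds the same spec-generated action at every step, which is enough to exercise the environment but explores nothing. A uniformly random policy is a common baseline instead. The sketch below samples within given bounds; the two-dimensional [-1, 1] bounds are an assumption used for illustration, and with a real environment the minimum and maximum attributes of env.action_spec() would be passed in their place.

```python
import numpy as np


def sample_uniform_action(minimum, maximum, rng):
    """Draw one action uniformly between per-dimension lower and upper bounds."""
    minimum = np.asarray(minimum, dtype=float)
    maximum = np.asarray(maximum, dtype=float)
    return rng.uniform(minimum, maximum)


rng = np.random.RandomState(0)
# Assumed bounds; replace with spec.minimum and spec.maximum from env.action_spec().
action = sample_uniform_action([-1.0, -1.0], [1.0, 1.0], rng)
```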