Implementation:Google deepmind Dm control Locomotion Tasks

From Leeroopedia
Metadata
Knowledge Sources: dm_control
Domains: Reinforcement Learning, Robotics, Locomotion
Last Updated: 2026-02-15 00:00 GMT

Overview

A concrete tool for defining locomotion objectives in the dm_control library. It provides task classes for corridor running, target reaching, terrain escape, and maze navigation, each with configurable reward functions and termination conditions.

Description

The locomotion tasks module provides several task classes, each of which combines a walker and an arena into a complete MDP with a reward function, episode initialization, and termination logic. Each task class extends composer.Task, either directly or (for ManyGoalsMaze) through the ManyHeterogeneousGoalsMaze base class, and implements the required lifecycle methods.

The available tasks include:

  • RunThroughCorridor: Rewards the walker for maintaining a target forward velocity while traversing a corridor. Terminates on disallowed contacts (non-foot geoms touching ground) or when end effectors drop below a height threshold.
  • GoToTarget: Rewards the walker for reaching a target position on a floor arena. Supports moving targets that relocate after the walker reaches them, and relative target positioning.
  • Escape: Rewards the walker for moving away from the center of a bowl-shaped terrain while remaining upright. Combines an escape distance reward with an upright reward. Initializes the walker at a random orientation and finds a non-contacting height.
  • ManyGoalsMaze: Places multiple target spheres at maze target positions and rewards the walker for collecting all of them. Terminates when all targets are activated or when the walker falls.
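To make the RunThroughCorridor objective concrete, the following is a minimal sketch of a velocity-tracking reward. It assumes a linear falloff that reaches zero once the forward velocity deviates from the target by the target speed itself. The function name velocity_reward and the exact shaping are illustrative assumptions, not code taken from dm_control.

```python
def velocity_reward(forward_velocity, target_velocity=3.0):
    """Hypothetical shaping: 1.0 at the target speed, falling linearly to 0.0."""
    if target_velocity <= 0:
        raise ValueError("target_velocity must be positive")
    # Deviation from the target, normalized by the target speed (assumed margin).
    deviation = abs(forward_velocity - target_velocity) / target_velocity
    return max(0.0, 1.0 - deviation)


for v in (0.0, 1.5, 3.0, 4.5, 6.0):
    print(f"velocity={v:.1f} m/s -> reward={velocity_reward(v):.2f}")
```

Under this assumed shaping, running too fast is penalized as much as running too slowly, so the walker is encouraged to hold the target speed rather than maximize it.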

Usage

Use these tasks to define what the walker must accomplish in a given arena. RunThroughCorridor pairs with corridor arenas, GoToTarget pairs with Floor arenas, Escape pairs with Bowl arenas, and ManyGoalsMaze pairs with RandomMazeWithTargets arenas.

Code Reference

Source Location

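Class | File | Lines
RunThroughCorridor | dm_control/locomotion/tasks/corridors.py | L31-158
GoToTarget | dm_control/locomotion/tasks/go_to_target.py | L33-217
Escape | dm_control/locomotion/tasks/escape.py | L29-184
ManyGoalsMaze | dm_control/locomotion/tasks/random_goal_maze.py | L446-475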

Signature

class RunThroughCorridor(composer.Task):
    def __init__(self, walker, arena,
                 walker_spawn_position=(0, 0, 0),
                 walker_spawn_rotation=None,
                 target_velocity=3.0,
                 contact_termination=True,
                 terminate_at_height=-0.5,
                 physics_timestep=0.005,
                 control_timestep=0.025):
        ...

class GoToTarget(composer.Task):
    def __init__(self, walker, arena,
                 moving_target=False,
                 target_relative=False,
                 target_relative_dist=1.5,
                 steps_before_moving_target=10,
                 distance_tolerance=0.5,
                 target_spawn_position=None,
                 walker_spawn_position=None,
                 walker_spawn_rotation=None,
                 physics_timestep=0.005,
                 control_timestep=0.025):
        ...

class Escape(composer.Task):
    def __init__(self, walker, arena,
                 walker_spawn_position=(0, 0, 0),
                 walker_spawn_rotation=None,
                 physics_timestep=0.005,
                 control_timestep=0.025):
        ...

class ManyGoalsMaze(ManyHeterogeneousGoalsMaze):
    def __init__(self, walker, maze_arena, target_builder,
                 target_reward_scale=1.0,
                 randomize_spawn_position=True,
                 randomize_spawn_rotation=True,
                 rotation_bias_factor=0,
                 aliveness_reward=0.0,
                 aliveness_threshold=-0.5,
                 contact_termination=True,
                 physics_timestep=0.001,
                 control_timestep=0.025):
        ...

Import

from dm_control.locomotion.tasks import corridors
from dm_control.locomotion.tasks import go_to_target
from dm_control.locomotion.tasks import escape
from dm_control.locomotion.tasks import random_goal_maze

I/O Contract

Inputs

Parameter | Type | Description
walker | Walker instance | The walker entity to control in this task.
arena | Arena instance | The arena providing the physical terrain. ManyGoalsMaze takes this argument as maze_arena.
target_velocity | float | (RunThroughCorridor) Target forward velocity in m/s. Default 3.0.
contact_termination | bool | (RunThroughCorridor, ManyGoalsMaze) Whether to terminate on disallowed ground contact. Default True.
terminate_at_height | float | (RunThroughCorridor) Height below which end effectors trigger termination. Default -0.5.
moving_target | bool | (GoToTarget) Whether the target relocates after being reached. Default False.
distance_tolerance | float | (GoToTarget) Distance threshold for considering the target reached. Default 0.5.
target_builder | callable | (ManyGoalsMaze) Factory function that creates target props (e.g., functools.partial(target_sphere.TargetSphere, ...)).
target_reward_scale | float | (ManyGoalsMaze) Reward given when a target is collected. Default 1.0.
physics_timestep | float | Physics simulation timestep in seconds. Default 0.005 (0.001 for ManyGoalsMaze).
control_timestep | float | Agent control timestep in seconds. Default 0.025.
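The two timestep parameters are related: each agent action is held for control_timestep / physics_timestep physics substeps. The sketch below computes that count and rejects pairs where the control timestep is not a whole multiple of the physics timestep. The helper name substeps_per_control_step and the tolerance value are hypothetical and are not part of dm_control.

```python
def substeps_per_control_step(control_timestep, physics_timestep, tol=1e-6):
    """Number of physics steps simulated per agent action (illustrative helper)."""
    if physics_timestep <= 0 or control_timestep < physics_timestep:
        raise ValueError("expected 0 < physics_timestep <= control_timestep")
    ratio = control_timestep / physics_timestep
    n_steps = round(ratio)
    # Guard against floating-point noise while still rejecting real mismatches.
    if abs(ratio - n_steps) > tol:
        raise ValueError("control_timestep must be a multiple of physics_timestep")
    return n_steps


print(substeps_per_control_step(0.025, 0.005))  # default timesteps listed above
print(substeps_per_control_step(0.03, 0.005))   # timesteps used in the examples
print(substeps_per_control_step(0.025, 0.001))  # ManyGoalsMaze defaults
```

A smaller physics timestep gives a more stable simulation at the cost of more substeps per action, which is one reason to check this ratio before launching long training runs.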

Outputs (MDP Signals)

Method | Return Type | Description
get_reward(physics) | float | Scalar reward for the current step.
get_discount(physics) | float | Discount factor: 1.0 for continuing, 0.0 on termination.
should_terminate_episode(physics) | bool | Whether the episode should end.
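One way to see how these signals fit together: an agent typically multiplies the environment's discount into its own discount factor, so a discount of 0.0 at termination removes any contribution from later steps. The sketch below accumulates a discounted return from (reward, discount) pairs. The function name discounted_return and the agent-side gamma are assumptions made for illustration.

```python
def discounted_return(steps, gamma=0.99):
    """Sums rewards from (reward, discount) pairs given in time order."""
    total = 0.0
    weight = 1.0
    for reward, discount in steps:
        total += weight * reward
        # A task discount of 0.0 (termination) zeroes every later contribution.
        weight *= gamma * discount
    return total


# The third step follows a terminating discount of 0.0, so it adds nothing.
trajectory = [(1.0, 1.0), (1.0, 0.0), (5.0, 1.0)]
print(discounted_return(trajectory, gamma=0.5))
```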

Usage Examples

Corridor running task with a CMU Humanoid:

from dm_control.locomotion.walkers import cmu_humanoid
from dm_control.locomotion.arenas import corridors as corr_arenas
from dm_control.locomotion.tasks import corridors as corr_tasks
from dm_control import composer

walker = cmu_humanoid.CMUHumanoidPositionControlled(
    observable_options={'egocentric_camera': dict(enabled=True)})

arena = corr_arenas.GapsCorridor(
    platform_length=1.0, gap_length=1.5,
    corridor_width=10, corridor_length=100)

task = corr_tasks.RunThroughCorridor(
    walker=walker, arena=arena,
    walker_spawn_position=(0.5, 0, 0),
    target_velocity=3.0,
    physics_timestep=0.005, control_timestep=0.03)

env = composer.Environment(task=task, time_limit=30)

Go-to-target task with a moving target:

from dm_control.locomotion.walkers import cmu_humanoid
from dm_control.locomotion.arenas import floors
from dm_control.locomotion.tasks import go_to_target

walker = cmu_humanoid.CMUHumanoidPositionControlled()
arena = floors.Floor(size=(8, 8))

task = go_to_target.GoToTarget(
    walker=walker, arena=arena,
    moving_target=True,
    steps_before_moving_target=10,
    physics_timestep=0.005, control_timestep=0.03)
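The moving-target behaviour can be pictured with a small stand-alone sketch. It assumes the reward is 1.0 whenever the walker is within distance_tolerance of the target, and that the target relocates once the walker has collected steps_before_moving_target rewarded steps. The class MovingTargetTracker, the arena_half_size parameter, and the uniform resampling rule are hypothetical and do not use dm_control.

```python
import math
import random


class MovingTargetTracker:
    """Hypothetical stand-in for the reach-and-relocate bookkeeping."""

    def __init__(self, target_xy, distance_tolerance=0.5,
                 steps_before_moving_target=10, arena_half_size=8.0, seed=0):
        self.target_xy = target_xy
        self.distance_tolerance = distance_tolerance
        self.steps_before_moving_target = steps_before_moving_target
        self.arena_half_size = arena_half_size
        self._rng = random.Random(seed)
        self._rewarded_steps = 0

    def step(self, walker_xy):
        """Returns this step's reward and relocates the target when due."""
        dx = walker_xy[0] - self.target_xy[0]
        dy = walker_xy[1] - self.target_xy[1]
        reward = 0.0
        if math.hypot(dx, dy) < self.distance_tolerance:
            reward = 1.0
            self._rewarded_steps += 1
        if self._rewarded_steps >= self.steps_before_moving_target:
            # Assumed relocation rule: uniform sample inside the arena bounds.
            half = self.arena_half_size
            self.target_xy = (self._rng.uniform(-half, half),
                              self._rng.uniform(-half, half))
            self._rewarded_steps = 0
        return reward


tracker = MovingTargetTracker(target_xy=(1.0, 0.0), steps_before_moving_target=3)
rewards = [tracker.step((1.0, 0.2)) for _ in range(3)]
print(rewards, tracker.target_xy)
```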

Maze foraging task with multiple targets:

import functools
from dm_control.locomotion.walkers import cmu_humanoid
from dm_control.locomotion.arenas import mazes
from dm_control.locomotion.tasks import random_goal_maze
from dm_control.locomotion.props import target_sphere

walker = cmu_humanoid.CMUHumanoidPositionControlled(
    observable_options={'egocentric_camera': dict(enabled=True)})

arena = mazes.RandomMazeWithTargets(
    x_cells=11, y_cells=11, xy_scale=3,
    max_rooms=4, room_min_size=4, room_max_size=5,
    spawns_per_room=1, targets_per_room=3)

task = random_goal_maze.ManyGoalsMaze(
    walker=walker, maze_arena=arena,
    target_builder=functools.partial(
        target_sphere.TargetSphere,
        radius=0.4, rgb1=(0, 0, 0.4), rgb2=(0, 0, 0.7)),
    target_reward_scale=50.0,
    physics_timestep=0.005, control_timestep=0.03)
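
Any of the tasks above can be wrapped in composer.Environment, as in the corridor example, and stepped like a standard dm_env environment. The helper below is a minimal sketch of such a loop. The names run_episode and policy are hypothetical, and the policy is left to the caller so the snippet depends only on the environment object it is given.

```python
def run_episode(env, policy, max_steps=1000):
    """Steps `env` with `policy` until the episode ends; returns total reward."""
    timestep = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        if timestep.last():
            break
        timestep = env.step(policy(timestep))
        # Treat a missing reward (None) as zero so the sum stays numeric.
        total_reward += timestep.reward or 0.0
    return total_reward


# Example with a uniform random policy (needs numpy and a dm_control env):
#   import numpy as np
#   spec = env.action_spec()
#   policy = lambda ts: np.random.uniform(spec.minimum, spec.maximum, spec.shape)
#   print(run_episode(env, policy))
```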
