Implementation:Google deepmind Dm control Suite Quadruped

Knowledge Sources	Google_deepmind_Dm_control
Domains	Reinforcement Learning, Robotics, Locomotion
Last Updated	2026-02-15 04:00 GMT

Overview

This module defines the Quadruped domain, an ant-like four-legged robot with four tasks: walk, run, escape (bowl-shaped terrain), and fetch (bring ball to target).

Description

The Quadruped domain provides a versatile four-legged robot model with tasks ranging from simple locomotion to terrain traversal and object interaction. The make_model() function procedurally configures the quadruped.xml MJCF model by adjusting floor size and selectively including or removing terrain heightfields, rangefinder sensors, walls, and the ball/target elements.

The Physics class extends mujoco.Physics with a comprehensive set of observation methods. These include torso upright dot-product, torso velocity in the local frame, egocentric state (hinge joint positions, velocities, and actuator states), toe positions in the torso frame, force/torque sensor readings (scaled with arcsinh), IMU data (gyro and accelerometer), rangefinder readings (scaled with tanh, with -1 mapped to 1 for no-intersection), ball state relative to the torso, and target position in the torso frame. Sensor names are cached for performance.

Four task classes implement distinct challenges. Move rewards upright posture multiplied by forward velocity. Escape generates procedural terrain using a sinusoidal bowl shape multiplied by random smooth bumps (via scipy ndimage.zoom), rewards distance from origin, and uses heightfield data uploaded to the GPU. Fetch rewards reaching the ball combined with bringing it to the target, modulated by uprightness. A helper function _find_non_contacting_height places the robot at a collision-free height.

Usage

Use this module for quadruped locomotion benchmarks with diverse task complexities. The walk and run tasks test forward locomotion at different speeds, escape tests terrain traversal on procedurally generated heightfields, and fetch tests object interaction. Each task factory returns a configured control.Environment.

Code Reference

Source Location

Repository: Google_deepmind_Dm_control
File: dm_control/suite/quadruped.py
Lines: 1-477

Signature

def make_model(floor_size=None, terrain=False, rangefinders=False,
               walls_and_ball=False):
    """Returns the model XML string."""

def walk(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Walk task."""

def run(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Run task."""

def escape(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Escape task."""

def fetch(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Fetch task."""

class Physics(mujoco.Physics):
    """Physics simulation with additional features for the Quadruped domain."""
    def torso_upright(self): ...
    def torso_velocity(self): ...
    def egocentric_state(self): ...
    def toe_positions(self): ...
    def force_torque(self): ...
    def imu(self): ...
    def rangefinder(self): ...
    def origin_distance(self): ...
    def origin(self): ...
    def ball_state(self): ...
    def target_position(self): ...
    def ball_to_target_distance(self): ...
    def self_to_ball_distance(self): ...

class Move(base.Task):
    def __init__(self, desired_speed, random=None): ...
    def initialize_episode(self, physics): ...
    def get_observation(self, physics): ...
    def get_reward(self, physics): ...

class Escape(base.Task):
    def initialize_episode(self, physics): ...
    def get_observation(self, physics): ...
    def get_reward(self, physics): ...

class Fetch(base.Task):
    def initialize_episode(self, physics): ...
    def get_observation(self, physics): ...
    def get_reward(self, physics): ...

Import

from dm_control.suite import quadruped

I/O Contract

Inputs

Name	Type	Required	Description
time_limit	float	No	Episode time limit in seconds (default 20)
random	int or numpy.random.RandomState	No	Random seed or RNG instance
environment_kwargs	dict	No	Additional keyword arguments for the Environment constructor
desired_speed	float	Yes (Move)	Target horizontal velocity (walk=0.5, run=5)
floor_size	float	No	Size of the floor geom (default varies by task)
terrain	bool	No	Whether to include heightfield terrain (default False)
rangefinders	bool	No	Whether to include rangefinder sensors (default False)
walls_and_ball	bool	No	Whether to include walls, ball, and target (default False)

Outputs

Name	Type	Description
environment	control.Environment	A configured RL environment with the Quadruped physics and task
observations	OrderedDict	Contains egocentric_state, torso_velocity, torso_upright, imu, force_torque (plus origin/rangefinder for Escape, ball_state/target_position for Fetch)
reward	float	Product of upright reward and task-specific component (velocity, escape distance, or reach-and-fetch)

Usage Examples

Basic Usage

from dm_control import suite

# Load the quadruped walk task
env = suite.load(domain_name='quadruped', task_name='walk')

time_step = env.reset()
action = env.action_spec().generate_value()
time_step = env.step(action)

# Access observations
ego_state = time_step.observation['egocentric_state']
torso_vel = time_step.observation['torso_velocity']

Escape Task with Terrain

from dm_control import suite

# Load the escape task (includes procedural terrain and rangefinders)
env = suite.load(domain_name='quadruped', task_name='escape')

time_step = env.reset()
origin = time_step.observation['origin']
rangefinder = time_step.observation['rangefinder']

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment