Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Google deepmind Dm control Suite Quadruped

From Leeroopedia
Revision as of 12:44, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Google_deepmind_Dm_control_Suite_Quadruped.md)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Knowledge Sources
Domains Reinforcement Learning, Robotics, Locomotion
Last Updated 2026-02-15 04:00 GMT

Overview

This module defines the Quadruped domain, an ant-like four-legged robot with four tasks: walk, run, escape (bowl-shaped terrain), and fetch (bring ball to target).

Description

The Quadruped domain provides a versatile four-legged robot model with tasks ranging from simple locomotion to terrain traversal and object interaction. The make_model() function procedurally configures the quadruped.xml MJCF model by adjusting floor size and selectively including or removing terrain heightfields, rangefinder sensors, walls, and the ball/target elements.

The Physics class extends mujoco.Physics with a comprehensive set of observation methods. These include torso upright dot-product, torso velocity in the local frame, egocentric state (hinge joint positions, velocities, and actuator states), toe positions in the torso frame, force/torque sensor readings (scaled with arcsinh), IMU data (gyro and accelerometer), rangefinder readings (scaled with tanh, with -1 mapped to 1 for no-intersection), ball state relative to the torso, and target position in the torso frame. Sensor names are cached for performance.

Four task classes implement distinct challenges. Move rewards upright posture multiplied by forward velocity. Escape generates procedural terrain using a sinusoidal bowl shape multiplied by random smooth bumps (via scipy ndimage.zoom), rewards distance from origin, and uses heightfield data uploaded to the GPU. Fetch rewards reaching the ball combined with bringing it to the target, modulated by uprightness. A helper function _find_non_contacting_height places the robot at a collision-free height.

Usage

Use this module for quadruped locomotion benchmarks with diverse task complexities. The walk and run tasks test forward locomotion at different speeds, escape tests terrain traversal on procedurally generated heightfields, and fetch tests object interaction. Each task factory returns a configured control.Environment.

Code Reference

Source Location

Signature

def make_model(floor_size=None, terrain=False, rangefinders=False,
               walls_and_ball=False):
    """Returns the model XML string."""

def walk(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Walk task."""

def run(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Run task."""

def escape(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Escape task."""

def fetch(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Fetch task."""

class Physics(mujoco.Physics):
    """Physics simulation with additional features for the Quadruped domain."""
    def torso_upright(self): ...
    def torso_velocity(self): ...
    def egocentric_state(self): ...
    def toe_positions(self): ...
    def force_torque(self): ...
    def imu(self): ...
    def rangefinder(self): ...
    def origin_distance(self): ...
    def origin(self): ...
    def ball_state(self): ...
    def target_position(self): ...
    def ball_to_target_distance(self): ...
    def self_to_ball_distance(self): ...

class Move(base.Task):
    def __init__(self, desired_speed, random=None): ...
    def initialize_episode(self, physics): ...
    def get_observation(self, physics): ...
    def get_reward(self, physics): ...

class Escape(base.Task):
    def initialize_episode(self, physics): ...
    def get_observation(self, physics): ...
    def get_reward(self, physics): ...

class Fetch(base.Task):
    def initialize_episode(self, physics): ...
    def get_observation(self, physics): ...
    def get_reward(self, physics): ...

Import

from dm_control.suite import quadruped

I/O Contract

Inputs

Name Type Required Description
time_limit float No Episode time limit in seconds (default 20)
random int or numpy.random.RandomState No Random seed or RNG instance
environment_kwargs dict No Additional keyword arguments for the Environment constructor
desired_speed float Yes (Move) Target horizontal velocity (walk=0.5, run=5)
floor_size float No Size of the floor geom (default varies by task)
terrain bool No Whether to include heightfield terrain (default False)
rangefinders bool No Whether to include rangefinder sensors (default False)
walls_and_ball bool No Whether to include walls, ball, and target (default False)

Outputs

Name Type Description
environment control.Environment A configured RL environment with the Quadruped physics and task
observations OrderedDict Contains egocentric_state, torso_velocity, torso_upright, imu, force_torque (plus origin/rangefinder for Escape, ball_state/target_position for Fetch)
reward float Product of upright reward and task-specific component (velocity, escape distance, or reach-and-fetch)

Usage Examples

Basic Usage

from dm_control import suite

# Load the quadruped walk task
env = suite.load(domain_name='quadruped', task_name='walk')

time_step = env.reset()
action = env.action_spec().generate_value()
time_step = env.step(action)

# Access observations
ego_state = time_step.observation['egocentric_state']
torso_vel = time_step.observation['torso_velocity']

Escape Task with Terrain

from dm_control import suite

# Load the escape task (includes procedural terrain and rangefinders)
env = suite.load(domain_name='quadruped', task_name='escape')

time_step = env.reset()
origin = time_step.observation['origin']
rangefinder = time_step.observation['rangefinder']

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment