Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Google deepmind Dm control Suite Dog

From Leeroopedia
Knowledge Sources
Domains Reinforcement Learning, Robotics, Locomotion
Last Updated 2026-02-15 04:00 GMT

Overview

This module defines the Dog domain, a high-dimensional quadruped (dog) model with five locomotion and interaction tasks: stand, walk, trot, run, and fetch.

Description

The Dog domain provides a biologically-inspired canine model with a rich set of sensory observations and multiple task complexities. The module constructs the dog model from a base dog.xml MJCF file along with external mesh and texture assets from the dog_assets/ directory. The make_model() function configures the floor size and optionally removes the ball, target, and walls for locomotion-only tasks.

The Physics class extends mujoco.Physics with methods for measuring torso and pelvis height, body z-projections for uprightness, center-of-mass velocity in the torso frame, joint angles and velocities (filtering to hinge joints only), inertial sensor readings (accelerometer, velocimeter, gyro), touch and foot force sensors, and ball/target positions in the head frame for the fetch task.

Three task classes implement the reward structure. Stand rewards upright posture through a product of torso height, pelvis height, uprightness, and touch force factors. Move extends Stand with an additional forward velocity reward component. Fetch extends Stand with reach-ball (mouth-to-ball distance) and bring-to-target (ball-to-target distance) reward components. All tasks are registered with the no_reward_visualization tag.

Usage

Use this module when you need a high-dimensional locomotion benchmark with a realistic dog model. The five tasks span increasing difficulty from static standing to high-speed running and object interaction (fetch). Each task factory function returns a fully configured control.Environment ready for agent interaction.

Code Reference

Source Location

Signature

def make_model(floor_size, remove_ball):
    """Sets floor size, removes ball and walls (Stand and Move tasks)."""

def get_model_and_assets(floor_size=10, remove_ball=True):
    """Returns a tuple containing the model XML string and a dict of assets."""

def stand(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Stand task."""

def walk(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Walk task."""

def trot(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Trot task."""

def run(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Run task."""

def fetch(time_limit=_DEFAULT_TIME_LIMIT, random=None, environment_kwargs=None):
    """Returns the Fetch task."""

class Physics(mujoco.Physics):
    """Physics simulation with additional features for the Dog domain."""
    def torso_pelvis_height(self): ...
    def z_projection(self): ...
    def upright(self): ...
    def center_of_mass_velocity(self): ...
    def torso_com_velocity(self): ...
    def com_forward_velocity(self): ...
    def joint_angles(self): ...
    def joint_velocities(self): ...
    def inertial_sensors(self): ...
    def touch_sensors(self): ...
    def foot_forces(self): ...
    def ball_in_head_frame(self): ...
    def target_in_head_frame(self): ...
    def ball_to_mouth_distance(self): ...
    def ball_to_target_distance(self): ...

class Stand(base.Task):
    def __init__(self, random=None, observe_reward_factors=False): ...
    def initialize_episode(self, physics): ...
    def get_observation(self, physics): ...
    def get_reward(self, physics): ...

class Move(Stand):
    def __init__(self, move_speed, random, observe_reward_factors=False): ...
    def get_reward_factors(self, physics): ...

class Fetch(Stand):
    def __init__(self, random, observe_reward_factors=False): ...
    def initialize_episode(self, physics): ...
    def get_observation_components(self, physics): ...
    def get_reward_factors(self, physics): ...

Import

from dm_control.suite import dog

I/O Contract

Inputs

Name Type Required Description
time_limit float No Episode time limit in seconds (default 15)
random int or numpy.random.RandomState No Random seed or RNG instance
environment_kwargs dict No Additional keyword arguments for the Environment constructor
move_speed float Yes (Move) Target horizontal velocity for locomotion tasks (walk=1, trot=3, run=9)
observe_reward_factors bool No Whether to include factorized reward in observations (default False)

Outputs

Name Type Description
environment control.Environment A configured RL environment with the Dog physics and task
observations OrderedDict Contains joint_angles, joint_velocities, torso_pelvis_height, z_projection, torso_com_velocity, inertial_sensors, foot_forces, touch_sensors, actuator_state (plus ball_state and target_position for Fetch)
reward float Product of reward factors (torso height, pelvis height, uprightness, touch, and optionally forward velocity or fetch components)

Usage Examples

Basic Usage

from dm_control import suite

# Load the dog walk task
env = suite.load(domain_name='dog', task_name='walk')

# Step through the environment
time_step = env.reset()
action = env.action_spec().generate_value()
time_step = env.step(action)

# Access observations
joint_angles = time_step.observation['joint_angles']
torso_velocity = time_step.observation['torso_com_velocity']
reward = time_step.reward

Advanced Usage: Fetch Task

from dm_control import suite

# Load the fetch task (includes ball and target)
env = suite.load(domain_name='dog', task_name='fetch')

time_step = env.reset()
# Fetch task includes additional observations
ball_state = time_step.observation['ball_state']
target_position = time_step.observation['target_position']

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment