Implementation:Google deepmind Dm control Suite Humanoid

Metadata	Value
Implementation	Suite Humanoid
Domain	Reinforcement_Learning, Control
Source	Google_deepmind_Dm_control
Last Updated	2026-02-15 04:00 GMT

Overview

Concrete tool for controlling a 3D humanoid to stand, walk, or run provided by the dm_control Control Suite.

Description

The Humanoid domain models a full 3D humanoid body with 21 degrees of freedom and multiple actuated joints. The Physics subclass provides methods for querying the torso uprightness (z-axis projection), head height, center-of-mass position and velocity, torso vertical orientation, joint angles (excluding the 7 root DOFs of the free joint), and end effector (hands and feet) positions in the egocentric torso frame.

Four tasks are registered: three benchmarking tasks (stand, walk, run) and one extra task (run_pure_state). All use the Humanoid task class parameterized by move_speed and pure_state. The stand task sets move_speed=0 and rewards the agent for keeping the head above 1.4 m, maintaining torso uprightness, using small controls, and not moving horizontally. The walk and run tasks set move_speed to 1 and 10 m/s respectively, rewarding horizontal locomotion at or above the target speed. The run_pure_state variant provides raw MuJoCo positions and velocities as observations instead of egocentric features.

Episode initialization randomizes all limited and rotational joints to a collision-free configuration. All tasks use a default time limit of 25 seconds and a control timestep of 0.025 seconds.

Usage

Use this implementation for high-dimensional 3D locomotion benchmarks. Load via suite.load(domain_name='humanoid', task_name='stand') or the walk/run variants.

Code Reference

Source Location

Repository: Google_deepmind_Dm_control
File: dm_control/suite/humanoid.py
Lines: 1-207

Signature

# Task factory functions
def stand(time_limit=25, random=None, environment_kwargs=None)
def walk(time_limit=25, random=None, environment_kwargs=None)
def run(time_limit=25, random=None, environment_kwargs=None)
def run_pure_state(time_limit=25, random=None, environment_kwargs=None)

# Physics subclass
class Physics(mujoco.Physics):
    def torso_upright(self)                # z-axis projection of torso
    def head_height(self)                  # height of the head
    def center_of_mass_position(self)      # CoM position
    def center_of_mass_velocity(self)      # CoM velocity from sensor
    def torso_vertical_orientation(self)   # z-projection of torso orientation
    def joint_angles(self)                 # joint positions sans root DOFs
    def extremities(self)                  # hand/foot positions in torso frame

# Task class
class Humanoid(base.Task):
    def __init__(self, move_speed, pure_state, random=None)
    def initialize_episode(self, physics)
    def get_observation(self, physics)
    def get_reward(self, physics)

Import

from dm_control import suite

env = suite.load(domain_name='humanoid', task_name='walk')

I/O Contract

Inputs

Name	Type	Required	Description
`time_limit`	float	No	Maximum episode duration in seconds (default 25).
`random`	int, numpy.random.RandomState, or None	No	Random seed or RNG instance for reproducibility.
`environment_kwargs`	dict or None	No	Additional keyword arguments forwarded to the `Environment` constructor.

Outputs

Name	Type	Description
environment	`dm_control.rl.control.Environment`	A fully initialised environment conforming to the `dm_env.Environment` interface.

Observations (egocentric mode)

Key	Type	Description
`joint_angles`	numpy array	Joint positions excluding the 7 root free-joint DOFs.
`head_height`	float	Height of the head above the ground.
`extremities`	numpy array (12,)	Left/right hand and foot positions in torso frame.
`torso_vertical`	numpy array (3,)	Z-projection of torso orientation matrix.
`com_velocity`	numpy array (3,)	Center-of-mass velocity from sensor.
`velocity`	numpy array	Joint velocities.

Observations (pure state mode)

Key	Type	Description
`position`	numpy array	Full generalized positions.
`velocity`	numpy array	Full generalized velocities.

Usage Examples

from dm_control import suite

# Load the walk task
env = suite.load(domain_name='humanoid', task_name='walk')

# Run an episode
time_step = env.reset()
while not time_step.last():
    action = env.action_spec().generate_value()
    time_step = env.step(action)

# Load the run task
env_run = suite.load(domain_name='humanoid', task_name='run')

# Load with pure state observations
env_pure = suite.load(domain_name='humanoid', task_name='run_pure_state')

Related Pages

Principle:Google_deepmind_Dm_control_Control_Suite_Environment_Loading

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment