Implementation:Google deepmind Dm control Suite Humanoid CMU

From Leeroopedia
Metadata | Value
Implementation | Suite Humanoid CMU
Domain | Reinforcement_Learning, Control
Source | Google_deepmind_Dm_control
Last Updated | 2026-02-15 04:00 GMT

Overview

Concrete tool, provided by the dm_control Control Suite, for controlling a high-dimensional CMU humanoid model to stand, walk, or run.

Description

The Humanoid CMU domain uses a more detailed humanoid model based on the Carnegie Mellon University motion capture skeleton. Compared to the standard Humanoid domain, this model has many more degrees of freedom and uses the thorax as the reference body instead of the torso. The Physics subclass provides methods for reading thorax uprightness (y-axis to world z-axis projection), head height, center-of-mass position and velocity, torso vertical orientation (from the thorax frame), joint angles (excluding the 7 root free-joint DOFs), and extremity positions (using left/right hand/foot naming with short prefixes 'l'/'r').
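The "thorax frame" used for extremity positions can be illustrated with a small standalone sketch. It assumes the body orientation is a 3x3 rotation matrix whose columns are the body axes written in world coordinates, so a world-frame offset is re-expressed as R^T (p - o). The function name and the numeric values are hypothetical and chosen only for illustration; this is not the library's code.

```python
import math


def to_body_frame(point_world, origin_world, rotation):
    """Express a world-frame point in a body-attached frame.

    `rotation` is a 3x3 matrix (list of rows) whose columns are the body
    axes in world coordinates (an assumption of this sketch), so the
    result is R^T (p - o).
    """
    offset = [p - o for p, o in zip(point_world, origin_world)]
    # Component j is the dot product of the offset with body axis j.
    return [sum(offset[i] * rotation[i][j] for i in range(3)) for j in range(3)]


# Hypothetical pose: a thorax yawed 90 degrees about the world z-axis.
c, s = math.cos(math.pi / 2), math.sin(math.pi / 2)
thorax_rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
thorax_position = [1.0, 2.0, 1.2]
hand_position = [1.0, 2.5, 1.0]  # 0.5 m along world y, 0.2 m below the thorax

hand_in_thorax = to_body_frame(hand_position, thorax_position, thorax_rotation)
```

After the yaw, the thorax x-axis points along world y, so the hand sits roughly 0.5 m along the thorax x-axis and 0.2 m below it. Repeating this for both hands and both feet gives four 3-vectors, which matches the 12 values of the extremities observation.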

Three tasks are registered (all non-benchmarking): stand, walk, and run. All use the HumanoidCMU task class parameterized by move_speed. The stand task sets move_speed=0 and rewards standing upright with the head above 1.4 m while penalizing horizontal movement and large controls. The walk and run tasks set move_speed to 1 and 10 m/s respectively, rewarding horizontal locomotion at or above the target speed.
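To make "at or above the target speed" concrete, here is a simplified sketch of a speed term. It assumes a linear ramp from 0 at rest to 1 at the target speed, saturating above it. This page does not give the exact shaping used by dm_control, and the function names below are hypothetical.

```python
import math


def horizontal_speed(com_velocity):
    """Speed in the world x-y plane; the vertical component is ignored."""
    return math.hypot(com_velocity[0], com_velocity[1])


def move_reward(speed, move_speed):
    """Assumed linear ramp: 0 at rest, 1 at or above `move_speed`."""
    if move_speed <= 0:
        raise ValueError("move_speed must be positive for walk/run tasks")
    return min(max(speed / move_speed, 0.0), 1.0)


WALK_SPEED = 1.0   # m/s, from the task description
RUN_SPEED = 10.0   # m/s, from the task description

half_walk = move_reward(0.5, WALK_SPEED)   # halfway to the walk target
fast_run = move_reward(12.0, RUN_SPEED)    # above target, saturates at 1
```

Under this assumption, exceeding the target speed earns no extra reward, so the same function covers both walk and run with only move_speed changed.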

Episode initialization randomizes all limited and rotational joints to a collision-free configuration. All tasks use a default time limit of 20 seconds and a control timestep of 0.02 seconds. The reward structure mirrors the standard Humanoid domain but references thorax instead of torso for uprightness calculations.
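The default time limit and control timestep together fix the episode length in agent steps. The arithmetic below uses only the two values stated above; the variable names are illustrative.

```python
TIME_LIMIT = 20.0        # seconds, default episode duration
CONTROL_TIMESTEP = 0.02  # seconds between agent actions

# Number of agent actions in a full-length episode.
steps_per_episode = round(TIME_LIMIT / CONTROL_TIMESTEP)

# Rate at which the agent is asked for actions.
control_rate_hz = round(1.0 / CONTROL_TIMESTEP)
```

With the defaults this works out to 1000 control steps per episode at 50 Hz; shortening time_limit scales the step count proportionally.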

Usage

Use this implementation for high-dimensional locomotion tasks with the CMU humanoid skeleton. Load via suite.load(domain_name='humanoid_CMU', task_name='stand') or the walk/run variants.

Code Reference

Source Location

Signature

# Task factory functions
def stand(time_limit=20, random=None, environment_kwargs=None)
def walk(time_limit=20, random=None, environment_kwargs=None)
def run(time_limit=20, random=None, environment_kwargs=None)

# Physics subclass
class Physics(mujoco.Physics):
    def thorax_upright(self)               # y-axis of thorax to world z
    def head_height(self)                  # height of head
    def center_of_mass_position(self)      # CoM position from thorax subtree
    def center_of_mass_velocity(self)      # CoM velocity from sensor
    def torso_vertical_orientation(self)   # z-projection of thorax orientation
    def joint_angles(self)                 # joint positions sans root DOFs
    def extremities(self)                  # hand/foot positions in thorax frame

# Task class
class HumanoidCMU(base.Task):
    def __init__(self, move_speed, random=None)
    def initialize_episode(self, physics)
    def get_observation(self, physics)
    def get_reward(self, physics)

Import

from dm_control import suite

env = suite.load(domain_name='humanoid_CMU', task_name='stand')

I/O Contract

Inputs

Name | Type | Required | Description
time_limit | float | No | Maximum episode duration in seconds (default 20).
random | int, numpy.random.RandomState, or None | No | Random seed or RNG instance for reproducibility.
environment_kwargs | dict or None | No | Additional keyword arguments forwarded to the Environment constructor.
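When loading through suite.load rather than calling a factory function directly, these inputs are typically supplied through the task_kwargs and environment_kwargs arguments. The helper below is a hypothetical wrapper, shown as a sketch: it imports dm_control lazily so that defining it does not require MuJoCo to be installed, and the flat_observation option is used only as an example of an Environment constructor argument.

```python
def load_stand_env(seed=0, time_limit=5.0):
    """Load a seeded humanoid_CMU stand environment with a shorter episode.

    Hypothetical helper for illustration; requires dm_control at call time.
    """
    from dm_control import suite  # lazy import: only needed when called

    return suite.load(
        domain_name='humanoid_CMU',
        task_name='stand',
        # Forwarded to the stand() factory function.
        task_kwargs={'time_limit': time_limit, 'random': seed},
        # Forwarded to the Environment constructor.
        environment_kwargs={'flat_observation': True},
    )
```

Passing an integer seed keeps the randomized initial pose reproducible across runs, which is useful when comparing agents.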

Outputs

Name | Type | Description
environment | dm_control.rl.control.Environment | A fully initialised environment conforming to the dm_env.Environment interface.

Observations

Key | Type | Description
joint_angles | numpy array | Joint positions excluding the 7 root free-joint DOFs.
head_height | float | Height of the head above the ground.
extremities | numpy array (12,) | Left/right hand and foot positions in thorax frame.
torso_vertical | numpy array (3,) | Z-projection of thorax orientation matrix.
com_velocity | numpy array (3,) | Center-of-mass velocity from sensor.
velocity | numpy array | Generalized velocities (qvel), including the root free-joint velocities.
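Many agents expect a single feature vector rather than a dictionary of observations. A minimal sketch of flattening such a dictionary follows; the function name and the sample values are hypothetical, and plain Python lists stand in for the numpy arrays the environment would return.

```python
def flatten_observation(observation):
    """Concatenate scalar and 1-D entries into one flat list of floats.

    Entries are visited in the dictionary's insertion order.
    """
    flat = []
    for value in observation.values():
        if isinstance(value, (int, float)):
            flat.append(float(value))              # scalar entry, e.g. head_height
        else:
            flat.extend(float(x) for x in value)   # 1-D entry, e.g. com_velocity
    return flat


# Hypothetical subset of an observation, with made-up values.
sample_observation = {
    'head_height': 1.5,
    'torso_vertical': [0.0, 0.0, 1.0],
    'com_velocity': [0.1, 0.0, 0.0],
}

features = flatten_observation(sample_observation)
```

The order of the flattened features follows the key order, so the same key order must be used at training and evaluation time.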

Usage Examples

from dm_control import suite

# Load the CMU humanoid stand task
env = suite.load(domain_name='humanoid_CMU', task_name='stand')

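# Run one episode with a fixed placeholder action (not a learned policy)
action_spec = env.action_spec()
time_step = env.reset()
while not time_step.last():
    action = action_spec.generate_value()  # any value that conforms to the spec
    time_step = env.step(action)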

# Load the walk and run tasks
env_walk = suite.load(domain_name='humanoid_CMU', task_name='walk')
env_run = suite.load(domain_name='humanoid_CMU', task_name='run')
