Implementation:Google deepmind Dm control Reference Pose Tracking

Knowledge Sources	Google_deepmind_Dm_control
Domains	Robotics Simulation, Reinforcement Learning, Motion Capture
Last Updated	2026-02-15 04:00 GMT

Overview

This module defines the core multi-clip motion capture tracking tasks for reinforcement learning, providing the abstract ReferencePosesTask base class and concrete implementations MultiClipMocapTracking and PlaybackTask.

Description

The tracking module implements the complete motion capture tracking task pipeline used in the CoMic project and related research. The abstract base class ReferencePosesTask extends composer.Task and handles: loading reference trajectory data from HDF5 files via HDF5TrajectoryLoader; managing clip selection with weighted probability sampling across start steps; computing termination errors as a weighted combination of mean absolute joint error and body position error; adding extensive reference observations (relative joints, body positions/quaternions in global and local frames, ego-centric body quaternions, root quaternion/position, appendages, and prop positions/quaternions); and delegating reward computation to pluggable reward functions from rewards.py. It also supports props (manipulable objects in the scene), ghost walkers for visualization of reference poses, and custom action specs bounded by actuator control ranges.
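The weighted clip selection described above can be illustrated with a minimal NumPy sketch. It is not the module's code: the function name `build_start_distribution`, the rule that each clip's weight is spread evenly over its admissible start steps, and the toy clip boundaries are assumptions made for illustration.

```python
import numpy as np


def build_start_distribution(start_steps, end_steps, weights,
                             max_ref_step, min_steps):
    """Enumerate (clip, step) start points and a probability for each one."""
    possible_starts = []
    probabilities = []
    for clip, (start, end, weight) in enumerate(
            zip(start_steps, end_steps, weights)):
        # Leave room for the reference lookahead and a minimum episode length.
        last_start = end - max_ref_step - min_steps
        if last_start < start:
            continue  # Clip too short to host an episode; skip it.
        steps = range(start, last_start + 1)
        possible_starts.extend((clip, step) for step in steps)
        # Assumption: a clip's weight is shared evenly by its start steps.
        probabilities.extend([weight / len(steps)] * len(steps))
    probabilities = np.asarray(probabilities, dtype=float)
    return possible_starts, probabilities / probabilities.sum()


# Two hypothetical clips; the second is three times as likely to be chosen.
starts, probs = build_start_distribution(
    start_steps=[0, 0], end_steps=[30, 50], weights=[1.0, 3.0],
    max_ref_step=5, min_steps=10)
rng = np.random.RandomState(0)
clip_index, start_step = starts[rng.choice(len(starts), p=probs)]
```

Under these assumptions, a long clip does not dominate merely because it offers more start steps; only its weight determines how often it is picked.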

MultiClipMocapTracking extends the base class with per-step time tracking, walker feature updates at each step, error computation triggering termination when thresholds are exceeded, and a normalized time-in-clip observable. It also adds velocity, gyro, and joint velocity observables computed at the control timestep.
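The termination check can be sketched as follows. This is an illustrative reconstruction, not the module's code: the function name `termination_error`, the equal 0.5 weights on the two error terms, and the toy arrays are assumptions. Only the use of mean absolute errors, the body error multiplier, and the 0.3 default threshold come from the description and signature on this page.

```python
import numpy as np


def termination_error(walker_joints, ref_joints, walker_body_pos, ref_body_pos,
                      body_error_multiplier=1.0):
    """Combine mean absolute joint and body-position errors into one scalar."""
    joint_error = np.mean(np.abs(ref_joints - walker_joints))
    body_error = np.mean(np.abs(ref_body_pos - walker_body_pos))
    # Assumed equal 0.5 weights; the multiplier scales only the body term.
    return 0.5 * body_error_multiplier * body_error + 0.5 * joint_error


ref_joints = np.zeros(4)
walker_joints = np.full(4, 0.2)          # every joint off by 0.2
ref_body_pos = np.zeros((3, 3))
walker_body_pos = np.full((3, 3), 0.6)   # every body coordinate off by 0.6

error = termination_error(
    walker_joints, ref_joints, walker_body_pos, ref_body_pos)
# 0.3 is the default termination_error_threshold.
should_truncate = error > 0.3
```

With these toy values the combined error is 0.5 * 0.6 + 0.5 * 0.2 = 0.4, which exceeds the default threshold, so the episode would be cut short.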

PlaybackTask provides a simple non-learning playback mode that cycles through clips sequentially (rather than randomly sampling), sets the walker pose directly from reference data at each step, and always returns zero reward. It is useful for visualizing motion capture data without training.
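The sequential cycling can be pictured with a small sketch. The class `SequentialClipSelector` is hypothetical and exists only to contrast in-order selection with the random sampling used for training; it is not part of dm_control.

```python
class SequentialClipSelector:
    """Hypothetical helper that walks through clips in order, wrapping around."""

    def __init__(self, num_clips):
        self._num_clips = num_clips
        self._current = -1  # So that the first call returns clip 0.

    def next_clip(self):
        # Advance by one and wrap back to the first clip after the last.
        self._current = (self._current + 1) % self._num_clips
        return self._current


selector = SequentialClipSelector(num_clips=3)
order = [selector.next_clip() for _ in range(5)]  # [0, 1, 2, 0, 1]
```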

The helper function _strip_reference_prefix processes dictionary keys by removing a given prefix (e.g., walker/) from keys and optionally retaining keys with specified prefixes (e.g., prop/).
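A minimal sketch of this behavior, under stated assumptions, is shown below. The function is a stand-in written for this page, not the module's private helper: the parameter name `keep_prefixes`, the choice to drop keys that match neither the stripped prefix nor a kept prefix, and the example keys are assumptions.

```python
def strip_reference_prefix(dictionary, prefix, keep_prefixes=None):
    """Strip `prefix` from matching keys; keep keys with a retained prefix."""
    keep_prefixes = keep_prefixes or []
    stripped = {}
    for key, value in dictionary.items():
        if key.startswith(prefix):
            # 'walker/joints' becomes 'joints'; the value is not copied.
            stripped[key[len(prefix):]] = value
        elif any(key.startswith(kept) for kept in keep_prefixes):
            stripped[key] = value
        # Assumption: keys matching neither rule are dropped.
    return stripped


features = {
    'walker/joints': [0.1, 0.2],
    'prop/position': [1.0, 0.0, 0.0],
    'other/x': 3,
}
result = strip_reference_prefix(features, 'walker/', keep_prefixes=['prop/'])
# result == {'joints': [0.1, 0.2], 'prop/position': [1.0, 0.0, 0.0]}
```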

Usage

Use MultiClipMocapTracking to train reinforcement learning agents to imitate human motion capture data. Use PlaybackTask to visualize and inspect reference motion capture clips without training. Both tasks require a walker constructor, an arena, a path to the HDF5 reference data, and a dataset specification (either a ClipCollection or a named dataset string). MultiClipMocapTracking additionally requires ref_steps, which PlaybackTask does not take.

Code Reference

Source Location

Repository: Google_deepmind_Dm_control
File: dm_control/locomotion/tasks/reference_pose/tracking.py
Lines: 1-1007

Signature

DEFAULT_PHYSICS_TIMESTEP = 0.005

class ReferencePosesTask(composer.Task, metaclass=abc.ABCMeta):
    def __init__(
        self,
        walker: Callable[..., 'legacy_base.Walker'],
        arena: composer.Arena,
        ref_path: Text,
        ref_steps: Sequence[int],
        dataset: Union[Text, types.ClipCollection],
        termination_error_threshold: float = 0.3,
        prop_termination_error_threshold: float = 0.1,
        min_steps: int = 10,
        reward_type: Text = 'termination_reward',
        physics_timestep: float = DEFAULT_PHYSICS_TIMESTEP,
        always_init_at_clip_start: bool = False,
        proto_modifier: Optional[Any] = None,
        prop_factory: Optional[Any] = None,
        disable_props: bool = False,
        ghost_offset: Optional[Sequence[Union[int, float]]] = None,
        body_error_multiplier: Union[int, float] = 1.0,
        actuator_force_coeff: float = 0.015,
        enabled_reference_observables: Optional[Sequence[Text]] = None,
    ): ...

    # Key methods
    def initialize_episode_mjcf(self, random_state): ...
    def initialize_episode(self, physics, random_state): ...
    def before_step(self, physics, action, random_state): ...
    def after_step(self, physics, random_state): ...
    def should_terminate_episode(self, physics): ...
    def get_discount(self, physics): ...
    def get_reward(self, physics) -> float: ...
    def action_spec(self, physics): ...
    name: property  # abstract
    root_entity: property

class MultiClipMocapTracking(ReferencePosesTask):
    def __init__(self, walker, arena, ref_path, ref_steps, dataset, ...): ...
    def after_step(self, physics, random_state): ...
    def get_normalized_time_in_clip(self, physics): ...
    name: property  # 'MultiClipMocapTracking'

class PlaybackTask(ReferencePosesTask):
    def __init__(self, walker, arena, ref_path, dataset, ...): ...
    def after_step(self, physics, random_state): ...
    def get_reward(self, physics): ...  # returns 0.0
    name: property  # 'PlaybackTask'

Import

from dm_control.locomotion.tasks.reference_pose import tracking

I/O Contract

Inputs

Name	Type	Required	Description
walker	Callable[..., Walker]	Yes	Constructor for the walker to be used in the task
arena	composer.Arena	Yes	The arena providing the ground and scene structure
ref_path	str	Yes	Path to the HDF5 dataset containing reference poses
ref_steps	Sequence[int]	Yes	Offsets, in steps ahead of the current time, of the future reference poses to include in observations (e.g., (1, 2, 3) exposes the reference at t+1, t+2, and t+3)
dataset	str or ClipCollection	Yes	A ClipCollection instance or a named dataset key
termination_error_threshold	float	No	Error threshold for episode termination (default: 0.3)
prop_termination_error_threshold	float	No	Error threshold for prop position termination (default: 0.1)
min_steps	int	No	Minimum number of steps per episode, which limits how late in a clip an episode may start (default: 10)
reward_type	str	No	Reward function name from rewards.py (default: 'termination_reward')
physics_timestep	float	No	Physics simulation timestep (default: 0.005)
always_init_at_clip_start	bool	No	Only start episodes at clip beginning (default: False)
ghost_offset	Sequence or None	No	Position offset for ghost walker visualization
body_error_multiplier	float	No	Multiplier for body position error in termination (default: 1.0)
actuator_force_coeff	float	No	Coefficient for actuator force reward channel (default: 0.015)
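The role of ref_steps can be illustrated with a short NumPy sketch. The function `future_reference` and the toy trajectory are hypothetical; the sketch only shows how a tuple of step offsets selects future rows of a reference array, not how the module assembles its observables.

```python
import numpy as np


def future_reference(reference, time_step, ref_steps):
    """Stack the reference rows at time_step + k for each k in ref_steps."""
    indices = time_step + np.asarray(ref_steps)
    return reference[indices]


# Toy reference trajectory: 10 steps of a 2-joint pose.
reference_joints = np.arange(20, dtype=float).reshape(10, 2)
window = future_reference(reference_joints, time_step=4, ref_steps=(1, 2, 3))
# window has shape (3, 2) and holds rows 5, 6 and 7 of reference_joints.
```

Because the largest offset must still fall inside the clip, a longer lookahead leaves fewer valid start steps near the end of a clip.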

Outputs

Name	Type	Description
reward	float	Scalar reward computed by the pluggable reward function
discount	float	0.0 if the episode was truncated because the tracking error exceeded the threshold, 1.0 otherwise
should_terminate	bool	True if error exceeds threshold or end of mocap reached
reference observations	dict[str, np.ndarray]	Dictionary of reference observations (relative joints, body positions, quaternions, etc.)
last_reward_channels	OrderedDict or None	Per-channel reward breakdown
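The relationship between the termination and discount outputs can be summarized in a small sketch. The function `episode_signals`, its argument names, and the exact comparison operators are assumptions for illustration; only the two termination causes and the discount values come from the table above.

```python
def episode_signals(termination_error, threshold, time_step, last_step):
    """Return (should_terminate, discount) for one step of a tracking episode."""
    truncated = termination_error > threshold   # Tracking error too large.
    end_of_mocap = time_step >= last_step       # Reference clip exhausted.
    should_terminate = truncated or end_of_mocap
    # Only truncation zeroes the discount; reaching the clip end does not.
    discount = 0.0 if truncated else 1.0
    return should_terminate, discount


failed = episode_signals(0.45, 0.3, time_step=12, last_step=100)
finished = episode_signals(0.10, 0.3, time_step=100, last_step=100)
running = episode_signals(0.10, 0.3, time_step=12, last_step=100)
# failed == (True, 0.0), finished == (True, 1.0), running == (False, 1.0)
```

Keeping the discount at 1.0 when a clip simply ends lets a learner treat that boundary differently from a tracking failure.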

Usage Examples

Basic Usage

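import numpy as np

from dm_control import composer
from dm_control.locomotion import arenas
from dm_control.locomotion.tasks.reference_pose import cmu_subsets
from dm_control.locomotion.tasks.reference_pose import tracking
from dm_control.locomotion.walkers import cmu_humanoid

# Placeholder walker constructor and arena; substitute your own as needed.
my_walker_constructor = cmu_humanoid.CMUHumanoidPositionControlledV2020
my_arena = arenas.Floor()

# Create a multi-clip tracking task
task = tracking.MultiClipMocapTracking(
    walker=my_walker_constructor,
    arena=my_arena,
    ref_path='/path/to/cmu_mocap.hdf5',
    ref_steps=(1, 2, 3, 4, 5),
    dataset=cmu_subsets.LOCOMOTION_SMALL,
    termination_error_threshold=0.3,
    reward_type='termination_reward',
    physics_timestep=0.005,
)

# Create a Composer environment
env = composer.Environment(task)
timestep = env.reset()

# Placeholder policy: sample uniformly within the action bounds.
action_spec = env.action_spec()

def my_policy(observation):
    del observation  # A real policy would condition on the observation.
    return np.random.uniform(
        action_spec.minimum, action_spec.maximum, size=action_spec.shape)

# Run an episode
while not timestep.last():
    action = my_policy(timestep.observation)
    timestep = env.step(action)

# For visualization/debugging, use PlaybackTask (with its own arena instance
# rather than the one already used by the tracking task above)
playback = tracking.PlaybackTask(
    walker=my_walker_constructor,
    arena=arenas.Floor(),
    ref_path='/path/to/cmu_mocap.hdf5',
    dataset=cmu_subsets.WALK_TINY,
)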
