Implementation:Google deepmind Dm control Reference Pose Tracking

From Leeroopedia
Revision as of 12:43, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Google_deepmind_Dm_control_Reference_Pose_Tracking.md)
Knowledge Sources
Domains: Robotics Simulation, Reinforcement Learning, Motion Capture
Last Updated: 2026-02-15 04:00 GMT

Overview

This module defines the core multi-clip motion capture tracking tasks for reinforcement learning, providing the abstract ReferencePosesTask base class and concrete implementations MultiClipMocapTracking and PlaybackTask.

Description

The tracking module implements the motion capture tracking task pipeline used in the CoMic project and related research. The abstract base class ReferencePosesTask extends composer.Task and handles the following: loading reference trajectory data from HDF5 files via HDF5TrajectoryLoader; selecting a clip and start step by weighted probability sampling over the possible start steps; computing the termination error as a weighted combination of mean absolute joint error and body position error; adding reference observations (relative joints, body positions and quaternions in global and local frames, ego-centric body quaternions, root quaternion and position, appendages, and prop positions and quaternions); and delegating reward computation to pluggable reward functions from rewards.py. It also supports props (manipulable objects in the scene), ghost walkers for visualizing reference poses, and custom action specs bounded by actuator control ranges.
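The clip-selection step can be pictured as drawing one (clip, start step) pair from a weighted list. The following is a minimal sketch under stated assumptions, not the module's code: the function name `sample_clip_start`, the `(clip_id, start_step, weight)` tuple layout, and the use of Python's `random` module are all illustrative.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical layout: (clip_id, start_step, weight). Not the library's data structure.
Start = Tuple[str, int, float]


def sample_clip_start(starts: Sequence[Start], rng: random.Random) -> Tuple[str, int]:
    """Draws one (clip_id, start_step) pair with probability proportional to its weight."""
    if not starts:
        raise ValueError("starts must not be empty")
    weights = [weight for _, _, weight in starts]
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    clip_id, start_step, _ = rng.choices(starts, weights=weights, k=1)[0]
    return clip_id, start_step


possible_starts: List[Start] = [
    ("clip_a", 0, 1.0),
    ("clip_a", 10, 1.0),
    ("clip_b", 0, 0.0),  # zero weight: never drawn
]
rng = random.Random(0)
draws = [sample_clip_start(possible_starts, rng) for _ in range(100)]
```

Restricting the list to entries with a start step of 0 would mimic the behaviour described for always_init_at_clip_start, though how the module implements that flag is not shown here.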

MultiClipMocapTracking extends the base class with per-step time tracking, walker feature updates at each step, error computation triggering termination when thresholds are exceeded, and a normalized time-in-clip observable. It also adds velocity, gyro, and joint velocity observables computed at the control timestep.
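The error check can be sketched as a weighted sum of two mean absolute errors compared against a threshold. This is an illustration under assumptions rather than the module's implementation: the function names, the equal 0.5 weights, and the strict `>` comparison are not taken from the source. Only the two error terms, the body error multiplier, and the default threshold of 0.3 come from the description and signature.

```python
from statistics import fmean
from typing import Sequence


def termination_error(
    walker_joints: Sequence[float],
    reference_joints: Sequence[float],
    walker_body_positions: Sequence[Sequence[float]],
    reference_body_positions: Sequence[Sequence[float]],
    body_error_multiplier: float = 1.0,
    joint_weight: float = 0.5,  # assumed weight, not taken from the source
    body_weight: float = 0.5,   # assumed weight, not taken from the source
) -> float:
    """Weighted sum of mean absolute joint error and mean absolute body position error."""
    joint_error = fmean(abs(w - r) for w, r in zip(walker_joints, reference_joints))
    body_error = fmean(
        abs(w - r)
        for walker_xyz, reference_xyz in zip(walker_body_positions, reference_body_positions)
        for w, r in zip(walker_xyz, reference_xyz)
    )
    return joint_weight * joint_error + body_weight * body_error_multiplier * body_error


def should_truncate(error: float, threshold: float = 0.3) -> bool:
    """True when the tracking error exceeds the termination threshold."""
    return error > threshold


error = termination_error(
    walker_joints=[0.0, 0.5],
    reference_joints=[0.1, 0.3],
    walker_body_positions=[[0.0, 0.0, 1.0]],
    reference_body_positions=[[0.0, 0.3, 1.0]],
)
```

Raising body_error_multiplier makes the check more sensitive to body position drift relative to joint angle drift.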

PlaybackTask provides a simple non-learning playback mode that cycles through clips sequentially (rather than randomly sampling), sets the walker pose directly from reference data at each step, and always returns zero reward. It is useful for visualizing motion capture data without training.

The helper function _strip_reference_prefix processes dictionary keys by removing a given prefix (e.g., walker/) from keys and optionally retaining keys with specified prefixes (e.g., prop/).
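The key handling described above might look like the following sketch. The name `strip_prefix`, the `keep_prefixes` parameter name, and the choice to drop keys that match neither rule are assumptions made for illustration; the actual helper may differ.

```python
from typing import Any, Dict, Mapping, Sequence


def strip_prefix(
    dictionary: Mapping[str, Any],
    prefix: str,
    keep_prefixes: Sequence[str] = (),
) -> Dict[str, Any]:
    """Strips `prefix` from matching keys and keeps keys matching `keep_prefixes`.

    Keys that match neither rule are dropped (an assumption of this sketch).
    Values are passed through without copying.
    """
    result: Dict[str, Any] = {}
    for key, value in dictionary.items():
        if key.startswith(prefix):
            result[key[len(prefix):]] = value
        elif any(key.startswith(keep) for keep in keep_prefixes):
            result[key] = value
    return result


features = {
    "walker/joints": [0.1, 0.2],
    "walker/position": [0.0, 0.0, 1.0],
    "prop/position": [1.0, 0.0, 0.5],
    "other/unused": 42,
}
stripped = strip_prefix(features, "walker/", keep_prefixes=("prop/",))
```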

Usage

Use MultiClipMocapTracking to train reinforcement learning agents to imitate human motion capture data. Use PlaybackTask to visualize and inspect reference motion capture clips without training. Both tasks require a walker constructor, arena, path to HDF5 reference data, and a dataset specification (either a ClipCollection or a named dataset string).

Code Reference

Source Location

Signature

DEFAULT_PHYSICS_TIMESTEP = 0.005

class ReferencePosesTask(composer.Task, metaclass=abc.ABCMeta):
    def __init__(
        self,
        walker: Callable[..., 'legacy_base.Walker'],
        arena: composer.Arena,
        ref_path: Text,
        ref_steps: Sequence[int],
        dataset: Union[Text, types.ClipCollection],
        termination_error_threshold: float = 0.3,
        prop_termination_error_threshold: float = 0.1,
        min_steps: int = 10,
        reward_type: Text = 'termination_reward',
        physics_timestep: float = DEFAULT_PHYSICS_TIMESTEP,
        always_init_at_clip_start: bool = False,
        proto_modifier: Optional[Any] = None,
        prop_factory: Optional[Any] = None,
        disable_props: bool = False,
        ghost_offset: Optional[Sequence[Union[int, float]]] = None,
        body_error_multiplier: Union[int, float] = 1.0,
        actuator_force_coeff: float = 0.015,
        enabled_reference_observables: Optional[Sequence[Text]] = None,
    ): ...

    # Key methods
    def initialize_episode_mjcf(self, random_state): ...
    def initialize_episode(self, physics, random_state): ...
    def before_step(self, physics, action, random_state): ...
    def after_step(self, physics, random_state): ...
    def should_terminate_episode(self, physics): ...
    def get_discount(self, physics): ...
    def get_reward(self, physics) -> float: ...
    def action_spec(self, physics): ...
    name: property  # abstract
    root_entity: property

class MultiClipMocapTracking(ReferencePosesTask):
    def __init__(self, walker, arena, ref_path, ref_steps, dataset, ...): ...
    def after_step(self, physics, random_state): ...
    def get_normalized_time_in_clip(self, physics): ...
    name: property  # 'MultiClipMocapTracking'

class PlaybackTask(ReferencePosesTask):
    def __init__(self, walker, arena, ref_path, dataset, ...): ...
    def after_step(self, physics, random_state): ...
    def get_reward(self, physics): ...  # returns 0.0
    name: property  # 'PlaybackTask'

Import

from dm_control.locomotion.tasks.reference_pose import tracking

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| walker | Callable[..., Walker] | Yes | Constructor for the walker to be used in the task |
| arena | composer.Arena | Yes | The arena providing the ground and scene structure |
| ref_path | str | Yes | Path to the HDF5 dataset containing reference poses |
| ref_steps | Sequence[int] | Yes | Indices of future reference steps to include in observations (e.g., (1, 2, 3)) |
| dataset | str or ClipCollection | Yes | A ClipCollection instance or a named dataset key |
| termination_error_threshold | float | No | Error threshold for episode termination (default: 0.3) |
| prop_termination_error_threshold | float | No | Error threshold for prop position termination (default: 0.1) |
| min_steps | int | No | Minimum number of steps per episode (default: 10) |
| reward_type | str | No | Reward function name from rewards.py (default: 'termination_reward') |
| physics_timestep | float | No | Physics simulation timestep (default: 0.005) |
| always_init_at_clip_start | bool | No | Only start episodes at clip beginning (default: False) |
| ghost_offset | Sequence or None | No | Position offset for ghost walker visualization |
| body_error_multiplier | float | No | Multiplier for body position error in termination (default: 1.0) |
| actuator_force_coeff | float | No | Coefficient for actuator force reward channel (default: 0.015) |
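To make the ref_steps input concrete, the sketch below treats each entry as an offset added to the current step when indexing a reference trajectory. The function name, the list-of-frames layout, and the bounds check are assumptions for illustration; the module's own indexing and end-of-clip handling are not reproduced here.

```python
from typing import List, Sequence


def future_reference_frames(
    trajectory: Sequence[Sequence[float]],
    time_step: int,
    ref_steps: Sequence[int],
) -> List[Sequence[float]]:
    """Returns the reference frames at time_step + offset for each offset in ref_steps."""
    indices = [time_step + offset for offset in ref_steps]
    if indices and (min(indices) < 0 or max(indices) >= len(trajectory)):
        raise IndexError("look-ahead index falls outside the reference trajectory")
    return [trajectory[i] for i in indices]


# Toy trajectory: one joint angle per frame.
trajectory = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.5]]
frames = future_reference_frames(trajectory, time_step=2, ref_steps=(1, 2, 3))
```

With ref_steps=(1, 2, 3) the agent would see the next three reference frames at every step, so a clip must leave room for the largest offset before it ends.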

Outputs

| Name | Type | Description |
|---|---|---|
| reward | float | Scalar reward computed by the pluggable reward function |
| discount | float | 0.0 if truncated, 1.0 otherwise |
| should_terminate | bool | True if error exceeds threshold or end of mocap reached |
| reference observations | dict[str, np.ndarray] | Dictionary of reference observations (relative joints, body positions, quaternions, etc.) |
| last_reward_channels | OrderedDict or None | Per-channel reward breakdown |
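The relationship between the discount and should_terminate outputs can be summarised in a few lines. This sketch assumes that "truncated" means the tracking error exceeded its threshold; the `EpisodeStatus` class and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class EpisodeStatus:
    """Hypothetical container for the two end-of-episode conditions."""
    error_exceeded: bool  # tracking error went above the threshold
    end_of_clip: bool     # the reference clip has no more frames


def should_terminate(status: EpisodeStatus) -> bool:
    """The episode ends on either condition."""
    return status.error_exceeded or status.end_of_clip


def discount(status: EpisodeStatus) -> float:
    """0.0 when the episode was cut short by tracking error, 1.0 otherwise."""
    return 0.0 if status.error_exceeded else 1.0


failed = EpisodeStatus(error_exceeded=True, end_of_clip=False)
finished = EpisodeStatus(error_exceeded=False, end_of_clip=True)
running = EpisodeStatus(error_exceeded=False, end_of_clip=False)
```

Under this reading, reaching the end of a clip ends the episode with a discount of 1.0, so a learner can still bootstrap from the final state, while a tracking failure ends it with a discount of 0.0.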

Usage Examples

Basic Usage

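import numpy as np

from dm_control import composer
from dm_control.locomotion.arenas import floors
from dm_control.locomotion.tasks.reference_pose import cmu_subsets
from dm_control.locomotion.tasks.reference_pose import tracking
from dm_control.locomotion.walkers import cmu_humanoid

# Walker constructor: pass the class itself, not an instance.
# The walker and arena chosen here are illustrative; substitute your own.
my_walker_constructor = cmu_humanoid.CMUHumanoidPositionControlledV2020

# Create a multi-clip tracking task
task = tracking.MultiClipMocapTracking(
    walker=my_walker_constructor,
    arena=floors.Floor(),
    ref_path='/path/to/cmu_mocap.hdf5',  # placeholder path
    ref_steps=(1, 2, 3, 4, 5),
    dataset=cmu_subsets.LOCOMOTION_SMALL,
    termination_error_threshold=0.3,
    reward_type='termination_reward',
    physics_timestep=0.005,
)

# Create a Composer environment
env = composer.Environment(task)
action_spec = env.action_spec()


def my_policy(observation):
    # Placeholder policy: uniform random actions within the action bounds.
    del observation  # unused by the random policy
    return np.random.uniform(
        action_spec.minimum, action_spec.maximum, size=action_spec.shape)


# Run an episode
timestep = env.reset()
while not timestep.last():
    action = my_policy(timestep.observation)
    timestep = env.step(action)

# For visualization/debugging, use PlaybackTask.
# A fresh arena is created because each task attaches its own walker to it.
playback = tracking.PlaybackTask(
    walker=my_walker_constructor,
    arena=floors.Floor(),
    ref_path='/path/to/cmu_mocap.hdf5',  # placeholder path
    dataset=cmu_subsets.WALK_TINY,
)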
