Implementation:Google deepmind Dm control Reference Pose Tracking
| Knowledge Sources | |
|---|---|
| Domains | Robotics Simulation, Reinforcement Learning, Motion Capture |
| Last Updated | 2026-02-15 04:00 GMT |
Overview
This module defines the core multi-clip motion capture tracking tasks for reinforcement learning, providing the abstract ReferencePosesTask base class and concrete implementations MultiClipMocapTracking and PlaybackTask.
Description
The tracking module implements the complete motion capture tracking task pipeline used in the CoMic project and related research. The abstract base class ReferencePosesTask extends composer.Task and handles:
- loading reference trajectory data from HDF5 files via HDF5TrajectoryLoader;
- managing clip selection with weighted probability sampling across start steps;
- computing termination errors as a weighted combination of mean absolute joint error and body position error;
- adding extensive reference observations (relative joints, body positions/quaternions in global and local frames, ego-centric body quaternions, root quaternion/position, appendages, and prop positions/quaternions);
- delegating reward computation to pluggable reward functions from rewards.py.
It also supports props (manipulable objects in the scene), ghost walkers for visualization of reference poses, and custom action specs bounded by actuator control ranges.
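The weighted sampling over clips and start steps can be illustrated with a small standalone sketch. It is not the module's code: the function names, the dictionary layout of a clip, and the choice to split each clip's weight evenly across its start steps are assumptions made for illustration.

```python
import random


def build_start_table(clips, min_steps=10, always_init_at_clip_start=False):
    """Return (starts, probabilities) over (clip_index, start_step) pairs."""
    starts, weights = [], []
    for clip_index, clip in enumerate(clips):
        if always_init_at_clip_start:
            steps = [0]
        else:
            # Leave at least `min_steps` of reference data after the start.
            last_start = max(clip["num_steps"] - min_steps, 1)
            steps = list(range(last_start))
        for step in steps:
            starts.append((clip_index, step))
            # Assumption: a clip's weight is shared evenly by its start steps.
            weights.append(clip["weight"] / len(steps))
    total = sum(weights)
    return starts, [w / total for w in weights]


def sample_start(starts, probabilities, rng):
    """Draw one (clip_index, start_step) pair according to its probability."""
    return rng.choices(starts, weights=probabilities, k=1)[0]


# Hypothetical clips: the second is shorter but three times more likely.
clips = [{"num_steps": 30, "weight": 1.0}, {"num_steps": 15, "weight": 3.0}]
starts, probs = build_start_table(clips, min_steps=10)
clip_index, start_step = sample_start(starts, probs, random.Random(0))
```

With always_init_at_clip_start=True the table collapses to one entry per clip, so only the clip weights influence the draw.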
MultiClipMocapTracking extends the base class with per-step time tracking, walker feature updates at each step, error computation triggering termination when thresholds are exceeded, and a normalized time-in-clip observable. It also adds velocity, gyro, and joint velocity observables computed at the control timestep.
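As a rough sketch of the error check and the time observable described above, the following pure-Python functions combine a mean absolute joint error with a mean absolute body position error and normalize the elapsed time by the clip duration. The equal 0.5 weights, the function names, and the flat-list inputs are assumptions for illustration, not a statement of the module's exact formula.

```python
def mean_abs_error(reference, actual):
    """Mean absolute difference between two equally sized sequences."""
    return sum(abs(r - a) for r, a in zip(reference, actual)) / len(reference)


def termination_error(ref_joints, joints, ref_body_pos, body_pos,
                      body_error_multiplier=1.0):
    """Weighted combination of joint error and body position error."""
    error_joints = mean_abs_error(ref_joints, joints)
    error_bodies = mean_abs_error(ref_body_pos, body_pos)
    # Assumption: both terms get weight 0.5; the body term is additionally
    # scaled by `body_error_multiplier`.
    return 0.5 * body_error_multiplier * error_bodies + 0.5 * error_joints


def normalized_time_in_clip(clip_start_time, elapsed_time, clip_duration):
    """Fraction of the clip reached so far (hypothetical parameter names)."""
    return (clip_start_time + elapsed_time) / clip_duration


error = termination_error(
    ref_joints=[0.0, 1.0], joints=[0.2, 0.6],
    ref_body_pos=[0.0, 0.0, 0.0], body_pos=[0.3, 0.0, 0.0])
should_stop = error > 0.3  # compare against termination_error_threshold
```

A larger body_error_multiplier makes the task less tolerant of body position drift relative to joint angle drift.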
PlaybackTask provides a simple non-learning playback mode that cycles through clips sequentially (rather than randomly sampling), sets the walker pose directly from reference data at each step, and always returns zero reward. It is useful for visualizing motion capture data without training.
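The difference between sequential cycling and random sampling can be shown with a minimal sketch. The class below is hypothetical and only mirrors the idea of advancing a clip index and wrapping around; the clip identifiers are placeholders.

```python
class SequentialClipSelector:
    """Returns clips in order and wraps around after the last one."""

    def __init__(self, clip_ids):
        self._clip_ids = list(clip_ids)
        self._index = -1  # so the first call returns the first clip

    def next_clip(self):
        # Advance by one and wrap with modulo instead of sampling randomly.
        self._index = (self._index + 1) % len(self._clip_ids)
        return self._clip_ids[self._index]


selector = SequentialClipSelector(["clip_a", "clip_b", "clip_c"])
order = [selector.next_clip() for _ in range(4)]
```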
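The helper function _strip_reference_prefix builds a new dictionary from an existing one: keys that start with a given prefix (e.g., walker/) are kept with the prefix removed, keys that start with one of the optional keep prefixes (e.g., prop/) are kept unchanged, and all other keys are dropped.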
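A minimal sketch of this kind of key filtering is shown below. It is an illustrative reimplementation under a different name, assuming that keys matching neither the prefix nor a keep prefix are dropped; the feature keys in the example are made up.

```python
def strip_prefix(dictionary, prefix, keep_prefixes=None):
    """Strip `prefix` from matching keys; keep `keep_prefixes` keys as-is."""
    keep_prefixes = tuple(keep_prefixes or ())
    stripped = {}
    for key, value in dictionary.items():
        if key.startswith(prefix):
            # Values are reused, not copied.
            stripped[key[len(prefix):]] = value
        elif key.startswith(keep_prefixes):
            stripped[key] = value
        # Assumption: any other key is left out of the result.
    return stripped


features = {
    "walker/joints": [0.1, 0.2],
    "walker/position": [0.0, 0.0, 1.0],
    "prop/position": [1.0, 0.0, 0.0],
    "other/key": 1,
}
result = strip_prefix(features, "walker/", keep_prefixes=["prop/"])
```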
Usage
Use MultiClipMocapTracking to train reinforcement learning agents to imitate human motion capture data. Use PlaybackTask to visualize and inspect reference motion capture clips without training. Both tasks require a walker constructor, arena, path to HDF5 reference data, and a dataset specification (either a ClipCollection or a named dataset string).
Code Reference
Source Location
- Repository: Google_deepmind_Dm_control
- File: dm_control/locomotion/tasks/reference_pose/tracking.py
- Lines: 1-1007
Signature
DEFAULT_PHYSICS_TIMESTEP = 0.005

class ReferencePosesTask(composer.Task, metaclass=abc.ABCMeta):
    def __init__(
        self,
        walker: Callable[..., 'legacy_base.Walker'],
        arena: composer.Arena,
        ref_path: Text,
        ref_steps: Sequence[int],
        dataset: Union[Text, types.ClipCollection],
        termination_error_threshold: float = 0.3,
        prop_termination_error_threshold: float = 0.1,
        min_steps: int = 10,
        reward_type: Text = 'termination_reward',
        physics_timestep: float = DEFAULT_PHYSICS_TIMESTEP,
        always_init_at_clip_start: bool = False,
        proto_modifier: Optional[Any] = None,
        prop_factory: Optional[Any] = None,
        disable_props: bool = False,
        ghost_offset: Optional[Sequence[Union[int, float]]] = None,
        body_error_multiplier: Union[int, float] = 1.0,
        actuator_force_coeff: float = 0.015,
        enabled_reference_observables: Optional[Sequence[Text]] = None,
    ): ...

    # Key methods
    def initialize_episode_mjcf(self, random_state): ...
    def initialize_episode(self, physics, random_state): ...
    def before_step(self, physics, action, random_state): ...
    def after_step(self, physics, random_state): ...
    def should_terminate_episode(self, physics): ...
    def get_discount(self, physics): ...
    def get_reward(self, physics) -> float: ...
    def action_spec(self, physics): ...
    name: property  # abstract
    root_entity: property

class MultiClipMocapTracking(ReferencePosesTask):
    def __init__(self, walker, arena, ref_path, ref_steps, dataset, ...): ...
    def after_step(self, physics, random_state): ...
    def get_normalized_time_in_clip(self, physics): ...
    name: property  # 'MultiClipMocapTracking'

class PlaybackTask(ReferencePosesTask):
    def __init__(self, walker, arena, ref_path, dataset, ...): ...
    def after_step(self, physics, random_state): ...
    def get_reward(self, physics): ...  # returns 0.0
    name: property  # 'PlaybackTask'
Import
from dm_control.locomotion.tasks.reference_pose import tracking
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| walker | Callable[..., Walker] | Yes | Constructor for the walker to be used in the task |
| arena | composer.Arena | Yes | The arena providing the ground and scene structure |
| ref_path | str | Yes | Path to the HDF5 dataset containing reference poses |
| ref_steps | Sequence[int] | Yes | Indices of future reference steps to include in observations (e.g., (1, 2, 3)) |
| dataset | str or ClipCollection | Yes | A ClipCollection instance or a named dataset key |
| termination_error_threshold | float | No | Error threshold for episode termination (default: 0.3) |
| prop_termination_error_threshold | float | No | Error threshold for prop position termination (default: 0.1) |
| min_steps | int | No | Minimum number of steps per episode (default: 10) |
| reward_type | str | No | Reward function name from rewards.py (default: 'termination_reward') |
| physics_timestep | float | No | Physics simulation timestep (default: 0.005) |
| always_init_at_clip_start | bool | No | Only start episodes at clip beginning (default: False) |
| ghost_offset | Sequence or None | No | Position offset for ghost walker visualization |
| body_error_multiplier | float | No | Multiplier for body position error in termination (default: 1.0) |
| actuator_force_coeff | float | No | Coefficient for actuator force reward channel (default: 0.015) |
Outputs
| Name | Type | Description |
|---|---|---|
| reward | float | Scalar reward computed by the pluggable reward function |
| discount | float | 0.0 if the episode was cut short because the tracking error exceeded its threshold, 1.0 otherwise |
| should_terminate | bool | True if error exceeds threshold or end of mocap reached |
| reference observations | dict[str, np.ndarray] | Dictionary of reference observations (relative joints, body positions, quaternions, etc.) |
| last_reward_channels | OrderedDict or None | Per-channel reward breakdown |
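The relationship between the termination flag and the discount in the table above can be summarized in a few lines. This is a sketch under the assumption that only an error-triggered stop yields a zero discount, while reaching the end of the clip ends the episode with a discount of 1.0; the function and argument names are hypothetical.

```python
def episode_flags(termination_error, threshold, at_end_of_clip):
    """Return the termination flag and discount for the current step."""
    # "Truncated" here means the tracking error exceeded the threshold.
    truncated = termination_error > threshold
    return {
        "should_terminate": truncated or at_end_of_clip,
        "discount": 0.0 if truncated else 1.0,
    }


tracking_ok = episode_flags(0.1, 0.3, at_end_of_clip=False)
clip_finished = episode_flags(0.1, 0.3, at_end_of_clip=True)
fell_behind = episode_flags(0.5, 0.3, at_end_of_clip=False)
```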
Usage Examples
Basic Usage
from dm_control import composer
from dm_control.locomotion.tasks.reference_pose import tracking
from dm_control.locomotion.tasks.reference_pose import cmu_subsets

# my_walker_constructor, my_arena, and my_policy are placeholders that you
# must supply: a walker constructor, a composer.Arena, and a policy callable.

# Create a multi-clip tracking task
task = tracking.MultiClipMocapTracking(
    walker=my_walker_constructor,
    arena=my_arena,
    ref_path='/path/to/cmu_mocap.hdf5',
    ref_steps=(1, 2, 3, 4, 5),
    dataset=cmu_subsets.LOCOMOTION_SMALL,
    termination_error_threshold=0.3,
    reward_type='termination_reward',
    physics_timestep=0.005,
)

# Create a Composer environment
env = composer.Environment(task)
timestep = env.reset()

# Run an episode
while not timestep.last():
    action = my_policy(timestep.observation)
    timestep = env.step(action)

# For visualization/debugging, use PlaybackTask (it takes no ref_steps)
playback = tracking.PlaybackTask(
    walker=my_walker_constructor,
    arena=my_arena,
    ref_path='/path/to/cmu_mocap.hdf5',
    dataset=cmu_subsets.WALK_TINY,
)