Implementation:Google deepmind Dm control MuJoCo Profiling Wrapper

From Leeroopedia
Revision as of 12:43, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Google_deepmind_Dm_control_MuJoCo_Profiling_Wrapper.md)
Knowledge Sources
Domains: Reinforcement Learning, Performance Profiling, Physics Simulation
Last Updated: 2026-02-15 04:00 GMT

Overview

The MuJoCo Profiling Wrapper adds MuJoCo step-timing profiling data as an additional observation to any dm_control environment, enabling performance analysis of physics simulation steps.

Description

The Wrapper class implements the dm_env.Environment interface and wraps an existing dm_control environment to inject profiling information into observations. At initialization, it enables profiling on the underlying physics engine and extends the observation specification with a 2-element float64 array containing the cumulative step duration (in seconds) and the number of times the step timer was called.

The wrapper handles both dict-based and array-based observation formats. For array-based observations, it converts them to an OrderedDict with the original observation stored under the key 'state'. The profiling data is read from physics.data.timer[0], which corresponds to MuJoCo's step timer (as defined in mujoco/include/mjdata.h). The profiling observation is appended on every reset() and step() call.
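The observation handling described above can be sketched with the standard library alone. This is an illustrative sketch, not the wrapper's actual code: the function name `add_profile_observation` and its `step_timer` argument (standing in for the `physics.data.timer[0]` entry) are hypothetical, and a plain tuple is used where the real wrapper returns a float64 NumPy array.

```python
from collections import OrderedDict
from collections.abc import Mapping

STATE_KEY = 'state'


def add_profile_observation(observation, step_timer, observation_key='step_timing'):
    """Return a new observation dict with the timing pair added (hypothetical helper)."""
    if isinstance(observation, Mapping):
        # Dict-based observation: shallow-copy so the caller's dict is not mutated.
        augmented = OrderedDict(observation)
    else:
        # Array-based observation: move it under the 'state' key.
        augmented = OrderedDict()
        augmented[STATE_KEY] = observation
    duration, count = step_timer
    # Tuple stand-in for the 2-element float64 array described above.
    augmented[observation_key] = (float(duration), float(count))
    return augmented


dict_obs = add_profile_observation({'position': [0.0, 1.0]}, (0.004, 2))
array_obs = add_profile_observation([0.0, 1.0, 0.5], (0.004, 2))
print(list(dict_obs))   # keys of the augmented dict observation
print(list(array_obs))  # keys after an array observation is converted
```

In both cases the timing entry is added last, so existing keys keep their order.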

The wrapper delegates all unrecognized attribute accesses to the underlying environment via __getattr__, making it transparent for most use cases. The profiling observation key defaults to 'step_timing' but can be customized at initialization.
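A minimal sketch of this delegation pattern, assuming the wrapped environment is stored in a `_env` attribute. `FakeEnv` and `DelegatingWrapper` are hypothetical stand-ins, not dm_control classes.

```python
class FakeEnv:
    """Hypothetical stand-in for a wrapped environment."""

    def __init__(self):
        self.physics = 'physics-handle'

    def reset(self):
        return 'inner reset'

    def close(self):
        return 'closed'


class DelegatingWrapper:
    """Overrides reset() and forwards every other attribute to the wrapped env."""

    def __init__(self, env):
        self._env = env

    def reset(self):
        return 'wrapped ' + self._env.reset()

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails, so methods
        # defined on the wrapper itself always take precedence.
        return getattr(self._env, name)


wrapper = DelegatingWrapper(FakeEnv())
print(wrapper.reset())   # handled by the wrapper
print(wrapper.close())   # forwarded to the wrapped environment
print(wrapper.physics)   # attribute access is forwarded as well
```

One consequence of this design is that a misspelled attribute name raises an AttributeError from the wrapped environment rather than from the wrapper.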

Usage

Use this wrapper when you need to measure and analyze the computational cost of MuJoCo simulation steps during RL training, evaluation, or benchmarking. Wrap any existing dm_control environment to add timing observations.

Code Reference

Source Location

Signature

STATE_KEY = 'state'

class Wrapper(dm_env.Environment):
    def __init__(self, env, observation_key='step_timing'): ...
    def reset(self): ...
    def step(self, action): ...
    def observation_spec(self): ...
    def action_spec(self): ...

Import

from dm_control.suite.wrappers import mujoco_profiling

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| env | dm_env.Environment | Yes | The environment to wrap with profiling observations |
| observation_key | str | No | Key name for the profiling observation in the observation dict (default 'step_timing') |
| action | np.ndarray | Yes (for step) | Action to apply to the wrapped environment |

Outputs

| Name | Type | Description |
|------|------|-------------|
| time_step | dm_env.TimeStep | Original time step with augmented observation containing profiling data |
| observation[observation_key] | np.ndarray (shape (2,), dtype float64) | Array of [cumulative_duration, call_count] from MuJoCo's step timer |
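Because the duration is cumulative, a per-step cost is obtained by differencing two readings rather than reading one value directly. The helper below is a sketch under that assumption. `mean_step_time` is a hypothetical name, the sample readings are made-up numbers, and whether the counters are cleared on `reset()` is not stated here, so the helper returns None if the count does not advance.

```python
def mean_step_time(before, after):
    """Average duration per timed step between two [duration, count] readings.

    Returns None when no new steps were timed, or when the counters went
    backwards (for example, if they were cleared between the two readings).
    """
    elapsed = after[0] - before[0]
    steps = after[1] - before[1]
    if steps <= 0:
        return None
    return elapsed / steps


# Made-up readings in the [cumulative_duration, call_count] layout.
first = (0.010, 5.0)
later = (0.025, 10.0)
print(mean_step_time(first, later))
```

In practice `before` and `after` would be the profiling observations taken from two time steps of the wrapped environment.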

Usage Examples

from dm_control import suite
from dm_control.suite.wrappers import mujoco_profiling

# Load and wrap an environment
env = suite.load('cartpole', 'balance')
profiled_env = mujoco_profiling.Wrapper(env)

# Step through the profiled environment
time_step = profiled_env.reset()
print("Profiling data:", time_step.observation['step_timing'])

action = profiled_env.action_spec().generate_value()
time_step = profiled_env.step(action)
# Both entries are float64: cumulative step duration, then timer call count.
duration, count = time_step.observation['step_timing']
print(f"Cumulative step duration: {duration:.6f}s, Timer calls: {int(count)}")

# Use a custom observation key
profiled_env = mujoco_profiling.Wrapper(env, observation_key='perf_data')
