Heuristic:Google deepmind Dm control Physics Timestep Configuration

Knowledge Sources	dm_control engine Tassa et al. 2020
Domains	Physics_Simulation, Reinforcement_Learning, Optimization
Last Updated	2026-02-15 05:00 GMT

Overview

Configuration rules for physics and control timesteps, including the legacy step behavior and integrator-specific stepping patterns.

Description

dm_control separates the physics timestep (simulation resolution, typically 0.002 s) from the control timestep (how often the agent chooses an action). The control timestep must be an integer multiple of the physics timestep. dm_control also implements a "legacy step" mode that keeps (most of) the mjData fields synchronized with qpos/qvel after each step; the stepping pattern it uses depends on the integrator (Euler vs RK4). Both rules determine whether the observations an agent receives match the simulated state.

Usage

Apply this heuristic when configuring a new environment, debugging stale sensor readings, or migrating from `n_sub_steps` to `control_timestep`. It is also relevant when physics behaves unexpectedly with non-Euler integrators.

The Insight (Rule of Thumb)

Action 1: Set control timestep as an integer multiple of physics timestep using the Task's `control_timestep` property.
Value: Default physics timestep is 0.002s (2ms). For 25Hz control, use `control_timestep = 0.04` (20 physics substeps).
Action 2: Do NOT use the deprecated `n_sub_steps` parameter. Use `task.control_timestep` instead.
Action 3: When using legacy step mode, be aware that the stepping pattern depends on the integrator. With RK4, `mj_step` advances all `nstep` substeps; otherwise the pending step is finished with `mj_step2` and the remaining `nstep-1` substeps are advanced with `mj_step`. In both cases a final `mj_step1` call syncs the position- and velocity-dependent fields.
Trade-off: Legacy step mode adds overhead from the extra `mj_step1` call but keeps sensor readings and derived quantities consistent with the current state. Non-legacy mode is simpler (a plain `mj_step`) but may have slightly different sensor timing.
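
The arithmetic behind Action 1 can be sketched as follows. This is a minimal standalone illustration, not part of dm_control: the helper name `control_timestep_for_rate` is hypothetical, and it assumes the default 0.002 s physics timestep mentioned above. Because the control timestep is built from a whole substep count, it is an integer multiple by construction; rates that do not divide evenly are snapped to the nearest achievable value.

```python
def control_timestep_for_rate(rate_hz: float, physics_timestep: float = 0.002):
    """Return (n_substeps, control_timestep) closest to the requested rate.

    Hypothetical helper for illustration only; not a dm_control function.
    """
    if rate_hz <= 0 or physics_timestep <= 0:
        raise ValueError("rate_hz and physics_timestep must be positive")
    # Whole number of physics steps that best approximates one control period.
    n_substeps = max(1, round(1.0 / (rate_hz * physics_timestep)))
    # Deriving the control timestep from the substep count keeps it a multiple.
    return n_substeps, n_substeps * physics_timestep


n_substeps, control_timestep = control_timestep_for_rate(25.0)
print(n_substeps, control_timestep)  # 20 substeps, about 0.04 s
```

The resulting value is what one would assign as the control timestep. A request such as 30 Hz cannot be met exactly with a 0.002 s physics timestep and is rounded to 17 substeps.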

Reasoning

The legacy step behavior exists because MuJoCo's `mj_step` leaves some derived fields (accelerations, contacts, sensor data) computed from the previous state. The extra `mj_step1` call recomputes position/velocity-dependent fields without advancing time, ensuring observation consistency. This is documented in Tassa et al. 2020 (arXiv:2006.12983, page 6) and is critical for RL agents that rely on accurate sensor readings at each control step.

The integer-multiple constraint ensures that the control loop executes a whole number of physics steps per decision, preventing fractional stepping artifacts.
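
A check of this constraint might look like the sketch below. It is a standalone illustration rather than dm_control's own code; the function name `compute_substeps` and the tolerance value are assumptions chosen for the example. The tolerance absorbs floating-point error in the division, so 0.04 / 0.002 is accepted while 0.005 / 0.002 is rejected.

```python
def compute_substeps(control_timestep: float, physics_timestep: float,
                     tolerance: float = 1e-8) -> int:
    """Return the number of physics steps per control step, or raise.

    Hypothetical validator for illustration; the tolerance is an assumption.
    """
    if control_timestep < physics_timestep:
        raise ValueError("control timestep is shorter than the physics timestep")
    ratio = control_timestep / physics_timestep
    nearest = round(ratio)
    # Reject ratios that are not whole numbers up to floating-point error.
    if abs(ratio - nearest) > tolerance:
        raise ValueError(
            f"control timestep {control_timestep} is not an integer multiple "
            f"of physics timestep {physics_timestep}")
    return int(nearest)


print(compute_substeps(0.04, 0.002))  # 20
```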

Code evidence from `mujoco/engine.py:147-162`:

def _step_with_up_to_date_position_velocity(self, nstep: int = 1) -> None:
    # In the case of Euler integration we assume mj_step1 has already been
    # called for this state, finish the step with mj_step2 and then update
    # all position and velocity related fields with mj_step1.
    # This ensures that (most of) mjData is in sync with qpos and qvel.
    # In the case of non-Euler integrators (e.g. RK4) an additional
    # mj_step1 must be called after the last mj_step.
    if self.model.opt.integrator != mujoco.mjtIntegrator.mjINT_RK4.value:
        mujoco.mj_step2(self.model.ptr, self.data.ptr)
        if nstep > 1:
            mujoco.mj_step(self.model.ptr, self.data.ptr, nstep-1)
    else:
        mujoco.mj_step(self.model.ptr, self.data.ptr, nstep)
    mujoco.mj_step1(self.model.ptr, self.data.ptr)
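
To make the two branches easier to compare, the sketch below reproduces only the order of MuJoCo calls from the excerpt above, without running any physics. The function `legacy_step_calls` and the string integrator labels are hypothetical, and each `mj_step(..., n)` call is expanded into `n` entries for readability.

```python
from typing import List


def legacy_step_calls(integrator: str, nstep: int = 1) -> List[str]:
    """List the MuJoCo calls made by one legacy step (illustrative only)."""
    calls: List[str] = []
    if integrator != "RK4":
        # mj_step1 is assumed to have run already for the current state,
        # so the first substep only needs to be finished with mj_step2.
        calls.append("mj_step2")
        calls.extend(["mj_step"] * (nstep - 1))
    else:
        calls.extend(["mj_step"] * nstep)
    # Both branches end by refreshing position/velocity-dependent fields.
    calls.append("mj_step1")
    return calls


print(legacy_step_calls("Euler", 3))  # ['mj_step2', 'mj_step', 'mj_step', 'mj_step1']
print(legacy_step_calls("RK4", 3))    # ['mj_step', 'mj_step', 'mj_step', 'mj_step1']
```

Both sequences advance the state by `nstep` physics steps and end with `mj_step1`; the difference lies in how the first substep is completed.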

Observation update timing from `composer/environment.py:424-429` (approximate):

# Final observation update must happen AFTER after_step hooks.
# Otherwise hooks modifying physics result in inconsistent observations.
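
The ordering issue described in that comment can be illustrated with a toy model. Everything here is hypothetical (the dictionary state, `after_step_hook`, `observe`, and `control_step` are stand-ins, not composer APIs); the sketch only shows why an observation taken before a state-modifying hook no longer matches the final state.

```python
def after_step_hook(state):
    # Hypothetical hook that modifies the physics state after stepping.
    state["x"] = 0.0


def observe(state):
    # Snapshot of the state at the moment of observation.
    return {"x": state["x"]}


def control_step(state, observe_last: bool):
    state["x"] += 1.0  # stand-in for the physics substeps
    if observe_last:
        after_step_hook(state)
        obs = observe(state)
    else:
        obs = observe(state)
        after_step_hook(state)
    return obs, state


obs_ok, state_ok = control_step({"x": 5.0}, observe_last=True)
obs_bad, state_bad = control_step({"x": 5.0}, observe_last=False)
print(obs_ok == state_ok)    # True: observation matches the final state
print(obs_bad == state_bad)  # False: observation predates the hook
```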
