Workflow: Google DeepMind dm_control Composer Environment Building

Knowledge Sources	dm_control Composer API
Domains	Reinforcement_Learning, Environment_Design, Physics_Simulation
Last Updated	2026-02-15 12:00 GMT

Overview

End-to-end process for building custom reinforcement learning environments using the dm_control Composer framework by assembling reusable Entity, Arena, and Task components.

Description

This workflow covers the standard procedure for creating rich, customizable RL environments using the Composer framework. Composer provides a higher-level abstraction over raw MuJoCo simulation, enabling modular environment construction from three core building blocks: Entities (physical objects with observables and lifecycle hooks), Arenas (the physical world containing entities), and Tasks (reward functions, termination conditions, and episode logic). The framework handles MJCF model composition, physics compilation, observation buffering, domain randomization, and the dm_env interface automatically. The output is a fully functional dm_env-compatible Environment suitable for RL training.

Usage

Execute this workflow when you need to create a custom RL environment beyond the pre-built Control Suite, combining specific walkers, arenas, props, and task logic. This is the standard approach for building environments with complex observation spaces, multi-rate observations, domain randomization, or custom reward functions.

Execution Steps

Step 1: Define or Select an Entity

Create or select an Entity, the fundamental building block representing any physical object in the environment. An Entity wraps an MJCF model and exposes observables (joint positions, velocities, camera images), lifecycle hooks (initialize_episode, before_step, after_step), and attachment points for composing with other entities. For robotic agents, use the Robot subclass, which adds actuator management.

Key considerations:

Entity is the abstract base class; Robot extends it for actuated agents
Each Entity owns an MJCF model accessible via mjcf_model property
Observables are defined as MJCFFeature, MujocoFeature, or Generic callables
Entities can be attached to other entities via MJCF site attachment points
The @composer.define.cached_property decorator (also exported as composer.cached_property) enables lazy, cached creation of MJCF elements

Step 2: Define or Select an Arena

Create or select an Arena, a specialized Entity that serves as the root of the environment. The Arena provides the ground plane, global lighting, skybox, simulation settings, and attachment points for other entities. Built-in arenas include flat floors, corridors (empty, gaps, walls), bowl-shaped terrains, and procedurally generated mazes.

Key considerations:

Arena extends Entity and serves as the root of the MJCF model hierarchy
Built-in arenas: Floor, EmptyCorridor, GapsCorridor, WallsCorridor, Bowl, RandomMazeWithTargets
Arenas define the physical boundaries and visual appearance of the world
Custom arenas can add cameras, lights, and terrain features
The add_free_entity method attaches entities with a freejoint for free-floating objects

Step 3: Define the Task

Create a Task subclass that specifies the reward function, termination conditions, episode initialization, observables configuration, and timestep management. The Task connects the walker (agent) to the arena, defines what the agent should optimize, and manages the episode lifecycle through hooks.

Key considerations:

Task is the abstract base class; subclasses must implement root_entity and get_reward, while should_terminate_episode is optional and defaults to never terminating early
The root_entity property returns the arena (root of the MJCF tree)
initialize_episode_mjcf is called before physics compilation for domain randomization
initialize_episode is called after compilation for physics-state initialization
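Tasks expose control_timestep and physics_timestep separately (both can be set via set_timesteps); the control timestep must be an integer multiple of the physics timestep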
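
The relationship between the two timesteps and the order of the episode hooks can be illustrated without the library. The following is a standalone sketch, not Composer's implementation: the function names are hypothetical, the bracketed entries stand for internal work rather than hooks, and the integer-multiple requirement is stated here as an assumption about how Composer relates the two timesteps.

```python
def physics_steps_per_control_step(control_timestep, physics_timestep,
                                   tol=1e-6):
    """Returns how many physics steps fit in one control step."""
    ratio = control_timestep / physics_timestep
    steps = int(round(ratio))
    if steps < 1 or abs(ratio - steps) > tol:
        raise ValueError(
            f"control_timestep ({control_timestep}) must be a positive "
            f"integer multiple of physics_timestep ({physics_timestep}).")
    return steps


def hook_order(num_control_steps, substeps):
    """Lists the call order for one episode, using the hook names above."""
    calls = ["initialize_episode_mjcf", "<compile physics>",
             "initialize_episode"]
    for _ in range(num_control_steps):
        calls.append("before_step")
        calls.extend(["<physics step>"] * substeps)
        calls.append("after_step")
        calls.extend(["get_reward", "should_terminate_episode"])
    return calls


substeps = physics_steps_per_control_step(
    control_timestep=0.02, physics_timestep=0.005)
for name in hook_order(num_control_steps=1, substeps=substeps):
    print(name)
```

With a 20 ms control timestep and a 5 ms physics timestep, each agent action is held for four physics steps before the reward is computed.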

Step 4: Configure Observables

Enable and configure the observables that will be exposed to the RL agent. Observables support multi-rate updates (different observation frequencies), buffering with configurable delays, and aggregation functions. Each Entity defines its available observables; the Task selects which ones to enable and their update rates.

Key considerations:

Observables are enabled/disabled per-entity via observable_options
Multi-rate observation supports different update frequencies for different sensors
Buffer sizes and delays can simulate realistic sensor latencies
Aggregation functions (e.g., mean, max) summarize buffered observations
Camera observables provide rendered pixel arrays at configurable resolutions
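
To make the buffering options concrete, here is a simplified, library-independent model of a multi-rate observable. It is a sketch under stated assumptions rather than Composer's updater: the BufferedObservable class is hypothetical, and the delay is counted in samples for simplicity.

```python
from collections import deque
from statistics import mean


class BufferedObservable:
    """Toy model of an observable with update rate, buffer, and delay."""

    def __init__(self, read, update_interval=1, buffer_size=1, delay=0,
                 aggregator=None):
        self._read = read
        self._update_interval = update_interval
        self._buffer_size = buffer_size
        self._delay = delay
        self._aggregator = aggregator
        # Keep enough history to serve a full buffer after the delay.
        self._history = deque(maxlen=buffer_size + delay)

    def on_physics_step(self, step_index):
        # Sample the sensor only every `update_interval` physics steps.
        if step_index % self._update_interval == 0:
            self._history.append(self._read())

    def observe(self):
        samples = list(self._history)
        if self._delay:
            samples = samples[:-self._delay]  # Drop the newest samples.
        window = samples[-self._buffer_size:]
        return self._aggregator(window) if self._aggregator else window


clock = {"t": 0}
fast = BufferedObservable(lambda: clock["t"], update_interval=1,
                          buffer_size=4, aggregator=mean)
slow = BufferedObservable(lambda: clock["t"], update_interval=4,
                          buffer_size=1, delay=1)

for step in range(1, 9):
    clock["t"] = step
    fast.on_physics_step(step)
    slow.on_physics_step(step)

print(fast.observe())  # Mean of the last four samples.
print(slow.observe())  # One sample, lagging by one update.
```

The fast sensor reports a smoothed value over its most recent samples, while the slow sensor updates every fourth step and returns a stale reading, which is the kind of latency the buffer and delay settings are meant to emulate.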

Step 5: Apply Domain Randomization

Optionally configure variations to randomize environment parameters at each episode reset. The variation subsystem provides statistical distributions (Uniform, Normal, LogNormal), color randomizers (RGB, HSV, Grayscale), rotation randomizers (quaternion sampling), and noise injectors (Additive, Multiplicative). Variations bound to MJCF attributes are applied during initialize_episode_mjcf, before physics compilation; variations bound to physics state are applied during initialize_episode, after compilation.

Key considerations:

Variations implement the Variation base class with a __call__ protocol
MJCFVariator and PhysicsVariator apply variations to MJCF attributes and physics state
Distributions wrap numpy random state for reproducible randomization
VariationBroadcaster shares a single sample across multiple consumers
Variations compose algebraically via operator overloading (+, *, -, /)
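
The algebraic composition mentioned above can be sketched in plain Python. This is an illustrative stand-in, not the library's code: the classes below are hypothetical, only loosely mirror the callable protocol described in the text, and use the standard library's random module in place of a numpy random state.

```python
import operator
import random


class Variation:
    """Minimal stand-in: a callable that yields a new value on each call."""

    def __call__(self, initial_value=None, current_value=None,
                 random_state=None):
        raise NotImplementedError

    def __add__(self, other):
        return BinaryOperation(operator.add, self, other)

    def __radd__(self, other):
        return BinaryOperation(operator.add, other, self)

    def __sub__(self, other):
        return BinaryOperation(operator.sub, self, other)

    def __mul__(self, other):
        return BinaryOperation(operator.mul, self, other)

    def __rmul__(self, other):
        return BinaryOperation(operator.mul, other, self)

    def __truediv__(self, other):
        return BinaryOperation(operator.truediv, self, other)


def evaluate(value, **kwargs):
    """Samples `value` if it is a Variation, else returns it unchanged."""
    return value(**kwargs) if isinstance(value, Variation) else value


class BinaryOperation(Variation):
    """Defers an arithmetic operation until the variation is sampled."""

    def __init__(self, op, first, second):
        self._op, self._first, self._second = op, first, second

    def __call__(self, initial_value=None, current_value=None,
                 random_state=None):
        kwargs = dict(initial_value=initial_value,
                      current_value=current_value,
                      random_state=random_state)
        return self._op(evaluate(self._first, **kwargs),
                        evaluate(self._second, **kwargs))


class Uniform(Variation):
    """Samples uniformly from [low, high]."""

    def __init__(self, low, high):
        self._low, self._high = low, high

    def __call__(self, initial_value=None, current_value=None,
                 random_state=None):
        rng = random_state if random_state is not None else random
        return rng.uniform(self._low, self._high)


# Build a composite variation once, then sample it at every reset.
friction_scale = 2.0 * Uniform(0.5, 1.5) + 0.1
sample = friction_scale(random_state=random.Random(0))
print(sample)
```

Because the arithmetic is deferred, the expression is re-sampled each time it is called, and passing the same seeded random state reproduces the same value.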

Step 6: Assemble and Launch the Environment

Instantiate the composer.Environment with the configured Task, time limit, and random state. The Environment handles MJCF model compilation to MuJoCo physics, the reset/step loop, observation collection via the Updater, and the dm_env interface. Optionally launch the interactive viewer for visualization and debugging.

Key considerations:

composer.Environment wraps the Task and manages the simulation lifecycle
strip_singleton_obs_buffer_dim removes unnecessary buffer dimensions for single-step observations
By default the environment recompiles physics from the MJCF model at every reset (controlled by recompile_mjcf_every_episode), so MJCF changes made during reset take effect
Random state can be seeded for reproducible episodes
The viewer.launch function accepts the environment for interactive visualization
