Principle:Danijar Dreamerv3 Environment Construction
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Environment |
| Last Updated | 2026-02-15 09:00 GMT |
Overview
A factory pattern for constructing and wrapping diverse RL environments behind a unified interface, enabling a single agent to operate across fundamentally different domains.
Description
Environment Construction in DreamerV3 uses a registry-based factory to instantiate environment objects from a task string of the form suite_taskname (e.g., atari_pong, dmc_walker_walk, crafter_reward). The factory maps suite names to constructor classes, instantiates the environment, then applies a standard chain of wrappers (action normalization, dtype unification, space checking, action clipping) to ensure all environments expose a consistent interface regardless of their underlying API (Gym, DeepMind, custom).
This design lets a single RL algorithm run across 150+ tasks spanning Atari, DeepMind Control Suite, Crafter, DMLab, Minecraft, ProcGen, BSuite, and custom environments, without any environment-specific code in the agent.
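As a minimal sketch of the suite_taskname convention described above, the string can be split on the first underscore only, so that task names containing underscores (such as walker_walk) stay intact. The helper name parse_task_string and its error handling are illustrative assumptions, not taken from the DreamerV3 codebase.

```python
from typing import Tuple


def parse_task_string(task: str) -> Tuple[str, str]:
    """Split 'suite_taskname' on the first underscore only (hypothetical helper)."""
    suite, sep, name = task.partition("_")
    # Reject strings with no underscore, an empty suite, or an empty task name.
    if not sep or not suite or not name:
        raise ValueError(f"expected 'suite_taskname', got {task!r}")
    return suite, name


for task in ("atari_pong", "dmc_walker_walk", "crafter_reward"):
    print(task, "->", parse_task_string(task))
```

Splitting only once keeps the suite prefix unambiguous while leaving the rest of the string for the suite's own constructor to interpret.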
Usage
Use this principle whenever creating environment instances for training, evaluation, or distributed data collection. It is the second step of the pipeline, immediately after configuration loading, and it produces the obs_space and act_space dictionaries that define the agent's interface.
Theoretical Basis
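Environment construction follows a registry-based factory pattern, a simplified relative of the Abstract Factory pattern in which a lookup table maps each suite name to its constructor: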
Pseudo-code Logic:
# Abstract algorithm
suite, task = parse_task_string(config.task) # "atari_pong" -> ("atari", "pong")
constructor = REGISTRY[suite] # Look up environment class
env = constructor(task, **suite_config) # Instantiate
env = apply_wrappers(env, config) # Normalize interface
# env now exposes: obs_space, act_space, step(action) -> obs
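The pseudo-code above can be made concrete with a small runnable sketch. Everything here is a stand-in chosen for illustration: ToyEnv, the contents of REGISTRY, the make_env name, the repeat keyword, and the tuple-based space descriptions are assumptions, not the actual DreamerV3 classes or configuration keys. Wrapper application is left out to keep the focus on the registry lookup.

```python
from typing import Callable, Dict


class ToyEnv:
    """Stand-in environment; a real suite would wrap Gym, DeepMind, or a custom API."""

    def __init__(self, task: str, **kwargs):
        self.task = task
        self.kwargs = kwargs
        # Placeholder space descriptions as (dtype, shape) pairs.
        self.obs_space = {"image": ("uint8", (64, 64, 3))}
        self.act_space = {"action": ("float32", (2,))}

    def step(self, action: dict) -> dict:
        # Returns a dummy observation dictionary.
        return {"image": None, "reward": 0.0}


# Hypothetical registry: suite name -> constructor.
REGISTRY: Dict[str, Callable[..., ToyEnv]] = {
    "atari": ToyEnv,
    "dmc": ToyEnv,
}


def make_env(task: str, suite_config: dict) -> ToyEnv:
    """Look up the suite's constructor and instantiate it with per-suite kwargs."""
    suite, name = task.split("_", 1)
    if suite not in REGISTRY:
        raise KeyError(f"unknown suite {suite!r}; known: {sorted(REGISTRY)}")
    kwargs = suite_config.get(suite, {})
    return REGISTRY[suite](name, **kwargs)


env = make_env("dmc_walker_walk", {"dmc": {"repeat": 2}})
print(env.task, env.kwargs, sorted(env.obs_space), sorted(env.act_space))
```

Under this arrangement, supporting a new suite means adding one registry entry; the calling code and the agent are unchanged.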
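The wrapper chain provides the following guarantees:
- NormalizeAction: Continuous action spaces are exposed as [-1, 1] and rescaled to each environment's native range
- UnifyDtypes: Observations use consistent dtypes across environments
- CheckSpaces: Observation and action shapes are validated at runtime
- ClipAction: Continuous actions are clamped to the valid range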
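To illustrate how the two action wrappers in the list above compose, here is a simplified sketch using a single scalar action. The classes below only borrow the wrapper names; their constructors, the act_low/act_high attributes, the ScalarEnv toy environment, and its [0, 10] bounds are assumptions made for this example and do not reproduce the library's implementation. UnifyDtypes and CheckSpaces are omitted for brevity.

```python
class ScalarEnv:
    """Toy environment with one continuous action in [0, 10] (assumed bounds)."""

    act_low, act_high = 0.0, 10.0

    def step(self, action: float) -> float:
        # Echo the action actually received so the wrappers' effect is visible.
        return action


class NormalizeAction:
    """Expose actions in [-1, 1] and rescale them to the env's native range."""

    def __init__(self, env):
        self.env = env
        self.act_low, self.act_high = -1.0, 1.0

    def step(self, action: float) -> float:
        low, high = self.env.act_low, self.env.act_high
        native = low + (action + 1.0) / 2.0 * (high - low)
        return self.env.step(native)


class ClipAction:
    """Clamp actions to the wrapped env's advertised range before stepping."""

    def __init__(self, env):
        self.env = env
        self.act_low, self.act_high = env.act_low, env.act_high

    def step(self, action: float) -> float:
        clipped = min(max(action, self.act_low), self.act_high)
        return self.env.step(clipped)


def apply_wrappers(env):
    env = NormalizeAction(env)
    env = ClipAction(env)  # outermost wrapper, so clipping happens first
    return env


env = apply_wrappers(ScalarEnv())
print(env.step(0.0))  # centre of [-1, 1] maps to the centre of the native range
print(env.step(3.0))  # out of range: clipped to 1.0, then rescaled
```

Because ClipAction is applied last, it sits outermost and sees the agent's action first, so out-of-range values are clamped to [-1, 1] before NormalizeAction rescales them.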