Principle:Danijar Dreamerv3 Environment Construction

Knowledge Sources	Mastering Diverse Domains through World Models, DreamerV3
Domains	Reinforcement_Learning, Environment
Last Updated	2026-02-15 09:00 GMT

Overview

A factory pattern for constructing and wrapping diverse RL environments behind a unified interface, enabling a single agent to operate across fundamentally different domains.

Description

Environment Construction in DreamerV3 uses a registry-based factory to instantiate environment objects from a task string of the form suite_taskname (e.g., atari_pong, dmc_walker_walk, crafter_reward). The suite is the part before the first underscore, so dmc_walker_walk resolves to suite dmc and task walker_walk. The factory maps the suite name to a constructor class, instantiates the environment, and then applies a standard chain of wrappers (action normalization, dtype unification, space checking, action clipping), so that every environment exposes the same interface regardless of its underlying API (Gym, DeepMind, custom).

This lets a single RL algorithm run across 150+ tasks spanning Atari, DeepMind Control Suite, Crafter, DMLab, Minecraft, ProcGen, BSuite, and custom environments, without any environment-specific code in the agent.

Usage

Use this principle whenever creating environment instances for training, evaluation, or distributed data collection. It always comes second, immediately after configuration loading, and it produces the obs_space and act_space dictionaries that define the agent's interface.
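
To make the role of obs_space and act_space more concrete, here is a minimal sketch of what such dictionaries might look like and how an agent could read its interface from them. The Space dataclass, the key names, and the summarize helper are all hypothetical illustrations rather than the library's actual classes or keys.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Space:
    """Hypothetical description of one named input or output (not the library's class)."""
    dtype: str
    shape: tuple = ()
    discrete: bool = False


# Illustrative spaces for an image-based task with one discrete action.
# The key names are assumptions chosen for this example.
obs_space = {
    "image": Space("uint8", (64, 64, 3)),
    "reward": Space("float32"),
    "is_first": Space("bool"),
    "is_last": Space("bool"),
}
act_space = {
    "action": Space("int32", (), discrete=True),
}


def summarize(obs_space, act_space):
    """Derive what an agent needs to know purely from the two dictionaries."""
    # Number of scalar values per observation key (an empty shape counts as 1).
    input_sizes = {name: prod(space.shape) for name, space in obs_space.items()}
    # Whether each action key needs a discrete or a continuous output head.
    output_kinds = {
        name: "discrete" if space.discrete else "continuous"
        for name, space in act_space.items()
    }
    return input_sizes, output_kinds


input_sizes, output_kinds = summarize(obs_space, act_space)
print(input_sizes["image"], output_kinds["action"])  # 12288 discrete
```

Because the agent only reads these dictionaries, swapping in a different environment changes the shapes it sees but not the agent code.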

Theoretical Basis

Environment construction follows a registry-based factory pattern:

Pseudo-code Logic:

# Abstract algorithm
suite, task = config.task.split("_", 1)  # "atari_pong" -> ("atari", "pong"); split at the first underscore only
constructor = REGISTRY[suite]            # Look up the environment class for this suite
env = constructor(task, **suite_config)  # Instantiate with suite-specific settings
env = apply_wrappers(env, config)        # Normalize the interface
# env now exposes: obs_space, act_space, step(action) -> obs
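
The pseudo-code above can be turned into a small runnable sketch. The version below covers only parsing, registry lookup, and instantiation; wrapper application is left out. ToyAtari, ToyControl, the REGISTRY contents, and the make_env signature are hypothetical stand-ins chosen for illustration, not the project's actual classes or function.

```python
class ToyAtari:
    """Hypothetical stand-in for an Atari environment class."""

    def __init__(self, task, size=(64, 64)):
        self.task, self.size = task, size


class ToyControl:
    """Hypothetical stand-in for a continuous-control environment class."""

    def __init__(self, task, repeat=2):
        self.task, self.repeat = task, repeat


# Registry: suite name -> constructor. Supporting a new suite means adding an entry here.
REGISTRY = {"atari": ToyAtari, "dmc": ToyControl}


def make_env(task_string, suite_configs=None, **overrides):
    """Build an environment from a 'suite_taskname' string."""
    # Split at the first underscore only, so task names may contain underscores.
    suite, sep, task = task_string.partition("_")
    if not sep or not task:
        raise ValueError(f"Expected 'suite_taskname', got {task_string!r}")
    if suite not in REGISTRY:
        raise KeyError(f"Unknown suite {suite!r}; known suites: {sorted(REGISTRY)}")
    # Per-suite settings from the config, with explicit overrides taking priority.
    kwargs = dict((suite_configs or {}).get(suite, {}))
    kwargs.update(overrides)
    return REGISTRY[suite](task, **kwargs)


env = make_env("dmc_walker_walk", {"dmc": {"repeat": 4}})
print(type(env).__name__, env.task, env.repeat)  # ToyControl walker_walk 4
```

Keeping the suite-to-class mapping in one dictionary means the calling code never branches on the suite name, which is what allows the agent to stay environment-agnostic.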

The wrapper chain ensures:

NormalizeAction: Continuous actions mapped to [-1, 1]
UnifyDtypes: Consistent observation dtypes
CheckSpaces: Runtime validation of obs/act shapes
ClipAction: Clamp continuous actions to valid range
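
As a rough illustration of how two of these wrappers could compose, the sketch below implements simplified versions of NormalizeAction and ClipAction around a toy environment. It assumes scalar actions and a hypothetical ToyEnv with a native action range of [0, 10]; the class bodies are written for this example and are not the library's implementations. UnifyDtypes and CheckSpaces are omitted for brevity.

```python
class ToyEnv:
    """Hypothetical environment with one continuous action in [0, 10]."""

    low, high = 0.0, 10.0

    def step(self, action):
        # Echo the action that actually reached the environment.
        return {"applied": action["action"]}


class Wrapper:
    """Forward any attribute not defined on the wrapper to the wrapped env."""

    def __init__(self, env):
        self.env = env

    def __getattr__(self, name):
        return getattr(self.env, name)


class NormalizeAction(Wrapper):
    """Let the agent act in [-1, 1] and rescale to the env's native range."""

    def __init__(self, env, key="action"):
        super().__init__(env)
        self.key = key
        self.native_low, self.native_high = env.low, env.high
        self.low, self.high = -1.0, 1.0  # range exposed to the agent

    def step(self, action):
        unit = (action[self.key] + 1.0) / 2.0  # [-1, 1] -> [0, 1]
        native = unit * (self.native_high - self.native_low) + self.native_low
        return self.env.step({**action, self.key: native})


class ClipAction(Wrapper):
    """Clamp the action to [low, high] before passing it on."""

    def __init__(self, env, key="action", low=-1.0, high=1.0):
        super().__init__(env)
        self.key, self.low, self.high = key, low, high

    def step(self, action):
        clipped = min(max(action[self.key], self.low), self.high)
        return self.env.step({**action, self.key: clipped})


# ClipAction is outermost, so out-of-range actions are clamped before rescaling.
env = ClipAction(NormalizeAction(ToyEnv()))
print(env.step({"action": 0.0})["applied"])  # 5.0
print(env.step({"action": 3.0})["applied"])  # 10.0
```

With this ordering the agent can emit any real number: values outside [-1, 1] are clamped first, and the result is then mapped onto the environment's native range.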

Related Pages

Implemented By

Implementation:Danijar_Dreamerv3_Make_Env
