Principle:Danijar Dreamerv3 Process Spawning

Knowledge Sources	DreamerV3
Domains	Reinforcement_Learning, Distributed_Systems
Last Updated	2026-02-15 09:00 GMT

Overview

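A distributed process-orchestration pattern that runs actor inference, learner training, replay management, logging, and environment stepping as separate workers connected via RPC. Replay, logging, and each environment get their own OS process, while the actor and learner run as threads inside a single agent process.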

Description

Process Spawning in DreamerV3 implements a multi-process architecture where each computational role runs in its own process, or in its own thread within a shared process:

Agent Process: Contains both actor (inference) and learner (training) threads sharing a single agent object
Replay Process: Manages replay buffers and data streams, enforcing rate limits
Logger Process: Aggregates metrics from all other processes
Environment Processes: One per environment, sending observations to the actor and receiving actions
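The agent process is the one place in the list above where two roles share state rather than talk over RPC. A minimal sketch of that arrangement follows, using only the standard library. The Agent class, its lock, and the version counter are hypothetical stand-ins for illustration and are not taken from the DreamerV3 code.

```python
import threading


class Agent:
    """Hypothetical shared agent: a parameter version guarded by a lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.version = 0

    def policy(self, obs):
        # Inference reads the current parameters under the lock.
        with self.lock:
            return {'action': obs, 'version': self.version}

    def train(self):
        # Training updates the parameters under the same lock.
        with self.lock:
            self.version += 1


def actor_loop(agent, observations, out):
    for obs in observations:
        out.append(agent.policy(obs))


def learner_loop(agent, steps):
    for _ in range(steps):
        agent.train()


def run_agent_process(observations, train_steps):
    # Both threads receive the same agent object, so the actor sees
    # parameter updates without any serialization or RPC.
    agent = Agent()
    out = []
    threads = [
        threading.Thread(target=actor_loop, args=(agent, observations, out)),
        threading.Thread(target=learner_loop, args=(agent, train_steps)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return agent, out
```

Which parameter version each action was computed with depends on thread scheduling, so the sketch only guarantees that every update is applied and every observation is answered.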

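All cross-process communication uses the portal library's RPC mechanism (Server/Client/BatchServer). Factory functions are serialized via cloudpickle and deserialized in their target processes, so each process constructs its own component (agent, replay buffer, logger, or environment) locally instead of receiving a live object. Network addresses are resolved automatically by picking free ports.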
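The two mechanisms in this paragraph, free-port resolution and factory serialization, can be sketched with the standard library alone. This is an illustration under assumptions: pickle stands in for cloudpickle (which, unlike pickle, can also serialize lambdas and closures by value), the '{auto}' placeholder in the address template is assumed for the example, and find_free_port, resolve_addr, and make_replay are hypothetical names.

```python
import functools
import pickle
import socket


def find_free_port():
    # Binding to port 0 asks the OS to pick an unused port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(('127.0.0.1', 0))
        return sock.getsockname()[1]


def resolve_addr(template):
    # '{auto}' is an assumed placeholder; explicit addresses pass through.
    if '{auto}' in template:
        return template.format(auto=find_free_port())
    return template


# A factory is a zero-argument callable that builds a component.
# A plain dict stands in for the replay buffer here.
make_replay = functools.partial(dict, kind='replay', capacity=1000)

payload = pickle.dumps(make_replay)  # done once in the launcher
replay = pickle.loads(payload)()     # done inside the target process
```

Note that a port found this way is released before the real server binds it, so another program could claim it in between. The sketch ignores that race.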

This architecture enables scaling: environments can run on CPU-only machines while the agent uses GPU, the replay buffer can be on a high-memory machine, and the logger runs independently.
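Placement is flexible because an environment only needs the actor's network address, not a reference to the agent. The sketch below shows that request and response loop with the standard library's multiprocessing.connection module standing in for the portal RPC layer. The message format and the function names are assumptions for illustration, and a thread plays the actor so the example stays self-contained.

```python
import threading
from multiprocessing.connection import Client, Listener


def actor_server(listener, num_steps):
    # Accept one environment connection and answer each observation.
    with listener.accept() as conn:
        for _ in range(num_steps):
            obs = conn.recv()
            conn.send({'action': obs['step'] % 2})


def env_client(address, num_steps):
    # Stand-in environment loop: send an observation, wait for the action.
    actions = []
    with Client(address) as conn:
        for step in range(num_steps):
            conn.send({'step': step})
            actions.append(conn.recv()['action'])
    return actions


def run_demo(num_steps=4):
    # Port 0 lets the OS choose a free port; listener.address reports it.
    listener = Listener(('127.0.0.1', 0))
    server = threading.Thread(target=actor_server, args=(listener, num_steps))
    server.start()
    actions = env_client(listener.address, num_steps)
    server.join()
    listener.close()
    return actions
```

Because the client is configured with nothing but an address, the same loop would work with the environment on a different machine from the actor.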

Usage

Use this principle when config.script == 'parallel'. It replaces the single-process training loop with a distributed version. Individual processes (parallel_env, parallel_replay) can also be launched as separate jobs for remote execution.
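One way to picture the script selection is a lookup from script name to the roles a job starts. The table below is a hypothetical sketch built only from the script names mentioned above. It is not the project's actual dispatch code, and which roles the combined launcher starts locally may depend on configuration not covered here.

```python
# Hypothetical mapping from config.script to the roles launched by that job.
ROLES_BY_SCRIPT = {
    'parallel': ('agent', 'replay', 'logger', 'env'),
    'parallel_env': ('env',),
    'parallel_replay': ('replay',),
}


def roles_for(script):
    # Fail loudly on unknown scripts instead of silently doing nothing.
    try:
        return ROLES_BY_SCRIPT[script]
    except KeyError:
        raise ValueError(f'unknown script: {script!r}') from None
```

A remote environment job would then use 'parallel_env' and be pointed at the actor address of the main job.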

Theoretical Basis

Pseudo-code Logic:

# Abstract algorithm
# Serialize all factory functions
factories = {name: cloudpickle.dumps(fn) for name, fn in factories.items()}

# Resolve network addresses
actor_addr = find_free_port()
replay_addr = find_free_port()
logger_addr = find_free_port()

# Spawn processes
processes = [
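    # The agent serves the actor endpoint and talks to replay and logger.
    Process(agent_fn, [actor_thread, learner_thread], actor_addr, replay_addr, logger_addr),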
    Process(replay_fn, replay_addr),
    Process(logger_fn, logger_addr),
    *[Process(env_fn, i, actor_addr) for i in range(num_envs)],
]
run_until_any_fails(processes)
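The final supervision step can be sketched in runnable form. This version is an assumption-laden illustration: threads stand in for OS processes so the example stays self-contained, a shared stop event replaces process termination, and the worker functions are hypothetical.

```python
import concurrent.futures as cf
import threading


def run_until_any_fails(workers):
    """Run all workers; on the first failure, stop the rest and re-raise."""
    stop = threading.Event()
    with cf.ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(fn, stop) for fn in workers]
        # Returns as soon as one worker raises, or when all have finished.
        cf.wait(futures, return_when=cf.FIRST_EXCEPTION)
        stop.set()  # ask the remaining workers to exit
    for future in futures:
        future.result()  # re-raises the stored exception, if any


def healthy(stop):
    # Stand-in for a long-running role such as replay or logging.
    while not stop.wait(0.01):
        pass
    return 'stopped'


def failing(stop):
    # Stand-in for a role that crashes, such as a broken environment.
    raise RuntimeError('env crashed')
```

Calling run_until_any_fails([healthy, healthy, failing]) raises the RuntimeError from the failing worker after the healthy ones have been signaled to stop. Stopping everything on the first failure keeps a crashed environment or replay server from leaving the other roles blocked on RPC calls that will never be answered.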

Related Pages

Implemented By

Implementation:Danijar_Dreamerv3_Combined_Launcher
