Principle:Danijar Dreamerv3 Process Spawning
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Distributed_Systems |
| Last Updated | 2026-02-15 09:00 GMT |
Overview
A distributed process-orchestration pattern that runs the agent (actor inference and learner training), replay management, logging, and environment stepping in separate OS processes connected via RPC.
Description
Process Spawning in DreamerV3 implements a multi-process architecture where each computational role runs in its own process (or thread):
- Agent Process: Contains both actor (inference) and learner (training) threads sharing a single agent object
- Replay Process: Manages replay buffers and data streams, enforcing rate limits
- Logger Process: Aggregates metrics from all other processes
- Environment Processes: One per environment, sending observations to the actor and receiving actions
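The agent process can be pictured as two threads operating on one shared object. The following is a minimal sketch of that idea using only the standard library; `ToyAgent`, its lock, and the loop functions are hypothetical names for illustration and are not taken from the DreamerV3 code.

```python
import threading


class ToyAgent:
    """Hypothetical stand-in for the shared agent object."""

    def __init__(self):
        self.lock = threading.Lock()  # assumption: guard shared state with a lock
        self.version = 0              # incremented by the learner on each update
        self.actions = 0              # number of policy calls served by the actor

    def policy(self, obs):
        with self.lock:
            self.actions += 1
            return obs + self.version  # the action depends on the current "weights"

    def train(self, batch):
        with self.lock:
            self.version += 1


def actor_loop(agent, steps):
    for obs in range(steps):
        agent.policy(obs)


def learner_loop(agent, steps):
    for _ in range(steps):
        agent.train(batch=None)


def run_agent_process(actor_steps=100, learner_steps=20):
    agent = ToyAgent()  # one object, visible to both threads
    threads = [
        threading.Thread(target=actor_loop, args=(agent, actor_steps)),
        threading.Thread(target=learner_loop, args=(agent, learner_steps)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return agent
```

Because both threads hold a reference to the same object, the actor sees learner updates without any copying or RPC, which is the main reason to keep these two roles in one process.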
All cross-process communication uses the portal library's RPC mechanism (Server/Client/BatchServer). Factory functions are serialized via cloudpickle and deserialized in their target processes. Network addresses are auto-resolved using free ports.
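The factory round trip described above can be sketched as follows. This assumes only the documented `cloudpickle.dumps` and `cloudpickle.loads` calls; `make_env`, `launch_side`, and `worker_side` are hypothetical names, and the fallback to the standard `pickle` module exists only so the sketch runs where cloudpickle is not installed.

```python
import functools

try:
    import cloudpickle as pickler  # the library named in the text
except ImportError:
    # Stdlib fallback so the sketch still runs; unlike cloudpickle it cannot
    # serialize lambdas or functions defined interactively.
    import pickle as pickler


def make_env(task, index):
    """Hypothetical environment factory; returns a plain dict for illustration."""
    return {"task": task, "index": index}


def launch_side():
    # Bind the configuration now; build the object later in the target process.
    factory = functools.partial(make_env, "dummy_task")
    return pickler.dumps(factory)  # bytes that can be handed to another process


def worker_side(payload, index):
    factory = pickler.loads(payload)  # rebuild the callable
    return factory(index)             # construct the object locally
```

Shipping a factory rather than a constructed object means heavyweight state (an environment, a replay buffer) is only ever created inside the process that uses it.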
This architecture enables scaling: environments can run on CPU-only machines while the agent uses GPU, the replay buffer can be on a high-memory machine, and the logger runs independently.
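For roles on different machines to reach each other, each server needs an address that its clients can connect to. The sketch below illustrates free-port resolution and a single request/response exchange using only the standard library. It is a stand-in for the idea, not the portal API; `find_free_port`, `serve_once`, `call`, and `AUTHKEY` are hypothetical names.

```python
import socket
import threading
from multiprocessing.connection import Client, Listener

AUTHKEY = b"sketch"  # hypothetical shared secret for this example only


def find_free_port():
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("localhost", 0))
        return sock.getsockname()[1]


def serve_once(listener, handler):
    """Accept one connection, answer one request, then close."""
    with listener.accept() as conn:
        conn.send(handler(conn.recv()))


def call(address, request):
    """Send one request to the server at `address` and return its reply."""
    with Client(address, authkey=AUTHKEY) as conn:
        conn.send(request)
        return conn.recv()


def demo():
    # Small race: the port is released before the listener re-binds it.
    address = ("localhost", find_free_port())
    # The listener is bound before the client connects, so no retry is needed.
    with Listener(address, authkey=AUTHKEY) as listener:
        server = threading.Thread(
            target=serve_once,
            args=(listener, lambda obs: {"action": obs["obs"] * 2}),
        )
        server.start()
        reply = call(address, {"obs": 21})
        server.join()
    return address, reply
```

In this toy exchange the "environment" sends an observation and the "actor" replies with an action, which mirrors the direction of traffic between environment processes and the actor.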
Usage
Use this principle when config.script == 'parallel'. It replaces the single-process training loop with a distributed version. Individual processes (parallel_env, parallel_replay) can also be launched as separate jobs for remote execution.
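One way to picture the selection on `config.script` is a dispatch table, sketched below. The script names `parallel`, `parallel_env`, and `parallel_replay` come from the text above; the `train` key, the `env_index` field, and all function bodies are assumptions made for illustration.

```python
from types import SimpleNamespace


def run_single_process(config):
    return "single-process loop"


def run_parallel(config):
    return "spawn agent, replay, logger, and env processes"


def run_parallel_env(config):
    return f"env worker {config.env_index} only"


def run_parallel_replay(config):
    return "replay server only"


SCRIPTS = {
    "train": run_single_process,  # hypothetical name for the single-process mode
    "parallel": run_parallel,
    "parallel_env": run_parallel_env,        # launched as its own job
    "parallel_replay": run_parallel_replay,  # launched as its own job
}


def main(config):
    try:
        entry = SCRIPTS[config.script]
    except KeyError:
        raise ValueError(f"unknown script: {config.script!r}") from None
    return entry(config)
```

For example, `main(SimpleNamespace(script="parallel_env", env_index=3))` would start only the environment worker, which is the shape a remote job launcher would use.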
Theoretical Basis
Pseudo-code Logic:
# Abstract algorithm
# Serialize all factory functions
factories = {name: cloudpickle.dumps(fn) for name, fn in factories.items()}
# Resolve network addresses
actor_addr = find_free_port()
replay_addr = find_free_port()
logger_addr = find_free_port()
# Spawn processes
processes = [
    Process(agent_fn, [actor_thread, learner_thread]),
    Process(replay_fn, replay_addr),
    Process(logger_fn, logger_addr),
    *[Process(env_fn, i, actor_addr) for i in range(num_envs)],
]
run_until_any_fails(processes)
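The final step of the pseudo-code can be made concrete with a small supervisor. The sketch below is one possible reading of `run_until_any_fails`, written with the standard `subprocess` module rather than portal: it assumes that "fails" means a nonzero exit code and that remaining children should be terminated on the first failure. `spawn` and `WORKER` are hypothetical helpers.

```python
import subprocess
import sys
import time


def spawn(code, *args):
    """Start a child Python interpreter running `code` with string arguments."""
    return subprocess.Popen([sys.executable, "-c", code, *map(str, args)])


def run_until_any_fails(procs, poll_interval=0.05):
    """Return 0 if all children exit cleanly, else the first nonzero exit code.

    On the first failure the remaining children are terminated.
    """
    pending = list(procs)
    try:
        while pending:
            for proc in list(pending):
                code = proc.poll()
                if code is None:
                    continue          # still running
                if code != 0:
                    return code       # one role failed: stop everything
                pending.remove(proc)  # finished cleanly
            time.sleep(poll_interval)
        return 0
    finally:
        for proc in procs:
            if proc.poll() is None:
                proc.terminate()
            proc.wait()  # reap every child before returning


# Toy worker: sleep for argv[1] seconds, then exit with status argv[2].
WORKER = "import sys, time; time.sleep(float(sys.argv[1])); sys.exit(int(sys.argv[2]))"
```

Calling `run_until_any_fails([spawn(WORKER, 30, 0), spawn(WORKER, 0.1, 3)])` returns as soon as the second child exits with status 3, without waiting for the long-running one. Failing fast in this way keeps a crashed environment or replay process from leaving the rest of the job hanging.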