Principle:Danijar Dreamerv3 Checkpoint Management

Knowledge Sources	DreamerV3
Domains	Reinforcement_Learning, Training_Infrastructure
Last Updated	2026-02-15 09:00 GMT

Overview

Checkpoint management is a persistence mechanism that saves and restores the complete training state (agent parameters, replay buffer, step counter), enabling fault-tolerant training and evaluation of pretrained models.

Description

Checkpoint Management in DreamerV3 provides two core operations: save (serialize all registered state to disk) and load_or_save (restore from an existing checkpoint if present, otherwise save the initial state). This enables:

Fault tolerance: Training can resume from the last checkpoint after crashes or preemption
Evaluation: Pretrained agents can be loaded for evaluation-only runs
Transfer learning: Selective loading of parameters from a pretrained checkpoint via regex filtering
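The save and load_or_save behavior described above can be illustrated with a small, self-contained sketch. It is not the DreamerV3 implementation: SimpleCheckpoint, Box, the register method, and the pickle file format are all hypothetical names and choices made for illustration, and the only assumption about a component is that it exposes a save()/load() pair.

```python
import pickle
import tempfile
from pathlib import Path


class Box:
    """Toy component (hypothetical) exposing the save()/load() pair."""

    def __init__(self, value):
        self.value = value

    def save(self):
        return {"value": self.value}

    def load(self, state):
        self.value = state["value"]


class SimpleCheckpoint:
    """Illustrative stand-in for a checkpoint object, not the DreamerV3 class."""

    def __init__(self, path):
        self.path = Path(path)
        self.components = {}

    def register(self, name, component):
        self.components[name] = component

    def save(self):
        state = {name: c.save() for name, c in self.components.items()}
        # Write to a temporary file first, then rename, so that a crash
        # during the write does not leave a truncated checkpoint behind.
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_bytes(pickle.dumps(state))
        tmp.replace(self.path)

    def load(self):
        state = pickle.loads(self.path.read_bytes())
        for name, component in self.components.items():
            component.load(state[name])

    def load_or_save(self):
        if self.path.exists():
            self.load()   # Resume: restore every registered component.
        else:
            self.save()   # Fresh run: write the initial state as a baseline.


def demo_resume():
    with tempfile.TemporaryDirectory() as tmpdir:
        path = Path(tmpdir) / "ckpt.pkl"

        # First run: nothing on disk yet, so load_or_save() saves.
        step = Box(0)
        first = SimpleCheckpoint(path)
        first.register("step", step)
        first.load_or_save()
        step.value = 500
        first.save()

        # Restarted run: the file exists, so load_or_save() restores.
        resumed_step = Box(0)
        second = SimpleCheckpoint(path)
        second.register("step", resumed_step)
        second.load_or_save()
        return resumed_step.value
```

In this sketch the same call serves both a fresh run and a restarted run, which is what lets a training script resume after a crash or preemption without a separate code path.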

The checkpoint system is component-based: each component (agent, replay buffer, step counter) is registered by name on the checkpoint object. The from_checkpoint option allows loading from a different checkpoint path with optional regex-based parameter filtering.
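Regex-based parameter filtering can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: parameters are assumed to live in a flat dictionary keyed by name, the names such as "enc/kernel" are invented for the example, and matching with re.match (anchored at the start of the name) is a choice made here.

```python
import re


def filter_params(params, pattern):
    """Keep only the entries whose name matches the regex (hypothetical helper)."""
    regex = re.compile(pattern)
    return {name: value for name, value in params.items() if regex.match(name)}


def load_selected(current, pretrained, pattern):
    """Overwrite the matching entries of `current` with pretrained values."""
    selected = filter_params(pretrained, pattern)
    unknown = sorted(set(selected) - set(current))
    if unknown:
        raise KeyError(f"Pretrained parameters not present in the agent: {unknown}")
    return {**current, **selected}


def demo_transfer():
    # Hypothetical parameter names; real names depend on the model definition.
    current = {"enc/kernel": 0.0, "policy/kernel": 0.0}
    pretrained = {"enc/kernel": 1.5, "policy/kernel": 2.5}
    # Transfer only the encoder and keep the freshly initialized policy.
    return load_selected(current, pretrained, r"enc/.*")
```

Here demo_transfer() returns a dictionary in which "enc/kernel" carries the pretrained value while "policy/kernel" keeps its initial value, which is the transfer-learning case of reusing part of a pretrained model.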

Usage

Apply this principle after the agent and replay buffer have been initialized and before the training loop begins. During training, checkpoints are saved periodically at an interval set by the save_every config option. For evaluation-only runs, loading a checkpoint is mandatory, because the agent's parameters remain randomly initialized until one is loaded.
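Periodic saving can be sketched with a small interval trigger. The text does not state the unit of save_every, so this example assumes an interval measured on a clock (wall-clock seconds by default); EveryInterval, run_training, and save_fn are hypothetical names, and the clock is injectable so the behavior can be checked without waiting.

```python
import itertools
import time


class EveryInterval:
    """Returns True at most once per `interval` units of the supplied clock."""

    def __init__(self, interval, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last = clock()

    def __call__(self):
        now = self.clock()
        if now - self.last >= self.interval:
            self.last = now
            return True
        return False


def run_training(num_steps, save_every, save_fn, clock=time.monotonic):
    should_save = EveryInterval(save_every, clock)
    for step in range(num_steps):
        # ... one training update would happen here ...
        if should_save():
            save_fn(step)  # In a real run: checkpoint.save()


def demo_periodic_save():
    # Fake clock that advances by one unit each time it is read.
    ticks = itertools.count()
    saved_at = []
    run_training(
        num_steps=10,
        save_every=3,
        save_fn=saved_at.append,
        clock=lambda: next(ticks),
    )
    return saved_at
```

With the fake clock, demo_periodic_save() records a save at steps 2, 5, and 8. A time-based trigger keeps the amount of work lost to a crash roughly constant regardless of how fast steps run; a step-based trigger would be an equally valid design if the interval is meant to count steps.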

Theoretical Basis

Pseudo-code Logic:

# Abstract algorithm
checkpoint = Checkpoint(path)
checkpoint.register("agent", agent)
checkpoint.register("replay", replay)
checkpoint.register("step", step_counter)

# Optionally initialize the agent from a different, pretrained checkpoint.
# This runs before the resume step so that an existing checkpoint at `path`
# takes precedence and a restarted run does not lose its own progress.
if from_checkpoint:
    checkpoint.load(from_checkpoint, keys=["agent"], regex=filter_pattern)

if checkpoint_exists(path):
    checkpoint.load()      # Restore all registered components
else:
    checkpoint.save()      # Save initial state as baseline

Related Pages

Implemented By

Implementation:Danijar_Dreamerv3_Checkpoint_Operations
