Principle:Danijar Dreamerv3 Logging And Reporting

From Leeroopedia
Knowledge Sources
Domains: Reinforcement_Learning, Monitoring
Last Updated: 2026-02-15 09:00 GMT

Overview

A multi-backend logging and diagnostic reporting system that tracks training metrics and episode statistics, and generates open-loop video predictions to monitor world model quality.

Description

Logging and Reporting in DreamerV3 serves two distinct functions:

Logging: Aggregates and writes scalar metrics (losses, rewards, FPS, usage) to multiple output backends including terminal, JSONL files, TensorBoard, Weights and Biases, and custom outputs. Metrics are collected via elements.Agg (aggregator) and written periodically via elements.Logger.

Reporting: The Agent.report() method generates diagnostic evaluations by running the world model in open-loop mode: it observes the first half of a sequence, then predicts the second half purely from imagination. The resulting video predictions (ground truth, prediction, and error) are logged as image grids, giving a visual assessment of world model quality.

This dual system enables both quantitative monitoring (loss curves, reward trends) and qualitative assessment (does the world model predict plausible futures?).
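The logging half of this design can be illustrated with a minimal sketch of the aggregate-then-fan-out pattern. This is not the elements.Agg or elements.Logger API: the names MeanAgg, JSONLOutput, TerminalOutput, and MultiLogger are hypothetical, and averaging is assumed as the only aggregation rule.

```python
import io
import json
from collections import defaultdict


class MeanAgg:
    """Hypothetical aggregator: accumulates scalars and returns their means."""

    def __init__(self):
        self._sums = defaultdict(float)
        self._counts = defaultdict(int)

    def add(self, metrics):
        for key, value in metrics.items():
            self._sums[key] += float(value)
            self._counts[key] += 1

    def result(self, reset=True):
        means = {k: self._sums[k] / self._counts[k] for k in self._sums}
        if reset:
            self._sums.clear()
            self._counts.clear()
        return means


class JSONLOutput:
    """Hypothetical backend: writes one JSON object per line to a text stream."""

    def __init__(self, stream):
        self._stream = stream

    def __call__(self, step, metrics):
        self._stream.write(json.dumps({"step": step, **metrics}) + "\n")


class TerminalOutput:
    """Hypothetical backend: prints a compact one-line summary."""

    def __call__(self, step, metrics):
        body = " | ".join(f"{k} {v:.3g}" for k, v in sorted(metrics.items()))
        print(f"[{step}] {body}")


class MultiLogger:
    """Hypothetical logger: fans each write out to every backend."""

    def __init__(self, outputs):
        self._outputs = list(outputs)

    def write(self, step, metrics):
        for output in self._outputs:
            output(step, metrics)


# Usage: aggregate three training steps, then write once to both backends.
stream = io.StringIO()  # stands in for a JSONL file on disk
logger = MultiLogger([TerminalOutput(), JSONLOutput(stream)])
agg = MeanAgg()
for loss in [4.0, 2.0, 3.0]:
    agg.add({"loss": loss, "reward": 1.0})
logger.write(step=3, metrics=agg.result())
```

Keeping aggregation separate from output means the training loop can record metrics every step while each backend only sees one averaged record per write.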

Usage

Use logging throughout the training loop at regular intervals (controlled by log_every). Reporting is triggered less frequently (controlled by report_every) since it requires additional forward passes through the world model.
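The two cadences can be sketched as a pair of independent triggers inside one loop. The Every helper and the run function below are hypothetical, the sketch counts training steps, and the interval values are illustrative only; the units and defaults of log_every and report_every in the real configuration are not stated here.

```python
class Every:
    """Hypothetical trigger: fires once `step` reaches the next scheduled point."""

    def __init__(self, every):
        self._every = every
        self._next = every

    def __call__(self, step):
        if step >= self._next:
            self._next = step + self._every
            return True
        return False


def run(total_steps, log_every, report_every):
    """Record which steps would trigger a log write or a report (assumed loop shape)."""
    should_log = Every(log_every)
    should_report = Every(report_every)
    events = []
    for step in range(1, total_steps + 1):
        # A real loop would run one training step here.
        if should_log(step):
            events.append(("log", step))  # cheap: write aggregated scalars
        if should_report(step):
            events.append(("report", step))  # costly: extra world model passes
    return events


events = run(total_steps=1000, log_every=100, report_every=500)
```

With these illustrative intervals, logging fires five times as often as reporting, which matches the intent of keeping the more expensive diagnostic pass rare.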

Theoretical Basis

Open-Loop Prediction:

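# Abstract algorithm for reporting (pseudocode; world_model, decoder,
# stack, and concatenate are placeholders, not a specific API)
# obs and action hold a sequence of length T; state is the initial latent state
predicted_obs = {}

# Phase 1: Observe first half (state is updated from real observations)
for t in range(T // 2):
    state = world_model.observe(state, obs[t], action[t])

# Phase 2: Imagine second half (actions only, no observations)
for t in range(T // 2, T):
    state = world_model.imagine(state, action[t])
    predicted_obs[t] = decoder(state)

# Compare predicted vs actual observations over the imagined half
actual = obs[T // 2:]
predicted = stack([predicted_obs[t] for t in range(T // 2, T)])
error = abs(predicted - actual)
video = concatenate([actual, predicted, error])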

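Open-loop prediction quality reflects world model accuracy: no observations correct the latent state during the imagined half, so model errors compound from step to step, and only a well-trained model produces visually coherent multi-step predictions.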
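A runnable toy version of the observe-then-imagine procedure makes the effect of model error concrete. Everything here is an assumption made for illustration: ToyWorldModel is a hypothetical scalar model with an identity decoder, observe simply copies the observation into the state, and the environment follows obs[t] = true_decay * obs[t-1] + action[t]. It is not the DreamerV3 world model.

```python
class ToyWorldModel:
    """Hypothetical scalar model: next state = decay * state + action."""

    def __init__(self, decay):
        self.decay = decay

    def observe(self, state, obs, action):
        # Closed loop: the observation fully determines the new state.
        return obs

    def imagine(self, state, action):
        # Open loop: the state evolves from the model's own dynamics only.
        return self.decay * state + action


def make_sequence(true_decay, actions, start=1.0):
    """Generate ground-truth observations from the assumed toy environment."""
    obs, value = [], start
    for a in actions:
        value = true_decay * value + a
        obs.append(value)
    return obs


def open_loop_errors(model, obs, actions):
    """Observe the first half, imagine the second, return per-step absolute errors."""
    T = len(obs)
    state = 0.0
    for t in range(T // 2):
        state = model.observe(state, obs[t], actions[t])
    errors = []
    for t in range(T // 2, T):
        state = model.imagine(state, actions[t])
        predicted = state  # identity decoder
        errors.append(abs(predicted - obs[t]))
    return errors


actions = [0.0] * 8
obs = make_sequence(true_decay=0.9, actions=actions)
exact = open_loop_errors(ToyWorldModel(decay=0.9), obs, actions)
biased = open_loop_errors(ToyWorldModel(decay=0.8), obs, actions)
```

In this toy setting the model with the correct decay reproduces the second half with zero error, while the model with the wrong decay drifts further from the ground truth at every imagined step, which is the kind of divergence an error panel in a report video would make visible.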

Related Pages

Implemented By
