
Principle:Farama Foundation Gymnasium Vector Episode Statistics

From Leeroopedia
Knowledge Sources
Domains: Reinforcement_Learning, Monitoring
Last Updated: 2026-02-15 03:00 GMT

Overview

An extension of episode statistics tracking to vectorized environments that monitors per-environment cumulative rewards and episode lengths across parallel sub-environments.

Description

Vector Episode Statistics extends the single-environment episode tracking pattern to batched settings. Each sub-environment's cumulative reward and episode length are tracked independently. When any sub-environment completes an episode, its statistics are appended to buffers shared by all sub-environments.

The key difference from single-environment tracking is the handling of partial completion: in a batch of N environments, only some may finish in any given step. The wrapper maintains per-environment accumulators and flushes statistics as each individual environment reaches episode boundaries.
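The partial-completion behaviour described above can be sketched in plain Python. This is a minimal illustration, not the wrapper's actual implementation: the class name `VectorEpisodeTracker`, its `step` method, and the `buffer_length` parameter are hypothetical names chosen for this example, and the per-step `dones` flag is assumed to mean "terminated or truncated".

```python
from collections import deque


class VectorEpisodeTracker:
    """Hypothetical helper: per-environment accumulators with independent flush."""

    def __init__(self, num_envs, buffer_length=100):
        # One running return and one running length per sub-environment.
        self.returns = [0.0] * num_envs
        self.lengths = [0] * num_envs
        # Shared bounded buffers holding statistics of completed episodes.
        self.return_queue = deque(maxlen=buffer_length)
        self.length_queue = deque(maxlen=buffer_length)

    def step(self, rewards, dones):
        """Accumulate one batched step and flush only the finished environments."""
        finished = []
        for i, (reward, done) in enumerate(zip(rewards, dones)):
            self.returns[i] += reward
            self.lengths[i] += 1
            if done:
                # Record this environment's episode, then reset only its accumulators.
                self.return_queue.append(self.returns[i])
                self.length_queue.append(self.lengths[i])
                finished.append(i)
                self.returns[i] = 0.0
                self.lengths[i] = 0
        return finished


# Three sub-environments; only the second one finishes, on the second step.
tracker = VectorEpisodeTracker(num_envs=3)
tracker.step([1.0, 1.0, 1.0], [False, False, False])
finished = tracker.step([1.0, 2.0, 1.0], [False, True, False])
```

After the second call, only the accumulators of the finished sub-environment are reset; the other two keep counting toward their own, still-running episodes.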

Usage

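Use this to monitor training progress in vectorized A2C, PPO, or other batched RL algorithms. The return_queue and length_queue buffers collect the returns and lengths of completed episodes from all sub-environments, so aggregating them (for example, taking their mean) gives rolling statistics for the whole batch.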
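As a rough sketch of how such rolling statistics might be computed, the example below treats both queues as bounded `collections.deque` objects. The bound of 3 and the helper name `summarize` are assumptions made for illustration only; the page does not state the buffer size the wrapper uses.

```python
from collections import deque
from statistics import mean


def summarize(return_queue, length_queue):
    """Hypothetical helper: rolling means, or None before any episode completes."""
    if not return_queue:
        return None
    return {
        "episodes": len(return_queue),
        "mean_return": mean(return_queue),
        "mean_length": mean(length_queue),
    }


# Illustrative bound of 3: the oldest episode is dropped once a fourth arrives.
return_queue = deque(maxlen=3)
length_queue = deque(maxlen=3)
for episode_return, episode_length in [(10.0, 10), (20.0, 20), (30.0, 30), (40.0, 40)]:
    return_queue.append(episode_return)
    length_queue.append(episode_length)

stats = summarize(return_queue, length_queue)
```

Because the queues are bounded, the summary reflects only the most recent episodes, which is what makes the statistic "rolling" rather than a lifetime average.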

Theoretical Basis

Per-environment accumulation with independent flush:

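# Abstract algorithm, run once per vectorized step.
# done[i] is true when sub-environment i terminated or was truncated on this step.
for i in range(N):
    returns[i] += rewards[i]
    lengths[i] += 1
    if done[i]:
        # Flush the finished episode, then reset only this environment's accumulators.
        return_queue.append(returns[i])
        length_queue.append(lengths[i])
        returns[i] = 0
        lengths[i] = 0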

Related Pages

Implemented By
