Principle: Farama Foundation Gymnasium Vector Episode Statistics
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Monitoring |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
An extension of episode statistics tracking to vectorized environments that monitors per-environment cumulative rewards and episode lengths across parallel sub-environments.
Description
Vector Episode Statistics extends the single-environment episode tracking pattern to batched settings. Each sub-environment's cumulative reward and episode length are tracked independently. When any sub-environment completes an episode, its statistics are recorded in the shared buffers.
The key difference from single-environment tracking is the handling of partial completion: in a batch of N environments, only some may finish in any given step. The wrapper maintains per-environment accumulators and flushes statistics as each individual environment reaches episode boundaries.
Usage
Use this wrapper to monitor training progress in vectorized A2C, PPO, or other batched RL algorithms. The return_queue and length_queue provide rolling statistics across all sub-environments.
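A minimal, self-contained sketch of this monitoring pattern. The class name, method signature, and synthetic reward/done streams below are illustrative, not the Gymnasium wrapper's actual API; only the return_queue/length_queue interface mirrors the description above.

```python
from collections import deque

class VectorEpisodeStats:
    """Illustrative tracker: per-env accumulators, flushed on episode end."""

    def __init__(self, num_envs, buffer_length=100):
        self.returns = [0.0] * num_envs
        self.lengths = [0] * num_envs
        # Rolling buffers of completed-episode statistics, shared across envs.
        self.return_queue = deque(maxlen=buffer_length)
        self.length_queue = deque(maxlen=buffer_length)

    def step(self, rewards, dones):
        for i, (r, d) in enumerate(zip(rewards, dones)):
            self.returns[i] += r
            self.lengths[i] += 1
            if d:  # this sub-environment reached an episode boundary
                self.return_queue.append(self.returns[i])
                self.length_queue.append(self.lengths[i])
                self.returns[i] = 0.0
                self.lengths[i] = 0

stats = VectorEpisodeStats(num_envs=2)
stats.step([1.0, 0.5], [False, False])
stats.step([1.0, 0.5], [True, False])   # env 0 finishes: return 2.0, length 2
stats.step([3.0, 0.5], [False, True])   # env 1 finishes: return 1.5, length 3
print(list(stats.return_queue))  # [2.0, 1.5]
print(list(stats.length_queue))  # [2, 3]
```

Note that env 0 keeps accumulating a fresh episode (return 3.0 so far) after its flush, while env 1's completed statistics land in the same shared queues.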
Theoretical Basis
Per-environment accumulation with independent flush:
# Abstract algorithm
for each env i in 1..N:
    returns[i] += rewards[i]
    lengths[i] += 1
    if done[i]:
        return_queue.append(returns[i])
        length_queue.append(lengths[i])
        returns[i] = 0
        lengths[i] = 0
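The abstract algorithm translates directly to runnable NumPy (a sketch; the variable and function names are mine). The per-element loop over done flags can be replaced by boolean-mask resets, which is the idiomatic vectorized form:

```python
from collections import deque

import numpy as np

num_envs = 3
returns = np.zeros(num_envs)
lengths = np.zeros(num_envs, dtype=int)
return_queue = deque(maxlen=100)
length_queue = deque(maxlen=100)

def step_update(rewards, dones):
    """One batched step: accumulate all envs, then flush the finished ones."""
    global returns, lengths
    returns += rewards
    lengths += 1
    for i in np.flatnonzero(dones):          # only envs at an episode boundary
        return_queue.append(float(returns[i]))
        length_queue.append(int(lengths[i]))
    returns[dones] = 0.0                      # mask-reset completed accumulators
    lengths[dones] = 0

step_update(np.array([1.0, 2.0, 3.0]), np.array([False, True, False]))
step_update(np.array([1.0, 2.0, 3.0]), np.array([True, False, False]))
print(list(return_queue))  # [2.0, 2.0]
print(list(length_queue))  # [1, 2]
```

Only the queue appends remain per-environment; the accumulation and resets operate on whole arrays, so the per-step cost stays O(N) array work regardless of how many episodes finish.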