Implementation:Farama Foundation Gymnasium RecordEpisodeStatistics
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Monitoring |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
A concrete tool, provided by the Gymnasium library, for tracking each episode's cumulative reward, length, and elapsed time.
Description
The RecordEpisodeStatistics wrapper intercepts step and reset calls to accumulate episode-level metrics. When an episode ends (terminated or truncated is True), it adds an entry to the info dictionary under the key given by stats_key ("episode" by default) containing the cumulative reward (r), the episode length in steps (l), and the elapsed wall-clock time (t). It also maintains return_queue, length_queue, and time_queue deque attributes, each holding the statistics of the most recent buffer_length episodes.
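The bookkeeping described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the Gymnasium source: ToyEnv, EpisodeStatsSketch, and run_episode are hypothetical names, and the toy environment simply ends every episode after three steps with a reward of 1.0 per step.

```python
import time
from collections import deque


class ToyEnv:
    """Hypothetical stand-in environment: every episode lasts exactly 3 steps."""

    def __init__(self):
        self.steps_taken = 0

    def reset(self):
        self.steps_taken = 0
        return 0, {}

    def step(self, action):
        self.steps_taken += 1
        terminated = self.steps_taken >= 3
        # (observation, reward, terminated, truncated, info)
        return self.steps_taken, 1.0, terminated, False, {}


class EpisodeStatsSketch:
    """Illustrative re-creation of the accumulate-then-report pattern."""

    def __init__(self, env, buffer_length=100, stats_key="episode"):
        self.env = env
        self.stats_key = stats_key
        self.return_queue = deque(maxlen=buffer_length)
        self.length_queue = deque(maxlen=buffer_length)
        self.time_queue = deque(maxlen=buffer_length)
        self._start_episode()

    def _start_episode(self):
        # Reset the per-episode accumulators and the timer.
        self.episode_return = 0.0
        self.episode_length = 0
        self.episode_start = time.perf_counter()

    def reset(self):
        obs, info = self.env.reset()
        self._start_episode()
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self.episode_return += reward
        self.episode_length += 1
        if terminated or truncated:
            elapsed = time.perf_counter() - self.episode_start
            # Report the finished episode through the info dictionary.
            info[self.stats_key] = {
                "r": self.episode_return,
                "l": self.episode_length,
                "t": elapsed,
            }
            self.return_queue.append(self.episode_return)
            self.length_queue.append(self.episode_length)
            self.time_queue.append(elapsed)
        return obs, reward, terminated, truncated, info


def run_episode(env):
    """Run one episode with a fixed action and return the final info dict."""
    env.reset()
    done = False
    info = {}
    while not done:
        _, _, terminated, truncated, info = env.step(0)
        done = terminated or truncated
    return info
```

Calling run_episode(EpisodeStatsSketch(ToyEnv())) returns an info dictionary whose "episode" entry reports a return of 3.0 and a length of 3, mirroring the shape of the statistics the real wrapper emits.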
Usage
Wrap any Gymnasium environment with this wrapper when you need to track training progress. Access env.return_queue and env.length_queue for plotting learning curves or computing rolling averages.
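As a minimal sketch of the rolling-average use case, the helper below averages the most recent entries of any sequence, such as env.return_queue. The function name rolling_mean and its window parameter are illustrative assumptions and are not part of Gymnasium.

```python
from collections import deque


def rolling_mean(values, window):
    """Return the mean of the last `window` entries, or None if there are none.

    `values` can be any iterable of numbers, e.g. a deque of episode returns.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    recent = list(values)[-window:]
    if not recent:
        return None
    return sum(recent) / len(recent)


# Example with a hand-built deque standing in for env.return_queue
example_returns = deque([1.0, 2.0, 3.0, 4.0], maxlen=100)
last_two_mean = rolling_mean(example_returns, window=2)
```

Returning None for an empty queue avoids a division by zero when the helper is called before the first episode has finished.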
Code Reference
Source Location
- Repository: Gymnasium
- File: gymnasium/wrappers/common.py
- Lines: L436-548
Signature
class RecordEpisodeStatistics(gym.Wrapper[ObsType, ActType, ObsType, ActType]):
    def __init__(
        self,
        env: gym.Env[ObsType, ActType],
        buffer_length: int = 100,
        stats_key: str = "episode",
    ):
        """Tracks cumulative rewards and episode lengths.

        Args:
            env: The environment to apply the wrapper
            buffer_length: Size of the return_queue, length_queue, and time_queue buffers
            stats_key: The info key for the episode statistics
        """
Import
from gymnasium.wrappers import RecordEpisodeStatistics
# or
import gymnasium as gym
env = gym.wrappers.RecordEpisodeStatistics(env)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| env | gym.Env | Yes | The environment to wrap |
| buffer_length | int | No | Size of deque buffers (default 100) |
| stats_key | str | No | Info dict key for statistics (default "episode") |
Outputs
| Name | Type | Description |
|---|---|---|
| return_queue | deque[float] | Cumulative rewards of the last buffer_length episodes |
| length_queue | deque[int] | Lengths (in steps) of the last buffer_length episodes |
| time_queue | deque[float] | Wall-clock durations of the last buffer_length episodes |
| info[stats_key] | dict | Contains "r" (cumulative reward), "l" (length), "t" (elapsed time) at episode end; the key defaults to "episode" |
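To illustrate how buffer_length bounds the queues, the sketch below uses a standard collections.deque with maxlen, which is assumed here to match the behavior of the wrapper's deque attributes. The small buffer size and the return values are made up for the example.

```python
from collections import deque

buffer_length = 3  # deliberately small, for illustration only
return_queue = deque(maxlen=buffer_length)

# Appending a fourth value silently discards the oldest one.
for episode_return in [10.0, 20.0, 30.0, 40.0]:
    return_queue.append(episode_return)

print(list(return_queue))  # [20.0, 30.0, 40.0]
```

Statistics computed from the queues therefore describe only the most recent buffer_length episodes, not the whole training run.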
Usage Examples
Basic Statistics Tracking
import gymnasium as gym
from gymnasium.wrappers import RecordEpisodeStatistics

env = gym.make("CartPole-v1")
env = RecordEpisodeStatistics(env, buffer_length=100)

obs, info = env.reset(seed=42)
for episode in range(100):
    terminated, truncated = False, False
    while not (terminated or truncated):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # info["episode"] contains {"r": ..., "l": ..., "t": ...}
            print(f"Episode {episode}: reward={info['episode']['r']}")
    # Start the next episode
    obs, info = env.reset()

# Access rolling statistics
print(f"Mean return: {sum(env.return_queue) / len(env.return_queue)}")
env.close()