
Implementation:Farama Foundation Gymnasium RecordEpisodeStatistics

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, Monitoring
Last Updated 2026-02-15 03:00 GMT

Overview

A concrete tool, provided by the Gymnasium library, for tracking cumulative episode rewards, episode lengths, and wall-clock timing.

Description

The RecordEpisodeStatistics wrapper intercepts step and reset calls to accumulate episode-level metrics. At the end of each episode (when terminated or truncated is True), it injects an "episode" key into the info dictionary containing the cumulative reward (r), episode length (l), and wall-clock time (t). It also maintains return_queue, length_queue, and time_queue deque attributes for accessing recent episode statistics.
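The accumulate-and-inject behaviour described above can be sketched in plain Python. The class below is an illustrative reimplementation, not the Gymnasium source; the toy ThreeStepEnv exists only to exercise it.

```python
import time
from collections import deque


class EpisodeStatsSketch:
    """Illustrative sketch of the wrapper's bookkeeping (not the real class)."""

    def __init__(self, env, buffer_length=100, stats_key="episode"):
        self.env = env
        self.stats_key = stats_key
        self.return_queue = deque(maxlen=buffer_length)
        self.length_queue = deque(maxlen=buffer_length)
        self.time_queue = deque(maxlen=buffer_length)
        self._return, self._length, self._start = 0.0, 0, 0.0

    def reset(self, **kwargs):
        # A new episode begins: zero the accumulators and note the start time.
        self._return, self._length, self._start = 0.0, 0, time.perf_counter()
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        self._return += float(reward)
        self._length += 1
        if terminated or truncated:
            elapsed = time.perf_counter() - self._start
            # Inject the episode statistics into the final info dict ...
            info[self.stats_key] = {"r": self._return, "l": self._length, "t": elapsed}
            # ... and append them to the rolling buffers.
            self.return_queue.append(self._return)
            self.length_queue.append(self._length)
            self.time_queue.append(elapsed)
        return obs, reward, terminated, truncated, info


class ThreeStepEnv:
    """Toy environment: terminates after three steps, reward 1.0 per step."""

    def __init__(self):
        self.t = 0

    def reset(self, **kwargs):
        self.t = 0
        return 0, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 3, False, {}


env = EpisodeStatsSketch(ThreeStepEnv())
env.reset()
done = False
while not done:
    obs, reward, terminated, truncated, info = env.step(None)
    done = terminated or truncated

print(info["episode"])  # {"r": 3.0, "l": 3, "t": <elapsed seconds>}
```

The real wrapper follows the same shape: accumulators live on the wrapper, not the environment, so any Gymnasium environment can be wrapped without modification.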

Usage

Apply this wrapper to any Gymnasium environment when you need to track training progress. Access env.return_queue and env.length_queue to plot learning curves or compute rolling averages.
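Because the queues are ordinary collections.deque objects, rolling statistics need nothing beyond the standard library. A minimal sketch, using made-up returns in place of a populated env.return_queue:

```python
from collections import deque
from statistics import mean, stdev

# Made-up episode returns standing in for a populated env.return_queue.
return_queue = deque([9.0, 12.0, 15.0, 22.0, 31.0], maxlen=100)

rolling_mean = mean(return_queue)  # 17.8
rolling_std = stdev(return_queue)  # ~8.8
print(f"mean={rolling_mean:.1f} std={rolling_std:.1f}")
```

The maxlen bound mirrors the wrapper's buffer_length argument: once the deque is full, the oldest episode's statistics are silently dropped.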

Code Reference

Source Location

  • Repository: Gymnasium
  • File: gymnasium/wrappers/common.py
  • Lines: L436-548

Signature

class RecordEpisodeStatistics(gym.Wrapper[ObsType, ActType, ObsType, ActType]):
    def __init__(
        self,
        env: gym.Env[ObsType, ActType],
        buffer_length: int = 100,
        stats_key: str = "episode",
    ):
        """Tracks cumulative rewards and episode lengths.

        Args:
            env: The environment to apply the wrapper
            buffer_length: Size of the return_queue, length_queue, and time_queue buffers
            stats_key: The info key for the episode statistics
        """

Import

from gymnasium.wrappers import RecordEpisodeStatistics
# or
import gymnasium as gym
env = gym.wrappers.RecordEpisodeStatistics(env)

I/O Contract

Inputs

Name Type Required Description
env gym.Env Yes The environment to wrap
buffer_length int No Size of deque buffers (default 100)
stats_key str No Info dict key for statistics (default "episode")

Outputs

Name Type Description
return_queue deque[float] Last N episode cumulative rewards
length_queue deque[int] Last N episode lengths
time_queue deque[float] Last N episode wall-clock durations
info["episode"] dict Contains "r" (reward), "l" (length), "t" (time) at episode end

Usage Examples

Basic Statistics Tracking

import gymnasium as gym
from gymnasium.wrappers import RecordEpisodeStatistics

env = gym.make("CartPole-v1")
env = RecordEpisodeStatistics(env, buffer_length=100)

obs, info = env.reset(seed=42)

for episode in range(100):
    terminated, truncated = False, False
    while not (terminated or truncated):
        action = env.action_space.sample()
        obs, reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            # info["episode"] contains {"r": ..., "l": ..., "t": ...}
            print(f"Episode {episode}: reward={info['episode']['r']}")
            obs, info = env.reset()

# Access rolling statistics
print(f"Mean return: {sum(env.return_queue) / len(env.return_queue)}")
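For plotting, raw per-episode returns are usually noisy; a windowed moving average over the buffer gives a smoother learning curve. A sketch with NumPy, where the returns below are illustrative rather than from a real run:

```python
import numpy as np

# Illustrative episode returns, as they might sit in env.return_queue.
returns = np.array([10.0, 12.0, 9.0, 15.0, 20.0, 18.0, 25.0, 30.0, 28.0, 35.0])

# 4-episode moving average: convolve with a uniform window.
window = 4
smoothed = np.convolve(returns, np.ones(window) / window, mode="valid")
print(smoothed)  # first value: (10 + 12 + 9 + 15) / 4 = 11.5
```

With mode="valid" the output has len(returns) - window + 1 points, so the curve starts once a full window of episodes is available.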

Related Pages

Implements Principle

Requires Environment
