Implementation:Farama Foundation Gymnasium Wrapper Utils

From Leeroopedia
Knowledge Sources
Domains: Reinforcement_Learning, Wrappers
Last Updated: 2026-02-15 03:00 GMT

Overview

Utility functions and classes used by Gymnasium wrappers, including RunningMeanStd for tracking statistics, create_zero_array for generating zero-valued observations, and rescale_box for affine rescaling of Box spaces.

Description

This module provides shared utility functions used across multiple wrapper implementations:

  • RunningMeanStd -- A class that tracks the running mean, variance, and count of values using the parallel (batch) variance algorithm of Chan et al., a generalization of Welford's online update. Used by the NormalizeObservation and NormalizeReward wrappers. Initialized with a configurable shape, epsilon (for numerical stability), and dtype.
  • update_mean_var_count_from_moments -- A standalone function implementing the parallel mean/variance update algorithm. Computes the new mean, variance, and count from existing statistics and a new batch.
  • create_zero_array -- A singledispatch function that creates a "zero-like" sample from any Gymnasium space. Unlike create_empty_array, this ensures the result is a valid sample from the space by respecting Box bounds. Supports Box, Discrete, MultiDiscrete, MultiBinary, Tuple, Dict, Sequence, Text, Graph, and OneOf spaces.
  • rescale_box -- Computes the affine transformation needed to rescale a Box space to new bounds [new_min, new_max]. Returns the new Box, a forward transform function (original to rescaled), and a backward transform function (rescaled to original). Handles finite and infinite bounds correctly.
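The parallel mean/variance update described in the list above can be sketched in plain Python. This is a minimal illustration on scalars, not the library's code: the name combine_moments is hypothetical, and the sketch assumes population variance (dividing by the count) for both the existing statistics and the batch.

```python
import statistics


def combine_moments(mean, var, count, batch_mean, batch_var, batch_count):
    """Merge (mean, population variance, count) of two sample sets.

    Hypothetical helper for illustration; it is not the Gymnasium function.
    """
    delta = batch_mean - mean
    total = count + batch_count
    new_mean = mean + delta * batch_count / total
    # Sum of squared deviations of both sets, plus a correction term
    # that accounts for the gap between the two means.
    m2 = var * count + batch_var * batch_count + delta**2 * count * batch_count / total
    return new_mean, m2 / total, total


first = [1.0, 2.0, 3.0, 4.0]
second = [10.0, 20.0]

mean, var, count = combine_moments(
    statistics.fmean(first), statistics.pvariance(first), len(first),
    statistics.fmean(second), statistics.pvariance(second), len(second),
)

# The merged statistics match those computed over all samples at once.
combined = first + second
print(mean, statistics.fmean(combined))
print(var, statistics.pvariance(combined))
```

Merging summary statistics this way means no earlier sample has to be stored, which is what makes the approach suitable for statistics that are updated on every environment step.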

Usage

These utilities are internal building blocks for the wrapper system. RunningMeanStd is used by normalization wrappers, create_zero_array is used by wrappers needing default/zero observations, and rescale_box is used by RescaleAction and RescaleObservation wrappers.
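As a rough illustration of how a normalization wrapper might consume running statistics, the sketch below standardizes a scalar with a running mean and variance. Everything in it is an assumption made for the example: ScalarRunningStats and normalize are hypothetical names, the initial state (mean 0, variance 1, count equal to epsilon) is assumed, and the formula (x - mean) / sqrt(var + epsilon) is the conventional standardization rather than a quote from the wrapper source.

```python
import math
import statistics


class ScalarRunningStats:
    """Hypothetical scalar stand-in for RunningMeanStd (not the library class)."""

    def __init__(self, epsilon=1e-4):
        # Assumed initial state: a tiny pseudo-count avoids division by zero.
        self.mean, self.var, self.count = 0.0, 1.0, epsilon

    def update(self, batch):
        b_mean = statistics.fmean(batch)
        b_var = statistics.pvariance(batch)
        b_count = len(batch)
        delta = b_mean - self.mean
        total = self.count + b_count
        # Combined sum of squared deviations, computed from the old statistics.
        m2 = self.var * self.count + b_var * b_count + delta**2 * self.count * b_count / total
        self.mean += delta * b_count / total
        self.var, self.count = m2 / total, total


def normalize(value, stats, epsilon=1e-8):
    """Standardize a value with the running statistics (assumed formula)."""
    return (value - stats.mean) / math.sqrt(stats.var + epsilon)


stats = ScalarRunningStats()
for batch in ([4.0, 6.0], [5.0, 5.0], [3.0, 7.0]):
    stats.update(batch)

print(stats.mean, stats.var)   # close to 5.0 and 1.667
print(normalize(6.0, stats))   # a bit under one standard deviation above the mean
```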

Code Reference

Source Location

Signature

class RunningMeanStd:
    def __init__(self, epsilon=1e-4, shape=(), dtype=np.float64): ...
    def update(self, x): ...
    def update_from_moments(self, batch_mean, batch_var, batch_count): ...

def update_mean_var_count_from_moments(mean, var, count, batch_mean, batch_var, batch_count): ...

def create_zero_array(space: Space[T_cov]) -> T_cov: ...

def rescale_box(
    box: Box,
    new_min: np.floating | np.integer | np.ndarray,
    new_max: np.floating | np.integer | np.ndarray,
) -> tuple[Box, Callable[[np.ndarray], np.ndarray], Callable[[np.ndarray], np.ndarray]]: ...

Import

from gymnasium.wrappers.utils import RunningMeanStd, create_zero_array, update_mean_var_count_from_moments, rescale_box

I/O Contract

RunningMeanStd

| Name | Type | Required | Description |
|---|---|---|---|
| epsilon | float | No | Small constant for numerical stability (default 1e-4) |
| shape | tuple | No | Shape of the tracked values (default ()) |
| dtype | numpy dtype | No | Data type for mean/var arrays (default np.float64) |

create_zero_array

| Name | Type | Required | Description |
|---|---|---|---|
| space | Space | Yes | The Gymnasium space to create a zero-valued sample for |
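The idea of a "zero-like" value that still respects Box bounds can be illustrated without Gymnasium. The sketch below picks, per dimension, the value closest to zero inside [low, high]. The function name zero_within_bounds is hypothetical, and the handling of an upper bound below zero is an assumption made by symmetry with the lower-bound case.

```python
def zero_within_bounds(low, high):
    """Return, per dimension, the value closest to zero inside [low, high].

    Hypothetical helper for illustration; it is not the Gymnasium function.
    """
    result = []
    for lo, hi in zip(low, high):
        if lo > 0:
            result.append(lo)    # zero lies below the interval: use the lower bound
        elif hi < 0:
            result.append(hi)    # zero lies above the interval: use the upper bound
        else:
            result.append(0.0)   # zero is already a valid value
    return result


zeros = zero_within_bounds(low=[-1.0, 2.0, -5.0], high=[1.0, 5.0, -3.0])
print(zeros)  # -> [0.0, 2.0, -3.0]
```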

rescale_box

| Name | Type | Required | Description |
|---|---|---|---|
| box | Box | Yes | The Box space to rescale |
| new_min | float, int, or ndarray | Yes | The new minimum bound |
| new_max | float, int, or ndarray | Yes | The new maximum bound |
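For a single dimension with finite bounds, the affine rescaling behind rescale_box reduces to a gradient and an intercept. The sketch below shows that arithmetic with a hypothetical helper, make_affine_rescale. It covers only scalar, finite bounds and does not attempt to reproduce how the library treats infinite bounds.

```python
def make_affine_rescale(low, high, new_min, new_max):
    """Build forward/backward maps between [low, high] and [new_min, new_max].

    Hypothetical scalar helper; assumes all four bounds are finite.
    """
    if not (low < high and new_min < new_max):
        raise ValueError("each interval must have a lower bound below its upper bound")
    gradient = (new_max - new_min) / (high - low)
    intercept = new_min - gradient * low

    def forward(x):
        # original interval -> rescaled interval
        return gradient * x + intercept

    def backward(y):
        # rescaled interval -> original interval
        return (y - intercept) / gradient

    return forward, backward


forward, backward = make_affine_rescale(-1.0, 1.0, 0.0, 1.0)
print(forward(-1.0), forward(1.0))  # -> 0.0 1.0
print(backward(0.5))                # -> 0.0
```

Returning both directions is convenient because an observation wrapper needs the forward map (environment values into the new range) while an action wrapper needs the backward map (agent actions back into the environment's range).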

Usage Examples

import numpy as np
from gymnasium.spaces import Box, Dict
from gymnasium.wrappers.utils import RunningMeanStd, create_zero_array, rescale_box

# RunningMeanStd: track running statistics
rms = RunningMeanStd(shape=(4,))
for _ in range(100):
    rms.update(np.random.randn(32, 4))
print(rms.mean, rms.var)

# create_zero_array: generate zero-valued samples
space = Box(low=-1.0, high=1.0, shape=(3,))
zero = create_zero_array(space)
# array([0., 0., 0.])

space = Dict({"a": Box(low=2.0, high=5.0, shape=(2,))})
zero = create_zero_array(space)
# {'a': array([2., 2.])}  -- respects lower bound

# rescale_box: compute affine transformation
box = Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
new_box, forward, backward = rescale_box(box, new_min=0.0, new_max=1.0)
# new_box == Box(0.0, 1.0, (2,), float32)
# forward(np.array([-1.0, 1.0])) == array([0.0, 1.0])
