Overview
Utility functions and classes used by Gymnasium wrappers, including RunningMeanStd for tracking statistics, create_zero_array for generating zero-valued observations, and rescale_box for affine rescaling of Box spaces.
Description
This module provides shared utilities (one class and several functions) used across multiple wrapper implementations:
- RunningMeanStd -- A class that tracks the running mean, variance, and count of values using the parallel (batch-merge) variance algorithm of Chan et al., which generalizes Welford's online update. Used by the NormalizeObservation and NormalizeReward wrappers. Initialized with a configurable shape, epsilon (for numerical stability), and dtype.
- update_mean_var_count_from_moments -- A standalone function implementing the parallel mean/variance update algorithm. Computes the new mean, variance, and count from existing statistics and a new batch.
- create_zero_array -- A singledispatch function that creates a "zero-like" sample from any Gymnasium space. Unlike create_empty_array, this ensures the result is a valid sample from the space by respecting Box bounds. Supports Box, Discrete, MultiDiscrete, MultiBinary, Tuple, Dict, Sequence, Text, Graph, and OneOf spaces.
- rescale_box -- Computes the affine transformation needed to rescale a Box space to new bounds [new_min, new_max]. Returns the new Box, a forward transform function (original to rescaled), and a backward transform function (rescaled to original). Handles finite and infinite bounds correctly.
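The parallel mean/variance update described above can be sketched in plain Python. This is a minimal scalar illustration of the general technique, not the library's code: the name merge_moments is hypothetical, and it assumes population (not sample) variance and scalar inputs rather than NumPy arrays.

```python
import statistics


def merge_moments(mean, var, count, batch_mean, batch_var, batch_count):
    """Combine two (mean, population variance, count) summaries into one.

    Hypothetical helper for illustration; not part of Gymnasium.
    """
    delta = batch_mean - mean
    total = count + batch_count
    new_mean = mean + delta * batch_count / total
    # Squared deviations of each part, plus a cross term that accounts
    # for the gap between the two means.
    m2 = var * count + batch_var * batch_count + delta**2 * count * batch_count / total
    return new_mean, m2 / total, total


first = [1.0, 2.0, 3.0, 4.0]
second = [10.0, 20.0]

# Merge the summaries of two batches without revisiting the raw values.
mean, var, count = merge_moments(
    statistics.fmean(first), statistics.pvariance(first), len(first),
    statistics.fmean(second), statistics.pvariance(second), len(second),
)
```

The merged result should match the statistics computed directly over the concatenated data, which is what makes the update suitable for streaming batches of observations or rewards.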
Usage
These utilities are internal building blocks for the wrapper system. RunningMeanStd is used by normalization wrappers, create_zero_array is used by wrappers needing default/zero observations, and rescale_box is used by RescaleAction and RescaleObservation wrappers.
Code Reference
Source Location
Signature
class RunningMeanStd:
    def __init__(self, epsilon=1e-4, shape=(), dtype=np.float64): ...
    def update(self, x): ...
    def update_from_moments(self, batch_mean, batch_var, batch_count): ...

def update_mean_var_count_from_moments(mean, var, count, batch_mean, batch_var, batch_count): ...

def create_zero_array(space: Space[T_cov]) -> T_cov: ...

def rescale_box(
    box: Box,
    new_min: np.floating | np.integer | np.ndarray,
    new_max: np.floating | np.integer | np.ndarray,
) -> tuple[Box, Callable[[np.ndarray], np.ndarray], Callable[[np.ndarray], np.ndarray]]: ...
Import
from gymnasium.wrappers.utils import RunningMeanStd, create_zero_array, update_mean_var_count_from_moments, rescale_box
I/O Contract
RunningMeanStd
| Name | Type | Required | Description |
|---|---|---|---|
| epsilon | float | No | Small constant for numerical stability (default 1e-4) |
| shape | tuple | No | Shape of the tracked values (default ()) |
| dtype | numpy dtype | No | Data type for mean/var arrays (default np.float64) |
create_zero_array
| Name | Type | Required | Description |
|---|---|---|---|
| space | Space | Yes | The Gymnasium space to create a zero-valued sample for |
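To illustrate what "respecting Box bounds" means for a zero-like sample, here is a small per-element sketch. The helper zero_within_bounds is hypothetical and works on plain floats; it shows one plausible rule (use 0 when it is inside the bounds, otherwise the bound nearest to 0) and is not claimed to mirror the library's implementation line for line.

```python
def zero_within_bounds(low, high):
    """Return 0.0 if it lies in [low, high], otherwise the bound closest to zero.

    Hypothetical helper for illustration; not part of Gymnasium.
    """
    if low > 0:
        return low   # interval lies entirely above zero
    if high < 0:
        return high  # interval lies entirely below zero
    return 0.0


lows = [-1.0, 2.0, -5.0]
highs = [1.0, 5.0, -3.0]

# One zero-like value per dimension, each guaranteed to lie within its bounds.
zeros = [zero_within_bounds(lo, hi) for lo, hi in zip(lows, highs)]
```

Under this rule the three dimensions yield 0.0, 2.0, and -3.0 respectively, so every element stays a valid member of its interval.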
rescale_box
| Name | Type | Required | Description |
|---|---|---|---|
| box | Box | Yes | The Box space to rescale |
| new_min | float, int, or ndarray | Yes | The new minimum bound |
| new_max | float, int, or ndarray | Yes | The new maximum bound |
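The affine mapping behind this kind of rescaling can be sketched for a single scalar dimension as follows. The function make_affine_rescale is a hypothetical name used only for illustration; the sketch assumes finite bounds with high != low and does not attempt the infinite-bound handling that rescale_box provides.

```python
def make_affine_rescale(low, high, new_min, new_max):
    """Build forward/backward maps between [low, high] and [new_min, new_max].

    Hypothetical helper for illustration; assumes finite bounds and high != low.
    """
    gradient = (new_max - new_min) / (high - low)
    intercept = new_min - gradient * low

    def forward(x):
        # original range -> rescaled range
        return gradient * x + intercept

    def backward(y):
        # rescaled range -> original range
        return (y - intercept) / gradient

    return forward, backward


forward, backward = make_affine_rescale(-1.0, 1.0, 0.0, 1.0)
lo_mapped = forward(-1.0)                 # lower bound maps to new_min
hi_mapped = forward(1.0)                  # upper bound maps to new_max
round_trip = backward(forward(0.3))       # backward inverts forward
```

Returning the two maps as a pair mirrors the contract described above, where a wrapper applies one direction to incoming values and the other to outgoing values.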
Usage Examples
import numpy as np
from gymnasium.spaces import Box, Dict
from gymnasium.wrappers.utils import RunningMeanStd, create_zero_array, rescale_box
# RunningMeanStd: track running statistics
rms = RunningMeanStd(shape=(4,))
for _ in range(100):
    rms.update(np.random.randn(32, 4))  # each call merges a batch of 32 samples
print(rms.mean, rms.var)
# create_zero_array: generate zero-valued samples
space = Box(low=-1.0, high=1.0, shape=(3,))
zero = create_zero_array(space)
# array([0., 0., 0.])
space = Dict({"a": Box(low=2.0, high=5.0, shape=(2,))})
zero = create_zero_array(space)
# {'a': array([2., 2.])} -- respects lower bound
# rescale_box: compute affine transformation
box = Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
new_box, forward, backward = rescale_box(box, new_min=0.0, new_max=1.0)
# new_box == Box(0.0, 1.0, (2,), float32)
# forward(np.array([-1.0, 1.0])) == array([0.0, 1.0])
Related Pages