Implementation:Farama Foundation Gymnasium Vector Reward Wrappers

From Leeroopedia
Revision as of 12:38, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Farama_Foundation_Gymnasium_Vector_Reward_Wrappers.md)
Knowledge Sources
Domains Reinforcement_Learning, Wrappers
Last Updated 2026-02-15 03:00 GMT

Overview

A collection of vectorized reward wrappers that transform rewards for VectorEnv environments, including TransformReward, VectorizeTransformReward, and ClipReward.

Description

This module provides vectorized versions of reward transformation wrappers:

  • TransformReward -- Applies a user-provided function to the entire reward array at once. The function receives an array of rewards (one per sub-environment) and should return a transformed array. This is the preferred approach when the transformation can operate on vectors directly.
  • VectorizeTransformReward -- Wraps a single-agent reward wrapper to work with vector environments by iterating over individual rewards and applying the wrapper's function. Uses a bare Env() instance internally to initialize the single-agent wrapper. Serves as the base class for ClipReward.
  • ClipReward -- Clips each reward to lie between a minimum and a maximum value. A thin subclass of VectorizeTransformReward that vectorizes the single-agent ClipReward wrapper.
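
The two application styles described above can be illustrated without Gymnasium. The following is a minimal NumPy sketch, not the library's implementation: `apply_whole_array` and `apply_per_element` are hypothetical helper names, and the reward values are made up for illustration.

```python
import numpy as np


def apply_whole_array(func, rewards):
    """Hypothetical helper: call func once on the full reward vector."""
    return func(rewards)


def apply_per_element(func, rewards):
    """Hypothetical helper: call func once per sub-environment reward."""
    return np.array([func(r) for r in rewards], dtype=rewards.dtype)


# Made-up rewards for three sub-environments.
rewards = np.array([-0.5, 1.0, 3.0])

# An element-wise function gives the same result in both styles.
scale_shift = lambda r: (r - 1.0) * 2.0
whole = apply_whole_array(scale_shift, rewards)
per_element = apply_per_element(scale_shift, rewards)

# A function that needs the whole batch (here: centering on the batch mean)
# is only meaningful in the whole-array style.
center = lambda r: r - np.mean(r)
centered = apply_whole_array(center, rewards)
```

In the per-element style the centering function would see one scalar at a time and return zero for each, which is why batch-level transformations fit the whole-array style better.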

Usage

Use these wrappers to transform rewards in vectorized environments. TransformReward is preferred for custom transformations that can operate on the full reward vector. ClipReward is used for bounding reward magnitudes to a fixed range.
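
As a rough illustration of what bounding rewards to a fixed range means, the sketch below applies `np.clip` directly to a made-up reward vector. It only mimics the effect and does not go through `ClipReward`. The one-sided case assumes that an omitted bound behaves like a `None` bound in `np.clip`.

```python
import numpy as np

# Made-up rewards for four sub-environments.
rewards = np.array([-3.0, 0.5, 1.9, 7.0])

# Two-sided bound: values below 0.0 become 0.0, values above 2.0 become 2.0.
clipped = np.clip(rewards, 0.0, 2.0)

# One-sided bound: passing None leaves that side unbounded.
capped = np.clip(rewards, None, 2.0)
```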

Code Reference

Source Location

Signature

class TransformReward(VectorRewardWrapper):
    def __init__(self, env: VectorEnv, func: Callable[[ArrayType], ArrayType]): ...

class VectorizeTransformReward(VectorRewardWrapper):
    def __init__(self, env: VectorEnv, wrapper: type[transform_reward.TransformReward], **kwargs: Any): ...

class ClipReward(VectorizeTransformReward):
    def __init__(self, env: VectorEnv, min_reward: float | np.ndarray | None = None,
                 max_reward: float | np.ndarray | None = None): ...

Import

from gymnasium.wrappers.vector import TransformReward, VectorizeTransformReward, ClipReward

I/O Contract

Inputs

Name        Type                     Required                        Description
env         VectorEnv                Yes                             The vector environment to wrap
func        Callable                 Yes (TransformReward)           Function to apply to the reward array
wrapper     type                     Yes (VectorizeTransformReward)  The single-agent wrapper class to vectorize
min_reward  float, ndarray, or None  No (ClipReward)                 Lower bound for reward clipping
max_reward  float, ndarray, or None  No (ClipReward)                 Upper bound for reward clipping

Outputs

Name Type Description
rewards ArrayType Transformed reward array (one value per sub-environment)
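
Because the output is expected to hold one value per sub-environment, a custom reward function should return an array with the same shape as its input. A small guard like the following could catch mistakes early. This is a sketch under stated assumptions: `checked` is a hypothetical helper, not part of Gymnasium, and the reward values are made up.

```python
import numpy as np


def checked(func):
    """Hypothetical guard: ensure func keeps one reward per sub-environment."""
    def wrapper(rewards):
        out = np.asarray(func(rewards))
        if out.shape != np.shape(rewards):
            raise ValueError(
                f"reward shape changed from {np.shape(rewards)} to {out.shape}"
            )
        return out
    return wrapper


rewards = np.array([0.2, -1.0, 4.0])  # made-up values, three sub-environments

good = checked(lambda r: r * 0.1)   # element-wise: shape preserved
bad = checked(lambda r: np.sum(r))  # reduces to a scalar: shape lost

scaled = good(rewards)
try:
    bad(rewards)
    shape_error = None
except ValueError as exc:
    shape_error = str(exc)
```

The guarded function can then be passed wherever a reward function is expected, for example as the `func` argument of `TransformReward`.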

Usage Examples

import numpy as np
import gymnasium as gym
from gymnasium.wrappers.vector import TransformReward, ClipReward

# TransformReward: scale and shift rewards
envs = gym.make_vec("MountainCarContinuous-v0", num_envs=3)
envs = TransformReward(envs, func=lambda rew: (rew - 1.0) * 2.0)
_ = envs.action_space.seed(123)
obs, info = envs.reset(seed=123)
obs, rew, term, trunc, info = envs.step(envs.action_space.sample())
envs.close()

# ClipReward: clip rewards between 0 and 2
envs = gym.make_vec("MountainCarContinuous-v0", num_envs=3)
envs = ClipReward(envs, 0.0, 2.0)
_ = envs.action_space.seed(123)
obs, info = envs.reset(seed=123)
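# Apply the same action (0.5) in all 3 sub-environments; the batch has shape (num_envs, 1)
for _ in range(10):
    obs, rew, term, trunc, info = envs.step(0.5 * np.ones((3, 1)))
envs.close()
# Every entry of rew now lies within [0.0, 2.0]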
