Implementation:Farama Foundation Gymnasium Transform Reward Wrappers

Knowledge Sources	Farama_Foundation_Gymnasium Gymnasium Docs
Domains	Reinforcement_Learning, Wrappers
Last Updated	2026-02-15 03:00 GMT

Overview

Two reward transformation wrappers that modify the reward returned by an environment's step function: TransformReward applies a user-defined function to the reward, and ClipReward clips the reward to a given range.

Description

This module provides two reward wrappers:

TransformReward -- A general-purpose reward wrapper that applies a user-provided callable to the reward returned by the environment's step function. The function takes a reward value (SupportsFloat) and returns a transformed reward. This is the base class for other reward transformations.

ClipReward -- A subclass of TransformReward that clips rewards between a minimum and maximum bound using np.clip. At least one of min_reward or max_reward must be provided; when both are given, min_reward must be strictly less than max_reward.

Both wrappers have vector versions available in gymnasium.wrappers.vector.
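The relationship between the two wrappers described above can be illustrated without the library itself. The following is a minimal pure-Python sketch, not Gymnasium's implementation: ConstantRewardEnv, SketchTransformReward, and SketchClipReward are hypothetical names invented for illustration, the clipping handles scalar bounds only, the ordering check between the two bounds is omitted, and raising ValueError is an assumption of the sketch.

```python
from typing import Callable, Optional, SupportsFloat, Tuple


class ConstantRewardEnv:
    """Hypothetical stand-in environment: every step yields the same reward."""

    def __init__(self, reward: float) -> None:
        self.reward = reward

    def step(self, action: int) -> Tuple[int, float, bool, bool, dict]:
        # Same 5-tuple shape as a Gymnasium step:
        # (observation, reward, terminated, truncated, info)
        return 0, self.reward, False, False, {}


class SketchTransformReward:
    """Illustrative wrapper: applies `func` to the reward and nothing else."""

    def __init__(self, env, func: Callable[[SupportsFloat], SupportsFloat]) -> None:
        self.env = env
        self.func = func

    def step(self, action: int):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Only the reward is changed; every other element passes through.
        return obs, self.func(reward), terminated, truncated, info


class SketchClipReward(SketchTransformReward):
    """Illustrative subclass: clipping expressed as a particular `func`."""

    def __init__(self, env, min_reward: Optional[float] = None,
                 max_reward: Optional[float] = None) -> None:
        # Assumption of this sketch: reject the call if no bound is given.
        if min_reward is None and max_reward is None:
            raise ValueError("provide at least one of min_reward or max_reward")

        def clip(reward: SupportsFloat) -> float:
            value = float(reward)
            if min_reward is not None:
                value = max(value, min_reward)  # enforce the lower bound
            if max_reward is not None:
                value = min(value, max_reward)  # enforce the upper bound
            return value

        super().__init__(env, clip)


doubled = SketchTransformReward(ConstantRewardEnv(1.0), lambda r: 2 * r + 1)
capped = SketchClipReward(ConstantRewardEnv(1.0), max_reward=0.5)     # upper bound only
floored = SketchClipReward(ConstantRewardEnv(-3.0), min_reward=-1.0)  # lower bound only

doubled_reward = doubled.step(0)[1]
capped_reward = capped.step(0)[1]
floored_reward = floored.step(0)[1]
```

The sketch shows why a clipping wrapper can be a thin subclass: clipping is just one particular choice of func, so the subclass only has to validate its bounds and hand a closure to the parent.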

Usage

Use TransformReward for custom reward shaping (e.g., scaling, shifting, or non-linear transformations). Use ClipReward to bound rewards to a fixed range, which can help stabilize training when reward magnitudes vary significantly.
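As a rough illustration of the shaping styles mentioned above (scaling, shifting, non-linear), here are a few candidate functions written in plain Python. The names scale_and_shift, sign_reward, and log_compress are hypothetical, the default scale of 0.1 is arbitrary, and the sketch assumes scalar rewards.

```python
import math


def scale_and_shift(reward: float, scale: float = 0.1, shift: float = 0.0) -> float:
    """Linear shaping: rescale the reward, then offset it."""
    return scale * reward + shift


def sign_reward(reward: float) -> float:
    """Keep only the sign of the reward: -1.0, 0.0, or 1.0."""
    return float((reward > 0) - (reward < 0))


def log_compress(reward: float) -> float:
    """Non-linear shaping: compress large magnitudes, preserving the sign."""
    return math.copysign(math.log1p(abs(reward)), reward)


# Each function maps one reward to one reward, which is the shape `func` expects.
for raw in (-250.0, 0.0, 4.0):
    print(raw, scale_and_shift(raw), sign_reward(raw), log_compress(raw))
```

Any of these could be passed as the func argument, for example TransformReward(env, sign_reward). To use a non-default scale or shift, functools.partial can bind the extra arguments so the result still takes a single reward.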

Code Reference

Source Location

Repository: Farama_Foundation_Gymnasium
File: gymnasium/wrappers/transform_reward.py

Signature

class TransformReward(gym.RewardWrapper[ObsType, ActType], gym.utils.RecordConstructorArgs):
    def __init__(self, env: gym.Env[ObsType, ActType], func: Callable[[SupportsFloat], SupportsFloat]): ...

class ClipReward(TransformReward[ObsType, ActType], gym.utils.RecordConstructorArgs):
    def __init__(self, env: gym.Env[ObsType, ActType], min_reward: float | np.ndarray | None = None, max_reward: float | np.ndarray | None = None): ...

Import

from gymnasium.wrappers import TransformReward, ClipReward

I/O Contract

Inputs

Name	Type	Required	Description
env	Env	Yes	The environment to wrap
func	Callable	Yes (TransformReward)	Function to apply to the reward
min_reward	float, ndarray, or None	No (ClipReward)	Lower bound for reward clipping
max_reward	float, ndarray, or None	No (ClipReward)	Upper bound for reward clipping

Outputs

Name	Type	Description
reward	SupportsFloat	The transformed or clipped reward value

Usage Examples

import gymnasium as gym
from gymnasium.wrappers import TransformReward, ClipReward

# TransformReward: double the reward and add 1
env = gym.make("CartPole-v1")
env = TransformReward(env, lambda r: 2 * r + 1)
_ = env.reset()
_, rew, _, _, _ = env.step(0)
# rew == 3.0

# ClipReward: clip rewards between 0 and 0.5
env = gym.make("CartPole-v1")
env = ClipReward(env, 0, 0.5)
_ = env.reset()
_, rew, _, _, _ = env.step(1)
# rew == 0.5

Related Pages

Environment:Farama_Foundation_Gymnasium_Python_3_10_Runtime
