Implementation:Farama Foundation Gymnasium Transform Reward Wrappers
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Wrappers |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
A pair of reward wrappers that modify the reward returned by an environment's step function: TransformReward applies a user-defined function to the reward, and ClipReward bounds the reward to a given range.
Description
This module provides two reward wrappers:
- TransformReward -- A general-purpose reward wrapper that applies a user-provided callable to the reward returned by the environment's step function. The function takes a reward value (SupportsFloat) and returns a transformed reward. This is the base class for other reward transformations.
- ClipReward -- A subclass of TransformReward that clips rewards between a minimum and maximum bound using np.clip. At least one of min_reward or max_reward must be provided, and min_reward must be less than or equal to max_reward.
Both wrappers have vector versions available in gymnasium.wrappers.vector.
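The relationship between the two wrappers can be illustrated with a minimal, library-free sketch. It is not the Gymnasium source: `ConstantRewardEnv`, `SketchTransformReward`, and `SketchClipReward` are hypothetical names invented for illustration, plain `max`/`min` stand in for np.clip on scalar rewards, and `ValueError` is an assumed stand-in for whatever exception the library raises on invalid bounds.

```python
from typing import Callable, SupportsFloat


class ConstantRewardEnv:
    """Hypothetical stand-in environment: every step yields the same reward."""

    def __init__(self, reward: float):
        self.reward = reward

    def step(self, action):
        # Same 5-tuple shape as a Gymnasium step:
        # (observation, reward, terminated, truncated, info)
        return None, self.reward, False, False, {}


class SketchTransformReward:
    """Illustrative wrapper: applies `func` to the reward of every step."""

    def __init__(self, env, func: Callable[[SupportsFloat], SupportsFloat]):
        self.env = env
        self.func = func

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        # Only the reward is changed; everything else passes through.
        return obs, self.func(reward), terminated, truncated, info


class SketchClipReward(SketchTransformReward):
    """Illustrative subclass: builds a clipping function and reuses the parent."""

    def __init__(self, env, min_reward=None, max_reward=None):
        # Assumed validation, mirroring the constraints described above.
        if min_reward is None and max_reward is None:
            raise ValueError("at least one of min_reward or max_reward is required")
        if (
            min_reward is not None
            and max_reward is not None
            and min_reward > max_reward
        ):
            raise ValueError("min_reward must be less than or equal to max_reward")

        def clip(reward):
            # Scalar equivalent of np.clip with optional bounds.
            if min_reward is not None:
                reward = max(reward, min_reward)
            if max_reward is not None:
                reward = min(reward, max_reward)
            return reward

        super().__init__(env, clip)
```

In this sketch the clipping wrapper adds no step logic of its own: it only validates its bounds and hands a clipping function to the general-purpose wrapper, which is the design the description above attributes to ClipReward.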
Usage
Use TransformReward for custom reward shaping (e.g., scaling, shifting, or non-linear transformations). Use ClipReward to bound rewards to a fixed range, which can help stabilize training when reward magnitudes vary significantly.
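As an illustration of the kinds of shaping mentioned above, here are three candidate reward functions written as plain Python callables. The function names, the default `scale` and `offset` values, and the choice of transformations are assumptions made for this example and are not part of Gymnasium.

```python
import math


def scale_and_shift(reward, scale=0.1, offset=0.0):
    """Linear shaping: multiply by `scale`, then add `offset`."""
    return scale * float(reward) + offset


def sign_reward(reward):
    """Non-linear shaping: keep only the sign of the reward (-1.0, 0.0, or 1.0)."""
    r = float(reward)
    return float((r > 0) - (r < 0))


def log_compress(reward):
    """Non-linear shaping: compress large magnitudes while keeping the sign."""
    r = float(reward)
    return math.copysign(math.log1p(abs(r)), r)
```

Each function accepts a single reward and returns a float, so any of them could be passed as the func argument, for example TransformReward(env, sign_reward).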
Code Reference
Source Location
- Repository: Farama_Foundation_Gymnasium
- File: gymnasium/wrappers/transform_reward.py
Signature
class TransformReward(gym.RewardWrapper[ObsType, ActType], gym.utils.RecordConstructorArgs):
    def __init__(self, env: gym.Env[ObsType, ActType], func: Callable[[SupportsFloat], SupportsFloat]): ...

class ClipReward(TransformReward[ObsType, ActType], gym.utils.RecordConstructorArgs):
    def __init__(self, env: gym.Env[ObsType, ActType], min_reward: float | np.ndarray | None = None, max_reward: float | np.ndarray | None = None): ...
Import
from gymnasium.wrappers import TransformReward, ClipReward
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| env | Env | Yes | The environment to wrap |
| func | Callable | Yes (TransformReward) | Function to apply to the reward |
| min_reward | float, ndarray, or None | No (ClipReward) | Lower bound for reward clipping |
| max_reward | float, ndarray, or None | No (ClipReward) | Upper bound for reward clipping |
Outputs
| Name | Type | Description |
|---|---|---|
| reward | SupportsFloat | The transformed or clipped reward value |
Usage Examples
import gymnasium as gym
from gymnasium.wrappers import TransformReward, ClipReward

# TransformReward: double the reward and add 1
env = gym.make("CartPole-v1")
env = TransformReward(env, lambda r: 2 * r + 1)
_ = env.reset()
_, rew, _, _, _ = env.step(0)
# CartPole-v1 returns a reward of 1.0 per step, so rew == 2 * 1.0 + 1 == 3.0

# ClipReward: clip rewards between 0 and 0.5
env = gym.make("CartPole-v1")
env = ClipReward(env, 0, 0.5)
_ = env.reset()
_, rew, _, _, _ = env.step(1)
# The raw reward of 1.0 exceeds the upper bound, so rew == 0.5