Principle:Farama Foundation Gymnasium Action Transformation
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Action_Space_Engineering |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
Wrappers that transform, clip, rescale, or discretize actions before forwarding them to the underlying environment enable flexible action space engineering.
Description
Action transformation wrappers modify the agent's actions before they reach the environment's step function. This layer of indirection decouples the action representation seen by the learning algorithm from the action representation expected by the environment. Common transformations include clipping continuous actions to valid bounds, rescaling actions from a normalized range to the environment's native range, discretizing continuous action spaces into finite sets, and applying arbitrary user-defined functions.
The clipping transformation ensures that actions produced by the policy (which may exceed the environment bounds due to Gaussian exploration noise or unbounded policy outputs) are clamped to the valid range before execution. The rescaling transformation maps actions from one bounded range to another, which is useful when the learning algorithm assumes a standard action range (such as [-1, 1]) but the environment expects a different range. Discretization converts a continuous Box action space into a Discrete or MultiDiscrete space, enabling discrete RL algorithms to control continuous environments by selecting from a finite grid of actions.
Additionally, the sticky action wrapper introduces stochastic action repetition: at each step, with a fixed probability, the previous action is executed instead of the newly selected one. This was proposed as a way to increase environment stochasticity for Atari games and can also model actuator lag or communication delays. Both single-environment and vectorized versions of action wrappers are provided.
Usage
Use action clipping when the policy may produce out-of-bounds actions and you want to enforce validity without modifying the policy. Use action rescaling when adapting a policy trained with one action range to an environment that expects a different range. Use discretization when applying discrete RL algorithms (DQN, etc.) to continuous control tasks. Use the transform action wrapper for custom action preprocessing (e.g., adding offsets, applying non-linear mappings). Use the sticky action wrapper to increase environment stochasticity or simulate actuator delays.
Theoretical Basis
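Action transformation wrappers implement the composition pattern where the modified MDP has a transformed action space $\mathcal{A}'$ that is mapped onto the environment's native action space $\mathcal{A}$:
$$f : \mathcal{A}' \to \mathcal{A}, \qquad a \mapsto f(a)$$
where $f$ is the transformation function and $a \in \mathcal{A}'$ is the agent's action. The environment then executes $f(a)$.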
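The composition pattern can be sketched in a few lines of plain Python. `ToyEnv` and `TransformActionSketch` are hypothetical names invented for this illustration, not Gymnasium classes; the sketch only shows that the wrapped environment receives $f(a)$ rather than $a$, and that stacked wrappers compose from the outside in.

```python
from typing import Callable, List, Tuple

Action = List[float]


class ToyEnv:
    """Hypothetical stand-in environment that records the actions it receives."""

    def __init__(self) -> None:
        self.received: List[Action] = []

    def step(self, action: Action) -> Tuple[Action, float]:
        self.received.append(action)
        # Toy dynamics: the observation echoes the action and the reward is zero.
        return action, 0.0


class TransformActionSketch:
    """Applies f to every action before forwarding it to the wrapped env."""

    def __init__(self, env, f: Callable[[Action], Action]) -> None:
        self.env = env
        self.f = f

    def step(self, action: Action):
        return self.env.step(self.f(action))


def halve(action: Action) -> Action:
    return [0.5 * x for x in action]


def add_offset(action: Action) -> Action:
    return [x + 1.0 for x in action]


env = ToyEnv()
# The outermost wrapper runs first, so the environment sees halve(add_offset(a)).
wrapped = TransformActionSketch(TransformActionSketch(env, halve), add_offset)
wrapped.step([1.0, 3.0])
print(env.received)
```

Because each wrapper only touches the action on its way in, the learning algorithm and the environment can each keep their own action representation.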
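Clipping, applied elementwise with the environment's lower bound $l$ and upper bound $h$:
$$f(a) = \min(\max(a, l), h)$$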
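As a minimal sketch of the clipping transformation, the helper below clamps each action component to its bounds. `clip_action` is a hypothetical name, and plain Python lists are used only to keep the example self-contained; with NumPy arrays the same operation is `np.clip(action, low, high)`.

```python
from typing import List, Sequence


def clip_action(
    action: Sequence[float], low: Sequence[float], high: Sequence[float]
) -> List[float]:
    """Clamp each component of `action` into [low_i, high_i]."""
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, low, high)]


# A policy with Gaussian exploration noise may overshoot the bounds.
clipped = clip_action(
    [1.7, -0.2, -3.0], low=[-1.0, -1.0, -1.0], high=[1.0, 1.0, 1.0]
)
print(clipped)
```

Components already inside the bounds pass through unchanged; only out-of-range components are moved to the nearest bound.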
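Rescaling from the agent's range $[a_{\min}, a_{\max}]$ to the environment's range $[l, h]$:
$$f(a) = l + (h - l)\,\frac{a - a_{\min}}{a_{\max} - a_{\min}}$$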
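The rescaling map is an affine interpolation between two intervals. The sketch below assumes per-dimension source and destination bounds; `rescale_action` and its parameter names are hypothetical and chosen for this illustration only.

```python
from typing import List, Sequence


def rescale_action(
    action: Sequence[float],
    src_low: Sequence[float],
    src_high: Sequence[float],
    dst_low: Sequence[float],
    dst_high: Sequence[float],
) -> List[float]:
    """Affinely map each component from [src_low, src_high] to [dst_low, dst_high]."""
    rescaled = []
    for a, sl, sh, dl, dh in zip(action, src_low, src_high, dst_low, dst_high):
        if sh == sl:
            raise ValueError("source range must have non-zero width")
        rescaled.append(dl + (dh - dl) * (a - sl) / (sh - sl))
    return rescaled


# The agent acts in [-1, 1]^2; the (hypothetical) environment expects
# [0, 2] in the first dimension and [-2, 2] in the second.
src_low, src_high = [-1.0, -1.0], [1.0, 1.0]
dst_low, dst_high = [0.0, -2.0], [2.0, 2.0]

mid = rescale_action([0.0, 0.5], src_low, src_high, dst_low, dst_high)
lowest = rescale_action(src_low, src_low, src_high, dst_low, dst_high)
highest = rescale_action(src_high, src_low, src_high, dst_low, dst_high)
print(mid, lowest, highest)
```

The endpoints of the source range land exactly on the endpoints of the destination range, which is a quick sanity check for any rescaling implementation.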
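Discretization of a $d$-dimensional continuous space into $n$ bins per dimension:
$$f(k_1, \dots, k_d) = \bigl(c_1(k_1), \dots, c_d(k_d)\bigr), \qquad k_i \in \{0, \dots, n-1\}$$
where $c_i(k)$ is the continuous value representing bin $k$ in dimension $i$, for example the bin center $c_i(k) = l_i + \left(k + \tfrac{1}{2}\right)\frac{h_i - l_i}{n}$.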
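One way to pick the representative value of each bin is its center. The sketch below assumes equal-width bins and bin centers; this is an illustrative choice, and the text does not state which representative value the wrapper uses. `bin_centers` is a hypothetical helper name.

```python
from typing import List


def bin_centers(low: float, high: float, n_bins: int) -> List[float]:
    """Centers of n_bins equal-width bins covering [low, high]."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    width = (high - low) / n_bins
    return [low + (k + 0.5) * width for k in range(n_bins)]


# Four bins over [-1, 1]: each bin is 0.5 wide.
centers = bin_centers(-1.0, 1.0, 4)
print(centers)
```

A discrete agent choosing bin index `k` in this dimension would then send `centers[k]` to the continuous environment.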
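Sticky action with repeat probability $p$ (a method sketch; `self.last_action` is assumed to start as `None` and to be cleared on reset, and `self.np_random` is assumed to be the wrapper's random generator):

    def action(self, action):
        # With probability p, re-emit the previous action (once one exists).
        repeat = self.np_random.uniform() < self.repeat_probability
        if self.last_action is not None and repeat:
            return self.last_action
        # Otherwise accept the new action and remember it for the next step.
        self.last_action = action
        return action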
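The sticky-action rule can be exercised outside any environment. The sketch below is a standalone illustration using the standard library's `random.Random` in place of an environment's generator; `StickyActionSketch` and `executed_sequence` are hypothetical names, not the library's classes.

```python
import random
from typing import List, Optional


class StickyActionSketch:
    """Repeats the previously executed action with a fixed probability."""

    def __init__(self, repeat_probability: float, seed: int = 0) -> None:
        if not 0.0 <= repeat_probability <= 1.0:
            raise ValueError("repeat_probability must lie in [0, 1]")
        self.repeat_probability = repeat_probability
        self.rng = random.Random(seed)
        self.last_action: Optional[int] = None

    def action(self, action: int) -> int:
        # random() returns a float in [0, 1), so p = 0 never repeats
        # and p = 1 always repeats once a previous action exists.
        if (
            self.last_action is not None
            and self.rng.random() < self.repeat_probability
        ):
            return self.last_action
        self.last_action = action
        return action


def executed_sequence(p: float, requested: List[int], seed: int = 0) -> List[int]:
    """Actions actually executed when `requested` passes through the wrapper."""
    sticky = StickyActionSketch(p, seed)
    return [sticky.action(a) for a in requested]


requested = [0, 1, 2, 3, 4]
never_sticky = executed_sequence(0.0, requested)   # every new action passes through
always_sticky = executed_sequence(1.0, requested)  # the first action repeats forever
print(never_sticky, always_sticky)
```

The two extreme probabilities give deterministic behavior, which makes them convenient test cases; intermediate values produce runs of repeated actions whose lengths depend on the random seed.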
The discretized action space has $n^d$ total actions for a $d$-dimensional space with $n$ bins per dimension, or $\prod_{i=1}^{d} n_i$ for per-dimension bin counts $n_1, \dots, n_d$. The wrapper converts a single integer index back to the corresponding continuous action via multi-dimensional unraveling.
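Multi-dimensional unraveling turns one flat index into one bin index per dimension. The sketch below assumes row-major ordering (the last dimension varies fastest), which matches the default of NumPy's `np.unravel_index`; the ordering the wrapper actually uses is not stated in this text. `unravel` is a hypothetical helper name.

```python
from math import prod
from typing import List, Sequence


def unravel(index: int, bins: Sequence[int]) -> List[int]:
    """Convert a flat index into per-dimension bin indices (row-major order)."""
    if not 0 <= index < prod(bins):
        raise IndexError("flat index out of range")
    coords = []
    # Peel off the fastest-varying (last) dimension first.
    for n in reversed(bins):
        index, k = divmod(index, n)
        coords.append(k)
    return coords[::-1]


bins = [3, 4]                # per-dimension bin counts n_1, n_2
total_actions = prod(bins)   # size of the flat Discrete space
coords = unravel(7, bins)    # 7 = 1 * 4 + 3
print(total_actions, coords)
```

Each per-dimension index would then be mapped to its bin's continuous value, so a single integer chosen by a discrete agent becomes a full continuous action vector.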