Implementation:Farama Foundation Gymnasium Play
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Interactive_Visualization |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
Provides interactive human play capabilities for Gymnasium environments through keyboard input using PyGame, with optional live metric plotting via matplotlib.
Description
The play module lets a human control a Gymnasium environment from the keyboard, with frames drawn in a PyGame window. It contains three main components:
PlayableGame is a class that wraps an environment and manages a PyGame window, tracking pressed keys and translating them into environment actions. It requires the environment to use rgb_array or rgb_array_list render mode and supports optional zoom on the rendered output. It processes PyGame events including key presses, key releases, window resize, and quit events.
play() is the main function that runs the interactive game loop. It accepts a keys_to_action dictionary mapping keyboard combinations to actions, supports multiple key encoding formats (tuples of ints, tuples of characters, or strings), and runs at either the environment's declared render_fps or a user-specified FPS. A callback function can be provided to execute after every step. The wait_on_player parameter enables turn-based interaction where the environment only steps when a key is pressed.
PlayPlot is a helper class that provides real-time matplotlib scatter plots of arbitrary metrics during play. It takes a callback function that computes metrics from each environment transition and displays them in live-updating plots over a configurable time horizon (horizon_timesteps).
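The rolling time horizon described for PlayPlot can be illustrated without matplotlib. The following is a minimal sketch, not the library's implementation: `RollingMetrics` is a hypothetical class that mirrors the PlayPlot constructor arguments and callback signature, and it assumes that only the most recent `horizon_timesteps` points per metric need to be kept.

```python
from collections import deque


class RollingMetrics:
    """Hypothetical stand-in for PlayPlot's bookkeeping (no plotting)."""

    def __init__(self, callback, horizon_timesteps, plot_names):
        self.data_callback = callback      # user function returning one value per plot
        self.plot_names = plot_names
        self.t = 0                         # global timestep counter
        # One bounded buffer per metric; old points fall off automatically.
        self.data = [deque(maxlen=horizon_timesteps) for _ in plot_names]

    def callback(self, obs_t, obs_tp1, action, rew, terminated, truncated, info):
        # Compute the metrics for this transition and record them with a timestamp.
        points = self.data_callback(
            obs_t, obs_tp1, action, rew, terminated, truncated, info
        )
        for point, series in zip(points, self.data):
            series.append((self.t, point))
        self.t += 1


def reward_only(obs_t, obs_tp1, action, rew, terminated, truncated, info):
    return [rew]


metrics = RollingMetrics(reward_only, horizon_timesteps=3, plot_names=["reward"])
for step in range(5):
    metrics.callback(None, None, 0, float(step), False, False, {})
print(list(metrics.data[0]))  # only the last 3 timesteps remain
```

A plotting layer would redraw each buffer as a scatter plot after every call; that part is omitted here.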
Usage
Use the play() function to manually test and debug environments by interacting with them through the keyboard. Use PlayPlot to visualize reward curves or other metrics in real time during play sessions.
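When preparing a keys_to_action dictionary, it can help to see how the accepted key encodings (tuples of ints, tuples of characters, or strings) relate to one another. The sketch below is an illustration under assumptions: `normalize_keys` is a hypothetical helper, not part of Gymnasium, and it assumes every encoding reduces to a sorted tuple of integer key codes obtained with `ord`.

```python
from __future__ import annotations

from typing import Any


def normalize_keys(keys_to_action: dict) -> dict[tuple[int, ...], Any]:
    """Convert every supported key encoding to a sorted tuple of int key codes."""
    normalized = {}
    for combo, action in keys_to_action.items():
        if isinstance(combo, int):
            combo = (combo,)  # a bare key code is a one-key combination
        # Iterating a string yields its characters, so "wa" and ("w", "a") agree.
        codes = tuple(sorted(ord(k) if isinstance(k, str) else k for k in combo))
        normalized[codes] = action
    return normalized


mapping = normalize_keys({"wa": 1, ("d",): 2, (115,): 3, 32: 4})
print(mapping)
```

Sorting the codes makes the lookup independent of the order in which keys were pressed.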
Code Reference
Source Location
- Repository: Farama_Foundation_Gymnasium
- File: gymnasium/utils/play.py
Signature
def play(
    env: Env,
    transpose: bool | None = True,
    fps: int | None = None,
    zoom: float | None = None,
    callback: Callable | None = None,
    keys_to_action: dict[tuple[str | int, ...] | str | int, ActType] | None = None,
    seed: int | None = None,
    noop: ActType = 0,
    wait_on_player: bool = False,
) -> None
class PlayableGame:
    def __init__(self, env: Env, keys_to_action: dict | None = None, zoom: float | None = None)
    def process_event(self, event: Event)
class PlayPlot:
    def __init__(self, callback: Callable, horizon_timesteps: int, plot_names: list[str])
    def callback(self, obs_t, obs_tp1, action, rew, terminated, truncated, info)
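The key bookkeeping behind process_event can be sketched without a PyGame window. This is a simplified illustration, not the actual PlayableGame code: `KeyTracker` and its method names are hypothetical, and it assumes key combinations are stored as sorted tuples of integer key codes and that keys outside the mapping are ignored.

```python
from __future__ import annotations

from typing import Any


class KeyTracker:
    """Hypothetical stand-in for the key-tracking part of PlayableGame."""

    def __init__(self, keys_to_action: dict[tuple[int, ...], Any], noop: Any = 0):
        # Sort each combination so lookups do not depend on press order.
        self.keys_to_action = {tuple(sorted(k)): a for k, a in keys_to_action.items()}
        # Only keys that appear in some combination are tracked.
        self.relevant_keys = {code for combo in self.keys_to_action for code in combo}
        self.pressed: list[int] = []
        self.noop = noop

    def key_down(self, code: int) -> None:
        if code in self.relevant_keys and code not in self.pressed:
            self.pressed.append(code)

    def key_up(self, code: int) -> None:
        if code in self.pressed:
            self.pressed.remove(code)

    def current_action(self) -> Any:
        # Fall back to the no-op action when the pressed set matches nothing.
        return self.keys_to_action.get(tuple(sorted(self.pressed)), self.noop)


tracker = KeyTracker({(ord("a"),): 1, (ord("d"),): 2, (ord("a"), ord("d")): 3})
tracker.key_down(ord("d"))
tracker.key_down(ord("a"))
print(tracker.current_action())  # both keys held, so the combined action is chosen
```

In a real event loop, `key_down` and `key_up` would be driven by PyGame key-press and key-release events.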
Import
from gymnasium.utils.play import play, PlayPlot, PlayableGame
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| env | gymnasium.Env | Yes | The environment to play (render mode must be rgb_array or rgb_array_list) |
| transpose | bool or None | No | Whether to transpose the observation display (default True) |
| fps | int or None | No | Maximum number of environment steps per second (defaults to the environment's render_fps metadata, or 30 if that is not set) |
| zoom | float or None | No | Zoom factor for the rendered output |
| callback | Callable or None | No | Function called after each step with transition data |
| keys_to_action | dict or None | No | Mapping from key combinations to actions |
| seed | int or None | No | Random seed for env.reset() |
| noop | ActType | No | Default action when no key is pressed (default 0) |
| wait_on_player | bool | No | If True, wait for player input before stepping (default False) |
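How fps, noop, and wait_on_player interact in a single loop iteration can be sketched as two small functions. These are hypothetical helpers written for illustration, assuming the behaviour stated in the table above; they are not taken from the Gymnasium source.

```python
def resolve_fps(fps, metadata):
    """Use the explicit fps if given, else the env's render_fps, else 30."""
    if fps is not None:
        return fps
    return metadata.get("render_fps", 30)


def choose_action(pressed, keys_to_action, noop, wait_on_player):
    """Return (should_step, action) for one iteration of the play loop."""
    if wait_on_player and not pressed:
        # Turn-based mode: do not step the environment until a key is held.
        return False, None
    # Unmapped or empty key sets fall back to the no-op action.
    return True, keys_to_action.get(tuple(sorted(pressed)), noop)


keys = {(97,): 1}
print(resolve_fps(None, {"render_fps": 50}))
print(choose_action([], keys, noop=0, wait_on_player=True))
print(choose_action([], keys, noop=0, wait_on_player=False))
```

Separating the decision from the stepping makes the turn-based case easy to see: with wait_on_player set and no keys held, the environment is simply not advanced.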
Outputs
| Name | Type | Description |
|---|---|---|
| (none) | None | The play() function runs the interactive loop until the window is closed |
Usage Examples
import gymnasium as gym
from gymnasium.utils.play import play
# Basic usage with the environment's default key mapping (Atari environments require the separate ale-py package)
env = gym.make("ALE/Pong-v5", render_mode="rgb_array")
play(env, zoom=3)
# Custom key-to-action mapping
import numpy as np
play(
    gym.make("CarRacing-v3", render_mode="rgb_array"),
    keys_to_action={
        "w": np.array([0, 0.7, 0], dtype=np.float32),
        "a": np.array([-1, 0, 0], dtype=np.float32),
        "s": np.array([0, 0, 1], dtype=np.float32),
        "d": np.array([1, 0, 0], dtype=np.float32),
    },
    noop=np.array([0, 0, 0], dtype=np.float32),
)
# With live reward plotting
from gymnasium.utils.play import PlayPlot
def callback(obs_t, obs_tp1, action, rew, terminated, truncated, info):
    return [rew]
plotter = PlayPlot(callback, 150, ["reward"])
play(gym.make("CartPole-v1", render_mode="rgb_array"), callback=plotter.callback)
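A callback can also report more than one metric per step. The sketch below assumes that a callable object is accepted wherever a callback function is, and that the number of returned values should match the number of plot names; `ReturnTracker` is a hypothetical name introduced for illustration.

```python
class ReturnTracker:
    """Hypothetical callback object: per-step reward plus running episode return."""

    def __init__(self):
        self.episode_return = 0.0

    def __call__(self, obs_t, obs_tp1, action, rew, terminated, truncated, info):
        self.episode_return += float(rew)
        metrics = [float(rew), self.episode_return]
        if terminated or truncated:
            # Start a fresh running total for the next episode.
            self.episode_return = 0.0
        return metrics


tracker = ReturnTracker()
first = tracker(None, None, 0, 1.0, False, False, {})
second = tracker(None, None, 0, 1.0, True, False, {})
third = tracker(None, None, 0, 1.0, False, False, {})
print(first, second, third)
```

Under these assumptions, an instance would be passed as PlayPlot(ReturnTracker(), 150, ["reward", "episode return"]), and plotter.callback would then be given to play() as in the example above.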