Implementation:Farama Foundation Gymnasium Play

Knowledge Sources	Farama_Foundation_Gymnasium Gymnasium Docs
Domains	Reinforcement_Learning, Interactive_Visualization
Last Updated	2026-02-15 03:00 GMT

Overview

Provides interactive human play capabilities for Gymnasium environments through keyboard input using PyGame, with optional live metric plotting via matplotlib.

Description

The play module lets a person control a Gymnasium environment from the keyboard while the environment's rendered frames are displayed in a PyGame window. It contains three main components:

PlayableGame is a class that wraps an environment and manages a PyGame window, tracking pressed keys and translating them into environment actions. It requires the environment to use rgb_array or rgb_array_list render mode and supports optional zoom on the rendered output. It processes PyGame events including key presses, key releases, window resize, and quit events.
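The key-tracking behavior described above can be sketched without PyGame. The class below is a hypothetical stand-in, not the library's implementation: `KeyTracker`, `press`, `release`, and `current_action` are illustrative names, and key codes are assumed to be the integers that `ord()` returns for a character.

```python
class KeyTracker:
    """Hypothetical stand-in for the key-tracking part of PlayableGame."""

    def __init__(self, keys_to_action, noop=0):
        # Store each key combination as a sorted tuple so that lookup
        # does not depend on the order in which keys were pressed.
        self.keys_to_action = {
            tuple(sorted(combo)): action for combo, action in keys_to_action.items()
        }
        # Only keys that appear in some mapping need to be tracked.
        self.relevant_keys = {k for combo in self.keys_to_action for k in combo}
        self.pressed_keys = []
        self.noop = noop

    def press(self, key):
        # Corresponds to handling a key-down event.
        if key in self.relevant_keys and key not in self.pressed_keys:
            self.pressed_keys.append(key)

    def release(self, key):
        # Corresponds to handling a key-up event.
        if key in self.pressed_keys:
            self.pressed_keys.remove(key)

    def current_action(self):
        # Unknown combinations (including "nothing pressed") fall back to noop.
        return self.keys_to_action.get(tuple(sorted(self.pressed_keys)), self.noop)


tracker = KeyTracker({(ord("a"),): 1, (ord("d"),): 2, (ord("a"), ord("d")): 3})
tracker.press(ord("d"))
tracker.press(ord("a"))
both = tracker.current_action()    # "a" and "d" held together
tracker.release(ord("d"))
only_a = tracker.current_action()  # only "a" held
tracker.release(ord("a"))
idle = tracker.current_action()    # nothing held, falls back to noop
```

Sorting the pressed keys before lookup is a design choice in this sketch that makes a combination order-independent, so pressing "d" then "a" selects the same action as pressing "a" then "d".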

play() is the main function and runs the interactive game loop. It accepts a keys_to_action dictionary that maps keyboard combinations to actions; keys may be encoded as tuples of ints, tuples of characters, or strings. The loop runs at a user-specified FPS or, when fps is None, at the environment's declared render_fps (30 if the environment declares none). An optional callback is invoked after every step with the transition data. When wait_on_player is True, interaction becomes turn-based: the environment steps only when a key is pressed.
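To make the key encoding formats concrete, here is a minimal sketch of how mixed encodings could be reduced to one canonical form. `normalize_keys` is a hypothetical helper, not part of the library's API, and treating a bare int as a single key code is an assumption made for this illustration.

```python
def normalize_keys(keys_to_action):
    """Convert mixed key encodings to sorted tuples of integer key codes.

    Hypothetical helper for illustration only.
    """
    normalized = {}
    for combo, action in keys_to_action.items():
        if isinstance(combo, int):
            combo = (combo,)  # assumption: a bare int is a single key code
        # Iterating a string yields its characters, so "wa" means
        # "w" and "a" held together.
        codes = tuple(sorted(ord(k) if isinstance(k, str) else k for k in combo))
        normalized[codes] = action
    return normalized


mapping = normalize_keys({"w": 1, ("a", "w"): 2, (ord("s"),): 3})
same_combo = normalize_keys({"wa": 2})
```

Under these assumptions, the string "wa" and the tuple ("a", "w") normalize to the same key, which is why either spelling can be used for a two-key combination.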

PlayPlot is a helper class that provides real-time matplotlib scatter plots of arbitrary metrics during play. It takes a callback function that computes metrics from each environment transition and displays them in live updating plots with a configurable time horizon.
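The configurable time horizon can be illustrated with a small buffer that keeps only the most recent points for each metric. This is a sketch under stated assumptions: `MetricHistory` and `window` are hypothetical names, the matplotlib drawing is omitted entirely, and the real class may store its data differently.

```python
from collections import deque


class MetricHistory:
    """Hypothetical sketch of PlayPlot's bookkeeping, without any plotting."""

    def __init__(self, callback, horizon_timesteps, plot_names):
        self.data_callback = callback
        self.horizon_timesteps = horizon_timesteps
        self.plot_names = plot_names
        self.t = 0
        # One bounded buffer per metric; old points drop off automatically.
        self.data = [deque(maxlen=horizon_timesteps) for _ in plot_names]

    def callback(self, obs_t, obs_tp1, action, rew, terminated, truncated, info):
        # The user callback returns one value per entry in plot_names.
        points = self.data_callback(
            obs_t, obs_tp1, action, rew, terminated, truncated, info
        )
        for series, point in zip(self.data, points):
            series.append(point)
        self.t += 1

    def window(self):
        # Range of timesteps that would currently be visible on the x-axis.
        return max(0, self.t - self.horizon_timesteps), self.t


history = MetricHistory(
    lambda obs_t, obs_tp1, action, rew, terminated, truncated, info: [rew],
    horizon_timesteps=3,
    plot_names=["reward"],
)
for step in range(5):
    history.callback(None, None, 0, float(step), False, False, {})

latest = list(history.data[0])
visible = history.window()
```

With a horizon of 3, only the last three rewards remain after five steps, which mirrors how a live plot would scroll as play continues.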

Usage

Use the play() function to manually test and debug environments by interacting with them through the keyboard. Use PlayPlot to visualize reward curves or other metrics in real time during play sessions.

Code Reference

Source Location

Repository: Farama_Foundation_Gymnasium
File: gymnasium/utils/play.py

Signature

def play(
    env: Env,
    transpose: bool | None = True,
    fps: int | None = None,
    zoom: float | None = None,
    callback: Callable | None = None,
    keys_to_action: dict[tuple[str | int, ...] | str | int, ActType] | None = None,
    seed: int | None = None,
    noop: ActType = 0,
    wait_on_player: bool = False,
) -> None

class PlayableGame:
    def __init__(self, env: Env, keys_to_action: dict | None = None, zoom: float | None = None)
    def process_event(self, event: Event)

class PlayPlot:
    def __init__(self, callback: Callable, horizon_timesteps: int, plot_names: list[str])
    def callback(self, obs_t, obs_tp1, action, rew, terminated, truncated, info)

Import

from gymnasium.utils.play import play, PlayPlot, PlayableGame

I/O Contract

Inputs

Name	Type	Required	Description
env	gymnasium.Env	Yes	The environment to play (must use rgb_array or rgb_array_list render mode)
transpose	bool or None	No	Whether to transpose the observation display (default True)
fps	int or None	No	Maximum steps per second (defaults to the environment's render_fps metadata, or 30 if unset)
zoom	float or None	No	Zoom factor for the rendered output
callback	Callable or None	No	Function called after each step with transition data
keys_to_action	dict or None	No	Mapping from key combinations to actions
seed	int or None	No	Random seed for env.reset()
noop	ActType	No	Default action when no key is pressed (default 0)
wait_on_player	bool	No	If True, wait for player input before stepping (default False)
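Two of the parameters in the table, fps and wait_on_player, amount to small decisions made inside the loop. The functions below are a hedged sketch of those decisions; `resolve_fps` and `should_step` are hypothetical names introduced here and are not library functions.

```python
def resolve_fps(metadata, fps=None):
    """Use the caller's fps if given, else the env's render_fps, else 30."""
    if fps is not None:
        return fps
    return metadata.get("render_fps", 30)


def should_step(wait_on_player, pressed_keys):
    """In turn-based mode, step only while at least one key is held."""
    return (not wait_on_player) or len(pressed_keys) > 0


# An explicit fps wins over the environment's metadata.
explicit = resolve_fps({"render_fps": 50}, fps=10)
# Without an explicit fps, the metadata value is used.
from_metadata = resolve_fps({"render_fps": 50})
# With neither, the fallback of 30 applies.
fallback = resolve_fps({})

# Real-time mode steps every frame; turn-based mode waits for a key.
realtime_idle = should_step(False, [])
turn_based_idle = should_step(True, [])
turn_based_key = should_step(True, [ord("a")])
```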

Outputs

Name	Type	Description
(none)	None	The play() function runs the interactive loop until the window is closed

Usage Examples

import gymnasium as gym
import numpy as np
from gymnasium.utils.play import play, PlayPlot

# Basic usage with the environment's default key mapping
# (Atari environments require the ale-py package to be installed)
env = gym.make("ALE/Pong-v5", render_mode="rgb_array")
play(env, zoom=3)

# Custom key-to-action mapping for a continuous action space
play(
    gym.make("CarRacing-v3", render_mode="rgb_array"),
    keys_to_action={
        "w": np.array([0, 0.7, 0], dtype=np.float32),
        "a": np.array([-1, 0, 0], dtype=np.float32),
        "s": np.array([0, 0, 1], dtype=np.float32),
        "d": np.array([1, 0, 0], dtype=np.float32),
    },
    noop=np.array([0, 0, 0], dtype=np.float32),
)

# With live reward plotting
def callback(obs_t, obs_tp1, action, rew, terminated, truncated, info):
    # Return one value per name in plot_names
    return [rew]

plotter = PlayPlot(callback, 150, ["reward"])
# CartPole provides no get_keys_to_action() mapping, so one is passed explicitly:
# "a" pushes the cart left (action 0), "d" pushes it right (action 1).
play(
    gym.make("CartPole-v1", render_mode="rgb_array"),
    keys_to_action={"a": 0, "d": 1},
    callback=plotter.callback,
)
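A metric callback does not have to be a plain function. As a sketch of plotting more than one metric, the hypothetical `EpisodeReturn` class below keeps state between steps and reports both the step reward and the running episode return; the class name and the plot names are assumptions made for this example.

```python
class EpisodeReturn:
    """Hypothetical stateful callback: step reward plus running return."""

    def __init__(self):
        self.total = 0.0

    def __call__(self, obs_t, obs_tp1, action, rew, terminated, truncated, info):
        self.total += rew
        metrics = [rew, self.total]
        # Start a fresh return once the episode ends.
        if terminated or truncated:
            self.total = 0.0
        return metrics


metric_fn = EpisodeReturn()
first = metric_fn(None, None, 0, 1.0, False, False, {})
second = metric_fn(None, None, 0, 1.0, True, False, {})  # episode ends here
third = metric_fn(None, None, 0, 1.0, False, False, {})  # new episode

# To plot both series during play (requires gymnasium, pygame, and matplotlib):
# plotter = PlayPlot(metric_fn, 150, ["reward", "episode return"])
# play(env, keys_to_action=..., callback=plotter.callback)
```

The callback returns one value per plot name, in the same order as the names passed to PlayPlot.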

Related Pages

Environment:Farama_Foundation_Gymnasium_Python_3_10_Runtime
