Implementation:Farama Foundation Gymnasium CarRacing

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, Box2D_Environments
Last Updated 2026-02-15 03:00 GMT

Overview

CarRacing is a top-down car racing reinforcement learning environment where an agent drives a rear-wheel-drive car around a procedurally generated closed-loop track, learning from 96x96 RGB pixel observations.

Description

The CarRacing class implements a Gymnasium environment (gym.Env) that presents the agent with a top-down view of a car navigating a randomly generated race track. It is described as "the easiest control task to learn from pixels" and is a widely used benchmark for vision-based reinforcement learning. The environment uses the Box2D physics engine for world simulation and pygame for rendering. Each episode generates a new track by creating checkpoints around a deformed circle, connecting them with smooth curves, and laying down road tiles with friction properties. A FrictionDetector contact listener handles tire-road interactions, tracking which tiles the car visits and updating rewards accordingly.

The environment supports both continuous and discrete action spaces. In continuous mode, the agent provides three values: steering [-1, 1], gas [0, 1], and brake [0, 1]. In discrete mode, the agent selects from five actions: do nothing, steer right, steer left, gas, or brake. The car physics are delegated to the Car class from car_dynamics.py, which models realistic rear-wheel drive, steering dynamics, and friction. The observation is a 96x96 RGB image rendered from the top-down camera that follows and rotates with the car, with an animated zoom-in effect at the start of each episode.

The environment also supports domain randomization. When domain_randomize is enabled, background and track colors are randomized on each reset by default, and the options dictionary passed to reset() can switch this on or off for an individual reset. Visual indicators at the bottom of the rendered window show true speed, ABS sensor readings for each wheel, steering wheel position, and gyroscope data. The episode terminates in any of three cases: all track tiles have been visited (lap complete); the configured fraction of tiles (lap_complete_percent) has been visited and the car crosses the start line again; or the car leaves the playfield boundary.
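The termination rules described above can be sketched as a small pure-Python function. This is an illustrative sketch rather than the environment's own code: episode_status and its parameters are hypothetical names, and the boundary value assumes PLAYFIELD = 2000 / SCALE with SCALE = 6.0.

```python
PLAYFIELD = 2000 / 6.0  # assumed boundary distance from the origin (2000 / SCALE)


def episode_status(visited, total_tiles, crossed_start, x, y,
                   lap_complete_percent=0.95):
    """Return (terminated, lap_finished) for one step.

    Hypothetical helper mirroring the three termination cases;
    lap_finished is None while the episode is still running.
    """
    # Leaving the playfield ends the episode without a finished lap.
    if abs(x) > PLAYFIELD or abs(y) > PLAYFIELD:
        return True, False
    # Every tile has been visited.
    if visited == total_tiles:
        return True, True
    # Enough tiles visited and the car is back on the start tile.
    if crossed_start and visited / total_tiles > lap_complete_percent:
        return True, True
    return False, None
```

With 300 tiles, for example, crossing the start tile after visiting 290 of them would count as a finished lap under this sketch, while crossing it after 200 would not.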

Usage

Use CarRacing as a benchmark for training vision-based reinforcement learning agents. It is suitable for evaluating convolutional neural network policies that must learn driving behavior directly from pixel observations. The continuous action variant works well with algorithms such as PPO, SAC, and DDPG, while the discrete variant can be used with DQN and similar methods. Domain randomization support makes it useful for sim-to-sim transfer experiments.

Code Reference

Source Location

Signature

class CarRacing(gym.Env, EzPickle):
    def __init__(
        self,
        render_mode: str | None = None,
        verbose: bool = False,
        lap_complete_percent: float = 0.95,
        domain_randomize: bool = False,
        continuous: bool = True,
    )

Import

import gymnasium as gym

# Via gym.make (recommended)
env = gym.make("CarRacing-v3")

# Direct import
from gymnasium.envs.box2d.car_racing import CarRacing

I/O Contract

Inputs

Constructor Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| render_mode | str or None | No | Rendering mode; one of "human" (opens a window), "rgb_array" (returns 600x400 pixel array), "state_pixels" (returns 96x96 pixel array), or None. Default is None. |
| verbose | bool | No | If True, prints track generation debug information. Default is False. |
| lap_complete_percent | float | No | Fraction of track tiles that must be visited before a lap is considered complete (when the car crosses tile index 0 again). Default is 0.95. |
| domain_randomize | bool | No | If True, randomizes background and track colors on each reset. Default is False. |
| continuous | bool | No | If True, uses continuous action space (Box). If False, uses discrete action space (Discrete(5)). Default is True. |

Action Space (step input):

| Mode | Type | Description |
|---|---|---|
| Continuous | np.ndarray (shape (3,), float32) | [steering (-1 to +1), gas (0 to 1), brake (0 to 1)]. Negative steering turns left and positive steering turns right; the value is negated internally before it is passed to the car. |
| Discrete | int (0 to 4) | 0: do nothing, 1: steer right, 2: steer left, 3: gas, 4: brake. |
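When comparing policies across the two modes, it can help to express each discrete action as a continuous [steering, gas, brake] triple. The sketch below is only an illustration: discrete_to_continuous is a hypothetical helper, the sign convention assumes Gymnasium's documented one (-1 is full left, +1 is full right), and the default magnitudes (0.6 steering, 0.2 gas, 0.8 brake) are assumptions that should be checked against the environment source before relying on them.

```python
def discrete_to_continuous(action, steer=0.6, gas=0.2, brake=0.8):
    """Map a discrete action (0..4) to a [steering, gas, brake] list.

    Assumes positive steering means "right". The default magnitudes are
    illustrative assumptions, not values confirmed by this page.
    """
    table = {
        0: [0.0, 0.0, 0.0],     # do nothing
        1: [+steer, 0.0, 0.0],  # steer right
        2: [-steer, 0.0, 0.0],  # steer left
        3: [0.0, gas, 0.0],     # gas
        4: [0.0, 0.0, brake],   # brake
    }
    if action not in table:
        raise ValueError(f"discrete action must be in 0..4, got {action!r}")
    return table[action]
```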

Reset Parameters:

| Name | Type | Required | Description |
|---|---|---|---|
| seed | int or None | No | Random seed for reproducibility. |
| options | dict or None | No | If domain_randomize is True, pass {"randomize": True} or {"randomize": False} to control color randomization per reset. |
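How the options dictionary interacts with domain_randomize can be sketched as a small decision function. should_randomize is a hypothetical helper written for this page; it assumes that randomization defaults to on when domain_randomize is enabled and that the "randomize" key only overrides that default for a single reset.

```python
def should_randomize(domain_randomize, options=None):
    """Decide whether a reset re-randomizes colors (illustrative sketch)."""
    if not domain_randomize:
        # Without domain randomization the option has no effect.
        return False
    if isinstance(options, dict) and "randomize" in options:
        # An explicit per-reset override wins over the default.
        return bool(options["randomize"])
    # Default when domain randomization is enabled: new colors every reset.
    return True
```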

Outputs

Observation Space:

| Name | Type | Description |
|---|---|---|
| observation | np.ndarray (shape (96, 96, 3), uint8) | Top-down RGB image of the car and track rendered at 96x96 pixels. Values range from 0 to 255. |

Step Return:

| Name | Type | Description |
|---|---|---|
| observation | np.ndarray (shape (96, 96, 3), uint8) | The 96x96 RGB pixel observation of the current state. |
| reward | float | Step reward: -0.1 per frame plus 1000/N for each newly visited track tile (N = total tiles). The step reward is -100 if the car leaves the playfield. |
| terminated | bool | True if all tiles have been visited, the lap-complete fraction is met when the car reaches the start tile, or the car exits the playfield boundary. |
| truncated | bool | Always False from the environment itself; time limits are handled by the TimeLimit wrapper. |
| info | dict | Contains the "lap_finished" key (bool) when terminated: True if the lap was completed, False if the car went out of bounds. |
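The reward rule in the table can be worked through with a little arithmetic. The helpers below are a sketch with hypothetical names (step_reward, lap_return), and they assume the -100 penalty replaces the step's reward rather than being added to it.

```python
def step_reward(new_tiles, total_tiles, out_of_bounds=False):
    """Reward for one frame under the rule described above."""
    if out_of_bounds:
        return -100.0  # assumed to replace the usual step reward
    # Time penalty plus an equal share of 1000 for each newly visited tile.
    return -0.1 + new_tiles * 1000.0 / total_tiles


def lap_return(frames):
    """Total return when every tile is visited in `frames` steps."""
    return 1000.0 - 0.1 * frames


print(round(step_reward(3, 300), 2))  # three new tiles on a 300-tile track
print(round(lap_return(732), 1))      # full lap completed in 732 frames
```

Because the tile rewards always sum to 1000 over a full lap, the only way to raise the return is to finish in fewer frames.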

Key Methods

| Method | Description |
|---|---|
| reset(seed, options) | Destroys the previous world state, optionally re-randomizes colors, generates a new random track via _create_track(), creates a Car instance at the track start, and returns the initial pixel observation. |
| step(action) | Applies steering, gas, and brake to the car, advances car physics and the Box2D world by 1/FPS seconds, renders the state_pixels observation, computes step reward (tile visits minus time penalty), and checks termination conditions. |
| render() | Dispatches to _render() with the configured render_mode. Returns None for "human" mode, pixel array for "rgb_array". |
| close() | Closes the pygame display window and releases resources. |
| _create_track() | Procedurally generates a closed-loop race track by placing checkpoints, smoothly interpolating between them, finding the closed loop, creating road tile bodies with friction sensors, and adding red-white borders on sharp turns. Returns False on generation failure (retried automatically). |
| _render(mode) | Core rendering: computes camera zoom and rotation to follow the car, draws the road and grass via _render_road(), draws the car via Car.draw(), flips the surface, overlays dashboard indicators, and outputs to screen or returns pixel array. |
| _render_road(zoom, translation, angle) | Draws the background, grass patches, and road polygons with coordinate transformation. |
| _render_indicators(W, H) | Draws the dashboard at the bottom of the window: speed, ABS sensors, steering angle, and gyroscope. |
| _destroy() | Removes all road tile bodies from the Box2D world and destroys the car. |
| _init_colors() | Sets initial road, background, and grass colors (random if domain_randomize, otherwise defaults). |
| _reinit_colors(randomize) | Re-randomizes colors during reset when domain_randomize is enabled. |
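The animated zoom-in that _render() applies at the start of an episode can be pictured as a linear blend from a wide view to the final ZOOM level. The sketch below is an assumption about that blend: camera_zoom is a hypothetical helper, and the start factor of 0.1 and the one-second ramp are illustrative choices. Only ZOOM = 2.7 and SCALE = 6.0 come from the constants listed on this page.

```python
ZOOM = 2.7   # final camera zoom level
SCALE = 6.0  # world-to-pixel scale factor


def camera_zoom(t, start_factor=0.1, ramp_seconds=1.0):
    """Approximate zoom (pixels per world unit) at episode time t in seconds."""
    # Clamp progress to [0, 1] so the zoom settles once the ramp is over.
    progress = min(max(t / ramp_seconds, 0.0), 1.0)
    return SCALE * (start_factor * (1.0 - progress) + ZOOM * progress)
```

Under these assumptions the zoom grows from 0.6 at t = 0 to 16.2 once the ramp completes, after which it stays constant.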

Track Generation

The track is generated procedurally each episode:

  1. 12 checkpoints are placed around a circle of radius TRACK_RAD with random perturbation in angle and radius.
  2. A path is traced from checkpoint to checkpoint using smooth turning (TRACK_TURN_RATE) and step size (TRACK_DETAIL_STEP).
  3. The closed loop is extracted by finding where the path crosses the start angle twice.
  4. Road tiles are created as Box2D static bodies with friction sensors. Each tile tracks whether it has been visited.
  5. Sharp turns receive red-and-white border markings for visual clarity.
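Step 1 of the procedure above can be sketched with the standard library alone. This is a simplified illustration, not the environment's _create_track(): place_checkpoints is a hypothetical function, and the perturbation ranges (angle jitter within each checkpoint's own slice, radius drawn from TRACK_RAD / 3 to TRACK_RAD) are assumptions chosen for the example.

```python
import math
import random

CHECKPOINTS = 12
SCALE = 6.0
TRACK_RAD = 900 / SCALE  # base track radius in world units


def place_checkpoints(rng, n=CHECKPOINTS, track_rad=TRACK_RAD):
    """Return n (angle, x, y) checkpoints around a deformed circle."""
    slice_width = 2 * math.pi / n
    checkpoints = []
    for c in range(n):
        # Jitter the angle inside this checkpoint's slice of the circle.
        alpha = c * slice_width + rng.uniform(0, slice_width)
        # Draw a perturbed radius (assumed range).
        rad = rng.uniform(track_rad / 3, track_rad)
        checkpoints.append((alpha, rad * math.cos(alpha), rad * math.sin(alpha)))
    return checkpoints


points = place_checkpoints(random.Random(0))
for alpha, x, y in points[:3]:
    print(f"angle={alpha:.2f} rad, x={x:.1f}, y={y:.1f}")
```

Keeping each angle inside its own slice guarantees the checkpoints stay ordered around the circle, which is what lets the later steps trace a single loop through them.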

Constants

| Constant | Value | Description |
|---|---|---|
| STATE_W, STATE_H | 96, 96 | Observation image dimensions in pixels. |
| VIDEO_W, VIDEO_H | 600, 400 | RGB array render dimensions. |
| WINDOW_W, WINDOW_H | 1000, 800 | Human display window dimensions. |
| FPS | 50 | Simulation frames per second. |
| ZOOM | 2.7 | Camera zoom level (after initial animation). |
| TRACK_RAD | 900 / SCALE | Base track radius. |
| PLAYFIELD | 2000 / SCALE | Game-over boundary distance from origin. |
| SCALE | 6.0 | World-to-pixel scale factor. |
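Several of these constants are defined relative to SCALE or FPS, so it can be useful to see the derived numbers. The snippet below simply evaluates the expressions from the table; DT is a name introduced here for the 1/FPS physics time step.

```python
SCALE = 6.0
FPS = 50

TRACK_RAD = 900 / SCALE   # base track radius in world units
PLAYFIELD = 2000 / SCALE  # game-over boundary distance in world units
DT = 1.0 / FPS            # simulated seconds per step (name is an assumption)

print(f"TRACK_RAD = {TRACK_RAD:.1f}")
print(f"PLAYFIELD = {PLAYFIELD:.1f}")
print(f"DT = {DT:.2f} s")
```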

Usage Examples

import gymnasium as gym
import numpy as np

# Create CarRacing with continuous actions
env = gym.make("CarRacing-v3", render_mode="rgb_array")
observation, info = env.reset(seed=42)
print(f"Observation shape: {observation.shape}")  # (96, 96, 3)

# Run a simple loop
total_reward = 0.0
for step in range(200):
    # Continuous action: [steering, gas, brake]
    action = np.array([0.0, 0.3, 0.0], dtype=np.float32)
    observation, reward, terminated, truncated, info = env.step(action)
    total_reward += reward

    if terminated or truncated:
        print(f"Episode ended at step {step}, total reward: {total_reward:.1f}")
        print(f"Lap finished: {info.get('lap_finished', 'N/A')}")
        break

env.close()

# Discrete action variant
env_discrete = gym.make(
    "CarRacing-v3",
    continuous=False,
    render_mode="rgb_array",
)
observation, info = env_discrete.reset(seed=42)

for step in range(200):
    action = env_discrete.action_space.sample()  # Random int in {0,1,2,3,4}
    observation, reward, terminated, truncated, info = env_discrete.step(action)

    if terminated or truncated:
        break

env_discrete.close()

# Domain randomization
env_rand = gym.make("CarRacing-v3", domain_randomize=True, render_mode="rgb_array")
obs1, _ = env_rand.reset()                                  # random colors
obs2, _ = env_rand.reset(options={"randomize": False})       # keep same colors
obs3, _ = env_rand.reset(options={"randomize": True})        # new random colors
env_rand.close()

Related Pages
