Implementation:Farama Foundation Gymnasium MountainCarEnv

Knowledge Sources	Farama_Foundation_Gymnasium Gymnasium Docs
Domains	Reinforcement_Learning, Classic_Control
Last Updated	2026-02-15 03:00 GMT

Overview

Concrete tool for the MountainCar classic control environment provided by Gymnasium.

Description

The Mountain Car MDP has deterministic transitions; the only randomness is the car's start position, which is sampled near the bottom of a sinusoidal valley. The only possible actions are discrete accelerations applied to the car in either direction, or no acceleration. The goal is to strategically accelerate the car to reach the goal state on top of the right hill. This is the discrete-action variant of the Mountain Car domain.

The transition dynamics update velocity as: velocity_new = velocity + (action - 1) * force - cos(3 * position) * gravity, where force = 0.001 and gravity = 0.0025. The velocity is clipped to [-0.07, 0.07], and the position is then updated as: position_new = position + velocity_new and clipped to [-1.2, 0.6]. Collisions with the left wall are inelastic: if the car reaches position -1.2 while moving left, its velocity is set to 0. The right boundary at 0.6 lies past the goal position of 0.5.
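The update rule above can be sketched in plain Python. This is a minimal sketch using the constants listed on this page, not the library's source: the function name `mountain_car_step` is hypothetical, and the order of operations (clip velocity, move, clip position, then check the left wall) is an assumption about how the steps are sequenced.

```python
import math

# Constants as listed on this page.
FORCE = 0.001
GRAVITY = 0.0025
MIN_POSITION, MAX_POSITION = -1.2, 0.6
MAX_SPEED = 0.07
GOAL_POSITION = 0.5


def mountain_car_step(position, velocity, action, goal_velocity=0.0):
    """Advance the state by one step (hypothetical helper, not the Gymnasium API)."""
    if action not in (0, 1, 2):
        raise ValueError("action must be 0, 1, or 2")

    # (action - 1) maps {0, 1, 2} to a push of {-1, 0, +1} times FORCE.
    velocity += (action - 1) * FORCE - math.cos(3 * position) * GRAVITY
    velocity = max(-MAX_SPEED, min(MAX_SPEED, velocity))

    position += velocity
    position = max(MIN_POSITION, min(MAX_POSITION, position))

    # Inelastic collision with the left wall: the car stops.
    if position == MIN_POSITION and velocity < 0:
        velocity = 0.0

    terminated = position >= GOAL_POSITION and velocity >= goal_velocity
    return position, velocity, terminated
```

Starting at rest at position -0.5 with action 1 (no acceleration), only the gravity term acts, so the returned velocity is slightly negative and the car drifts toward the valley floor.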

This MDP first appeared in Andrew Moore's PhD thesis (1990) at the University of Cambridge. The car's engine is too weak to climb the hill directly (force = 0.001 against a gravity term of up to 0.0025), so the agent must build momentum by oscillating back and forth in the valley. The environment terminates when the car reaches position >= 0.5 with velocity >= goal_velocity (default 0), and every step incurs a reward of -1.
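To illustrate why momentum is needed, the sketch below simulates two hand-written policies with the update rule described on this page. It does not use Gymnasium: `run_policy`, `always_right`, and `follow_momentum` are hypothetical names, the start position of -0.5 is an arbitrary choice inside the reset range, and the 200-step budget mirrors the default time limit.

```python
import math


def run_policy(policy, position=-0.5, max_steps=200):
    """Simulate one episode with the update rule described on this page.

    Returns the number of steps taken to reach the goal, or None if the
    step budget runs out first.
    """
    velocity = 0.0
    for step in range(1, max_steps + 1):
        action = policy(position, velocity)
        velocity += (action - 1) * 0.001 - math.cos(3 * position) * 0.0025
        velocity = max(-0.07, min(0.07, velocity))
        position = max(-1.2, min(0.6, position + velocity))
        if position == -1.2 and velocity < 0:
            velocity = 0.0  # inelastic stop at the left wall
        if position >= 0.5 and velocity >= 0:
            return step
    return None


def always_right(position, velocity):
    # Naive policy: push toward the goal on every step.
    return 2


def follow_momentum(position, velocity):
    # Push in the direction the car is already moving to build up energy.
    return 2 if velocity >= 0 else 0


print("always right:", run_policy(always_right))
print("follow momentum:", run_policy(follow_momentum))
```

Under these assumptions the first policy should exhaust its step budget, while the second swings up the left slope before reaching the goal.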

Usage

This environment is commonly used as a benchmark for reinforcement learning algorithms, particularly those dealing with sparse rewards and discrete action spaces. It is a classic test of exploration strategies: because the reward is -1 at every step, it carries no information about progress toward the goal, and the agent must first reach the goal by exploration before it can learn from it. The environment is well suited to Q-learning, SARSA, Monte Carlo methods, and policy gradient algorithms; tabular variants need the continuous observation to be discretized or approximated. It is also a useful teaching example of how a simple physics-based environment can defeat naive RL approaches.
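Tabular methods such as Q-learning need a finite state set, so the two-dimensional observation is usually binned first. The following is a minimal sketch of one way to do that, assuming the observation bounds given on this page; the bin count, the helper names (`to_bin`, `discretize`, `q_update`), and the learning-rate and discount values are illustrative choices, not part of Gymnasium.

```python
from collections import defaultdict

# Bounds from the observation space; the bin count is an arbitrary choice.
POSITION_BOUNDS = (-1.2, 0.6)
VELOCITY_BOUNDS = (-0.07, 0.07)
N_BINS = 20


def to_bin(value, low, high, n_bins=N_BINS):
    """Map a value in [low, high] to an integer bin index in [0, n_bins - 1]."""
    fraction = (value - low) / (high - low)
    index = int(fraction * n_bins)
    return min(max(index, 0), n_bins - 1)


def discretize(observation):
    """Turn a (position, velocity) observation into a pair of bin indices."""
    position, velocity = observation
    return (to_bin(position, *POSITION_BOUNDS), to_bin(velocity, *VELOCITY_BOUNDS))


def q_update(q_table, state, action, reward, next_state, terminated,
             alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place."""
    best_next = 0.0 if terminated else max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])


# One Q-value per action (0, 1, 2) for every discretized state seen so far.
q_table = defaultdict(lambda: [0.0, 0.0, 0.0])
state = discretize((-0.5, 0.0))
q_update(q_table, state, action=2, reward=-1.0, next_state=state, terminated=False)
```

A finer grid separates states better but takes longer to fill in, so the bin count is a tuning parameter.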

Code Reference

Source Location

Repository: Farama_Foundation_Gymnasium
File: gymnasium/envs/classic_control/mountain_car.py

Signature

class MountainCarEnv(gym.Env):
    def __init__(self, render_mode: str | None = None, goal_velocity=0):

Import

import gymnasium as gym
env = gym.make("MountainCar-v0")

I/O Contract

Inputs

Name	Type	Required	Description
action	int	Yes	Discrete action in {0, 1, 2}: accelerate left (0), no acceleration (1), accelerate right (2)

Outputs

Name	Type	Description
observation	np.ndarray (shape (2,), float32)	[position, velocity]
reward	float	-1.0 at every timestep
terminated	bool	True when position >= 0.5 and velocity >= goal_velocity
truncated	bool	False (truncation handled by TimeLimit wrapper; default 200 steps)
info	dict	Empty dictionary

Observation Space Details

Index	Observation	Min	Max	Unit
0	Position of the car along the x-axis	-1.2	0.6	position (m)
1	Velocity of the car	-0.07	0.07	velocity (m/s)

Action Space Details

Value	Action
0	Accelerate to the left
1	Don't accelerate
2	Accelerate to the right

Key Methods

Method	Description
`__init__(render_mode=None, goal_velocity=0)`	Initializes the environment with observation space Box(2,), action space Discrete(3), physics parameters, and optional goal velocity
`reset(seed=None, options=None)`	Resets position to random value in [-0.6, -0.4] with velocity=0 (customizable via options "low"/"high"); returns (observation, info)
`step(action)`	Applies discrete acceleration, updates velocity and position, checks termination, and returns (observation, reward, terminated, truncated, info)
`render()`	Renders the environment using pygame in "human" or "rgb_array" mode, showing the sinusoidal valley, car, and goal flag
`get_keys_to_action()`	Returns a mapping from keyboard keys to actions for human play (left arrow=0, right arrow=2, no key=1)
`close()`	Closes the pygame display and cleans up resources

Physics Parameters

Parameter	Value	Description
min_position	-1.2	Minimum car position (left boundary)
max_position	0.6	Maximum car position (right boundary)
max_speed	0.07	Maximum car velocity
goal_position	0.5	Target position for termination
goal_velocity	0 (default)	Minimum velocity at goal for termination
force	0.001	Acceleration force magnitude
gravity	0.0025	Gravity constant affecting car on slope

Usage Examples

import gymnasium as gym

env = gym.make("MountainCar-v0")
observation, info = env.reset(seed=42)

for _ in range(1000):
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()

Custom Goal Velocity

import gymnasium as gym

# goal_velocity must not exceed max_speed (0.07); otherwise the goal is unreachable
env = gym.make("MountainCar-v0", render_mode="rgb_array", goal_velocity=0.02)
observation, info = env.reset(seed=123)
env.close()

Related Pages

Environment:Farama_Foundation_Gymnasium_Python_3_10_Runtime
