Implementation:Farama Foundation Gymnasium FrozenLakeEnv
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Toy_Text_Environments |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
FrozenLakeEnv implements the Frozen Lake environment, in which an agent navigates a grid of frozen tiles and holes to reach a goal. Slipperiness, map layout, and reward schedule are configurable, and the environment is registered as FrozenLake-v1.
Description
The FrozenLakeEnv class implements a gridworld navigation problem in which the player crosses a frozen lake from the start tile (S) to the goal tile (G) by walking over frozen tiles (F) while avoiding holes (H).
Map System: Two pre-defined maps are included: 4x4 (16 states) and 8x8 (64 states). Custom maps can be provided as lists of strings, one string per row. The generate_random_map(size, p, seed) helper generates a random size x size map and uses depth-first search to ensure that a path from start to goal exists; p is the probability that a tile is frozen.
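The path check described above can be illustrated with a small depth-first search. This is a minimal sketch, not the library's code: the function name has_path and the BLOCKED map are made up for illustration, and only the 4x4 layout is the standard pre-defined map.

```python
def has_path(desc: list[str]) -> bool:
    """Return True if G can be reached from S without stepping on H."""
    nrow, ncol = len(desc), len(desc[0])
    # Locate the start tile; fall back to the top-left corner if none is marked.
    start = next(
        ((r, c) for r in range(nrow) for c in range(ncol) if desc[r][c] == "S"),
        (0, 0),
    )
    stack, seen = [start], {start}
    while stack:
        r, c = stack.pop()
        if desc[r][c] == "G":
            return True
        # Explore the four neighbours: left, down, right, up.
        for dr, dc in ((0, -1), (1, 0), (0, 1), (-1, 0)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < nrow and 0 <= nc < ncol
            if inside and (nr, nc) not in seen and desc[nr][nc] != "H":
                seen.add((nr, nc))
                stack.append((nr, nc))
    return False


MAP_4X4 = ["SFFF", "FHFH", "FFFH", "HFFG"]  # the pre-defined 4x4 layout
BLOCKED = ["SH", "HG"]  # hypothetical map whose start is walled in by holes

print(has_path(MAP_4X4), has_path(BLOCKED))
```

A generator built on such a check could simply resample the grid until has_path returns True.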
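Slippery Physics: When is_slippery=True (default), the agent moves in the intended direction with probability success_rate (default 1/3) and in each of the two perpendicular directions with probability (1 - success_rate) / 2, so with the default all three outcomes are equally likely. It never slips in the direction opposite to the one intended. When is_slippery=False, the agent always moves in the intended direction. The slip models the slippery nature of ice and makes the problem stochastic.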
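As a worked illustration of the slip rule, the sketch below computes the distribution over actual movement directions for a chosen action. The helper name slip_distribution is hypothetical; it relies on the action encoding 0=left, 1=down, 2=right, 3=up, under which the two neighbours of an action modulo 4 are exactly its perpendicular directions.

```python
LEFT, DOWN, RIGHT, UP = 0, 1, 2, 3


def slip_distribution(
    action: int,
    success_rate: float = 1.0 / 3.0,
    is_slippery: bool = True,
) -> dict[int, float]:
    """Map each possible movement direction to its probability."""
    if not is_slippery:
        # Deterministic case: the intended direction is always taken.
        return {action: 1.0}
    fail_rate = (1.0 - success_rate) / 2.0
    return {
        (action - 1) % 4: fail_rate,  # one perpendicular direction
        action: success_rate,         # the intended direction
        (action + 1) % 4: fail_rate,  # the other perpendicular direction
    }


# With success_rate=0.8, moving RIGHT slips to DOWN or UP with the remaining mass.
print(slip_distribution(RIGHT, success_rate=0.8))
```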
Transition Model: Pre-computed in __init__ and stored in self.P[state][action] as a list of (probability, next_state, reward, terminated) tuples. G and H tiles are terminal: reaching either one ends the episode. Walking into the grid boundary leaves the agent in place.
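To make the layout of the transition table concrete, here is a minimal pure-Python sketch that builds a table of the same shape for the slippery case. It is an illustration under assumptions, not the library's implementation: build_transitions is a hypothetical name, and the zero-reward self-loop used for terminal tiles is an assumption of this sketch.

```python
def build_transitions(desc, success_rate=1.0 / 3.0, reward_schedule=(1, 0, 0)):
    """Return P[state][action] -> list of (prob, next_state, reward, terminated)."""
    nrow, ncol = len(desc), len(desc[0])
    goal_r, hole_r, frozen_r = reward_schedule
    moves = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}  # left, down, right, up

    def move(row, col, direction):
        dr, dc = moves[direction]
        # Clamp to the grid so that walking into a boundary stays in place.
        nr = min(max(row + dr, 0), nrow - 1)
        nc = min(max(col + dc, 0), ncol - 1)
        tile = desc[nr][nc]
        reward = goal_r if tile == "G" else hole_r if tile == "H" else frozen_r
        return nr * ncol + nc, reward, tile in ("G", "H")

    fail_rate = (1.0 - success_rate) / 2.0
    P = {}
    for row in range(nrow):
        for col in range(ncol):
            s = row * ncol + col
            P[s] = {}
            for a in range(4):
                if desc[row][col] in ("G", "H"):
                    # Assumption: terminal tiles loop onto themselves with zero reward.
                    P[s][a] = [(1.0, s, 0, True)]
                else:
                    P[s][a] = [
                        (success_rate if b == a else fail_rate, *move(row, col, b))
                        for b in ((a - 1) % 4, a, (a + 1) % 4)
                    ]
    return P


P = build_transitions(["SFFF", "FHFH", "FFFH", "HFFG"])
print(P[14][2])  # outcomes of moving right from the tile just left of the goal
```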
Reward Schedule: Configurable via reward_schedule=(goal_reward, hole_reward, frozen_reward) with default (1, 0, 0).
Action Space: Discrete(4) with 0=left, 1=down, 2=right, 3=up.
Observation: Discrete(nS) where nS = nrow * ncol. The state is the player's position as row * ncol + col.
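The state encoding row * ncol + col can be inverted with integer division and remainder. The two helper names below are hypothetical and only illustrate the arithmetic.

```python
def to_state(row: int, col: int, ncol: int) -> int:
    """Encode a grid position as a single state index."""
    return row * ncol + col


def to_row_col(state: int, ncol: int) -> tuple[int, int]:
    """Decode a state index back into (row, col)."""
    return divmod(state, ncol)


# On the 4x4 map the bottom-right tile is state 15, and state 14 sits just left of it.
print(to_state(3, 3, ncol=4), to_row_col(14, ncol=4))
```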
Rendering: Supports "human" (PyGame window), "rgb_array" (numpy pixel array), and "ansi" (colored text grid). PyGame rendering uses ice, hole, cracked-hole, goal, start, and elf sprite tiles.
Usage
Use this environment for tabular RL experimentation, policy iteration, value iteration, Q-learning, and SARSA. The slippery default makes it a useful testbed for stochastic MDPs. Create via gymnasium.make("FrozenLake-v1") or gymnasium.make("FrozenLake8x8-v1").
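As one example of the planning methods listed above, the following is a minimal value-iteration sketch over a transition table in the P[state][action] format. The function names, the discount gamma=0.9, and the three-state TOY_P table are assumptions chosen for illustration; TOY_P is not a Frozen Lake map.

```python
def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s under value estimate V."""
    return sum(
        p * (r + (0.0 if done else gamma * V[s2])) for p, s2, r, done in P[s][a]
    )


def value_iteration(P, gamma=0.9, tol=1e-8):
    """Return (V, policy) for a table P[s][a] = [(prob, next_s, reward, done), ...]."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(q_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}
    return V, policy


# Hypothetical 3-state chain: action 1 moves right, state 2 is a terminal goal.
TOY_P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 0.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 2, 1.0, True)]},
    2: {0: [(1.0, 2, 0.0, True)], 1: [(1.0, 2, 0.0, True)]},
}
V, policy = value_iteration(TOY_P)
print(V, policy)
```

The same function should accept the table exposed as env.unwrapped.P, since it only relies on the (probability, next_state, reward, terminated) tuple layout.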
Code Reference
Source Location
- Repository: Farama_Foundation_Gymnasium
- File: gymnasium/envs/toy_text/frozen_lake.py
Signature
def generate_random_map(size: int = 8, p: float = 0.8, seed: int | None = None) -> list[str]
class FrozenLakeEnv(Env):
    def __init__(
        self,
        render_mode: str | None = None,
        desc: list[str] | None = None,
        map_name: str = "4x4",
        is_slippery: bool = True,
        success_rate: float = 1.0 / 3.0,
        reward_schedule: tuple[int, int, int] = (1, 0, 0),
    )
    def step(self, a) -> tuple[int, float, bool, bool, dict]
    def reset(self, *, seed: int | None = None, options: dict | None = None) -> tuple[int, dict]
    def render(self) -> str | np.ndarray | None
Import
import gymnasium as gym
env = gym.make("FrozenLake-v1")
# With custom map
from gymnasium.envs.toy_text.frozen_lake import generate_random_map
env = gym.make("FrozenLake-v1", desc=generate_random_map(size=12, p=0.9))
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| render_mode | str or None | No | "human", "rgb_array", or "ansi" |
| desc | list[str] or None | No | Custom map as list of strings (S=start, G=goal, F=frozen, H=hole) |
| map_name | str | No | Pre-defined map name: "4x4" or "8x8" (default "4x4") |
| is_slippery | bool | No | Enable stochastic transitions (default True) |
| success_rate | float | No | Probability of moving in intended direction (default 1/3) |
| reward_schedule | tuple[int,int,int] | No | (goal, hole, frozen) rewards (default (1, 0, 0)) |
| a | int (0-3) | Yes (step) | 0=left, 1=down, 2=right, 3=up |
Outputs
| Name | Type | Description |
|---|---|---|
| observation | int | Grid position (0 to nS-1) |
| reward | float | Depends on reward_schedule (default: 1 for goal, 0 otherwise) |
| terminated | bool | True when reaching G (goal) or H (hole) |
| truncated | bool | Always False (TimeLimit wrapper handles truncation) |
| info | dict | {"prob": float} with transition probability |
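The shape of the step output can be illustrated by sampling one transition from a table in the P[state][action] format. This is a sketch under assumptions rather than the library's step method: sample_step and DEMO_P are hypothetical, and the sampling uses Python's random module instead of the environment's own random number generator.

```python
import random


def sample_step(P, state, action, rng):
    """Sample (observation, reward, terminated, truncated, info) from P."""
    transitions = P[state][action]
    weights = [t[0] for t in transitions]
    prob, next_state, reward, terminated = rng.choices(
        transitions, weights=weights, k=1
    )[0]
    # truncated is always False here; time limits are left to a wrapper.
    return next_state, reward, terminated, False, {"prob": prob}


# Hypothetical one-entry table: moving right (2) from state 0 always reaches state 1.
DEMO_P = {0: {2: [(1.0, 1, 0.0, False)]}}
print(sample_step(DEMO_P, 0, 2, random.Random(0)))
```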
Usage Examples
import gymnasium as gym
# Default 4x4 slippery lake
env = gym.make("FrozenLake-v1")
obs, info = env.reset(seed=42)
# Non-slippery for deterministic testing
env = gym.make("FrozenLake-v1", is_slippery=False)
obs, info = env.reset(seed=42)
obs, reward, terminated, truncated, info = env.step(2) # Move right
print(f"State: {obs}, Prob: {info['prob']}") # Probability is 1.0
# Custom reward schedule penalizing holes
env = gym.make("FrozenLake-v1", reward_schedule=(10, -5, -0.1))
# Random 12x12 map
from gymnasium.envs.toy_text.frozen_lake import generate_random_map
env = gym.make("FrozenLake-v1", desc=generate_random_map(size=12, p=0.85, seed=42))
# Access transition model for planning
print(env.unwrapped.P[0][2]) # Transitions from state 0 going right
env.close()