Heuristic:Farama Foundation Gymnasium Seeding Determinism Best Practices

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, Debugging
Last Updated 2026-02-15 03:00 GMT

Overview

Seeding pattern for reproducible environments: pass a seed once, on the first `reset()` call, and make `super().reset(seed=seed)` the first line of `reset()` in custom environments.

Description

Gymnasium environments draw the randomness for stochastic state transitions from a NumPy PRNG, exposed as `np_random`. The seeding protocol has specific semantics: calling `reset(seed=N)` with an integer `N` reinitializes the PRNG deterministically, while `reset(seed=None)` leaves an existing PRNG untouched, so its stream continues from wherever it stopped. Custom environments must call `super().reset(seed=seed)` as the first line of their `reset()` method, because that call is what performs the seeding. The `check_env()` validator verifies that seeded resets produce deterministic initial states and warns about common mistakes.

Usage

Use this heuristic when implementing custom environments, debugging non-deterministic behavior, or setting up reproducible experiments. It is particularly relevant when an environment produces different results across runs even though the same seed is set.

The Insight (Rule of Thumb)

  • Action 1: In custom environments, the first line of `reset()` must be `super().reset(seed=seed)`.
  • Action 2: Call `reset(seed=42)` once right after `gymnasium.make()`, then never pass a seed again.
  • Action 3: Do not set a non-None default for the `seed` parameter in your custom `reset()` — the default must be `None`.
  • Action 4: For reproducible action sampling, separately seed the action space: `env.action_space.seed(123)`.
  • Trade-off: Seeding provides reproducibility at the cost of fixed exploration patterns. For training, vary seeds across runs.

Reasoning

If `super().reset(seed=seed)` is not called, the environment's `_np_random` PRNG is never initialized from the seed, making the environment non-deterministic regardless of the seed passed. The `check_env()` validator catches this by calling `reset(seed=123)` twice and comparing observations.

Setting a non-None default seed (e.g., `def reset(self, seed=42)`) makes the environment always deterministic even when the user expects randomness, since every `reset()` call without an explicit seed would re-initialize the PRNG to the same state.

The pattern of "seed once, then never again" exists because `reset(seed=None)` preserves the PRNG state, allowing the sequence of states to vary naturally across episodes while remaining reproducible from the initial seed.

Code Evidence

Seeding protocol from `gymnasium/core.py:126-143`:

# reset() should (in the typical use case) be called with a seed right after
# initialization and then never again.

# For Custom environments, the first line of reset() should be
# super().reset(seed=seed) which implements the seeding correctly.

# If seed=None and the PRNG already exists, the PRNG will NOT be reset.
# Pass an integer to force reset.

Default seed validation from `gymnasium/utils/passive_env_checker.py:173-176`:

if signature.parameters["seed"].default is not None:
    logger.warn(
        "The default seed argument in `Env.reset` should be `None`, otherwise the environment will by default always be deterministic."
    )
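The same kind of signature inspection can be run on your own `reset()` before handing an environment to the checker. The helper below is a hypothetical stand-in built on the standard library's `inspect` module, not Gymnasium's implementation, and the two classes are plain Python placeholders rather than real environments.

```python
import inspect


def has_none_seed_default(reset_fn):
    """Return True when reset_fn has a `seed` parameter whose default is None."""
    params = inspect.signature(reset_fn).parameters
    # A parameter without any default reports inspect.Parameter.empty,
    # which is also treated as "not None" here.
    return "seed" in params and params["seed"].default is None


class GoodEnv:
    def reset(self, *, seed=None, options=None):
        return None, {}


class BadEnv:
    def reset(self, *, seed=42, options=None):
        return None, {}


good_ok = has_none_seed_default(GoodEnv().reset)
bad_ok = has_none_seed_default(BadEnv().reset)
print(good_ok, bad_ok)
```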

Determinism check from `gymnasium/utils/env_checker.py:93-95`:

# If seed is passed to reset, the environment MUST call super().reset(seed=seed),
# otherwise the random number generator won't be properly initialized.
