Heuristic:Farama Foundation Gymnasium Seeding Determinism Best Practices
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Debugging |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
Seeding pattern for reproducible environments: pass a seed once, on the first `reset()` call, and in custom environments call `super().reset(seed=seed)` as the first line of `reset()`.
Description
Gymnasium environments use a NumPy PRNG (`np_random`) for stochastic state transitions. The seeding protocol has specific semantics: calling `reset(seed=N)` with an integer initializes the PRNG deterministically, while `reset(seed=None)` leaves an already existing PRNG untouched. Custom environment implementations must call `super().reset(seed=seed)` as the first line of their `reset()` method for the seeding to work correctly. The `check_env()` validator verifies that seeded resets produce deterministic initial states and warns about common mistakes.
Usage
Use this heuristic when implementing custom environments, debugging non-deterministic behavior, or setting up reproducible experiments. It is particularly relevant when an environment produces different results across runs despite being given the same seed.
The Insight (Rule of Thumb)
- Action 1: In custom environments, the first line of `reset()` must be `super().reset(seed=seed)`.
- Action 2: Call `reset(seed=42)` once right after `gymnasium.make()`, then never pass a seed again.
- Action 3: Do not give the `seed` parameter of your custom `reset()` a non-None default; the default must be `None`.
- Action 4: For reproducible action sampling, separately seed the action space: `env.action_space.seed(123)`.
- Trade-off: Seeding provides reproducibility at the cost of fixed exploration patterns. For training, vary seeds across runs.
Reasoning
If `super().reset(seed=seed)` is not called, the environment's `_np_random` PRNG is never initialized from the seed, making the environment non-deterministic regardless of the seed passed. The `check_env()` validator catches this by calling `reset(seed=123)` twice and comparing observations.
Setting a non-None default seed (e.g., `def reset(self, seed=42)`) makes the environment always deterministic even when the user expects randomness, since every `reset()` call without an explicit seed would re-initialize the PRNG to the same state.
The pattern of "seed once, then never again" exists because `reset(seed=None)` preserves the PRNG state, allowing the sequence of states to vary naturally across episodes while remaining reproducible from the initial seed.
Code Evidence
Seeding protocol from `gymnasium/core.py:126-143`:
# reset() should (in the typical use case) be called with a seed right after
# initialization and then never again.
# For Custom environments, the first line of reset() should be
# super().reset(seed=seed) which implements the seeding correctly.
# If seed=None and the PRNG already exists, the PRNG will NOT be reset.
# Pass an integer to force reset.
Default seed validation from `gymnasium/utils/passive_env_checker.py:173-176`:
if signature.parameters["seed"].default is not None:
    logger.warn(
        "The default seed argument in `Env.reset` should be `None`, otherwise the environment will by default always be deterministic."
    )
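The same kind of check can be run by hand with the standard library's `inspect` module. This is a sketch rather than the checker's own code; `seed_default_is_none`, `good_reset`, and `bad_reset` are hypothetical names used only for illustration.

```python
import inspect


def seed_default_is_none(reset_fn):
    """Return True when reset_fn has a `seed` parameter whose default is None.

    A `seed` parameter with no default at all also returns False, because
    its default is `inspect.Parameter.empty` rather than None.
    """
    params = inspect.signature(reset_fn).parameters
    return "seed" in params and params["seed"].default is None


def good_reset(*, seed=None, options=None):
    """Stand-in with the recommended signature."""


def bad_reset(*, seed=42, options=None):
    """Stand-in with a pinned default seed (anti-pattern)."""
```

Passing a bound method such as `env.reset` works too, since `inspect.signature` omits `self` for bound methods.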
Determinism check from `gymnasium/utils/env_checker.py:93-95`:
# If seed is passed to reset, the environment MUST call super().reset(seed=seed),
# otherwise the random number generator won't be properly initialized.
Related Pages
- Implementation:Farama_Foundation_Gymnasium_Env_Step_Reset
- Implementation:Farama_Foundation_Gymnasium_Env_Subclass_Interface
- Implementation:Farama_Foundation_Gymnasium_Check_Env
- Principle:Farama_Foundation_Gymnasium_Environment_Interaction_Loop
- Principle:Farama_Foundation_Gymnasium_Custom_Environment_Implementation