
Implementation:Farama Foundation Gymnasium VectorEnv Step Reset

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, Parallelism
Last Updated 2026-02-15 03:00 GMT

Overview

A concrete interface for batched step and reset interaction with the vectorized environments provided by the Gymnasium library.

Description

The VectorEnv abstract base class provides batched step and reset methods that operate across all sub-environments simultaneously. Observations are returned as arrays whose leading dimension is num_envs; rewards and the terminated and truncated signals are likewise batched as 1D arrays of length num_envs.

Usage

Call envs.reset() to initialize all sub-environments, and envs.step(actions) with a batched action array to advance them in lockstep. Access envs.single_observation_space and envs.single_action_space for per-environment space information; envs.observation_space and envs.action_space are the batched equivalents.

Code Reference

Source Location

  • Repository: Gymnasium
  • File: gymnasium/vector/vector_env.py
  • Lines: L40-200

Signature

class VectorEnv:
    num_envs: int
    single_observation_space: Space
    single_action_space: Space
    observation_space: Space  # Batched
    action_space: Space       # Batched

    def reset(
        self,
        *,
        seed: int | list[int] | None = None,
        options: dict | None = None,
    ) -> tuple[ObsType, dict]:
        """Reset all sub-environments.

        Args:
            seed: Seed(s) for the environments' PRNGs.
            options: Reset options.

        Returns:
            observations: Batched initial observations (num_envs, *obs_shape).
            infos: Batched info dictionaries.
        """

    def step(
        self, actions: ActType
    ) -> tuple[ObsType, ArrayType, ArrayType, ArrayType, dict]:
        """Step all sub-environments with batched actions.

        Args:
            actions: Batched actions of shape (num_envs, *act_shape).

        Returns:
            observations: (num_envs, *obs_shape)
            rewards: (num_envs,)
            terminateds: (num_envs,)
            truncateds: (num_envs,)
            infos: Batched info dict
        """

Import

import gymnasium as gym

envs = gym.make_vec("CartPole-v1", num_envs=4)
# envs is a VectorEnv instance

I/O Contract

Inputs

Name Type Required Description
actions (step) ndarray Yes Batched actions, shape (num_envs, *act_shape)
seed (reset) int or list[int] No PRNG seed(s) for reproducibility

Outputs

Name Type Description
step() tuple (obs, rewards, terminateds, truncateds, infos) all batched
reset() tuple (obs, infos) batched

Usage Examples

A2C Data Collection

import gymnasium as gym
import numpy as np

envs = gym.make_vec("CartPole-v1", num_envs=8)
obs, infos = envs.reset(seed=42)

# Collect a rollout of T steps
T = 128
all_obs = np.zeros((T, envs.num_envs, *envs.single_observation_space.shape))
all_rewards = np.zeros((T, envs.num_envs))
all_dones = np.zeros((T, envs.num_envs))

for t in range(T):
    actions = envs.action_space.sample()  # (8,)
    obs, rewards, terminateds, truncateds, infos = envs.step(actions)
    all_obs[t] = obs
    all_rewards[t] = rewards
    all_dones[t] = np.logical_or(terminateds, truncateds)

envs.close()

Related Pages

Implements Principle

Requires Environment
