Implementation:Farama Foundation Gymnasium VectorEnv Step Reset

Knowledge Sources	Gymnasium Gymnasium Vector
Domains	Reinforcement_Learning, Parallelism
Last Updated	2026-02-15 03:00 GMT

Overview

A concrete API for batched step and reset interaction with vectorized environments, provided by the Gymnasium library.

Description

The VectorEnv abstract base class defines batched step and reset methods that act on all sub-environments in a single call. Observations are returned as arrays whose first dimension is num_envs. Rewards and the terminated and truncated flags are likewise batched, each as a 1D array of length num_envs.

Usage

Use envs.reset() to initialize all sub-environments and envs.step(actions) with a batched action array to advance all environments. Access envs.single_observation_space and envs.single_action_space for per-environment space information.

Code Reference

Source Location

Repository: Gymnasium
File: gymnasium/vector/vector_env.py
Lines: L40-200

Signature

class VectorEnv:
    num_envs: int
    single_observation_space: Space
    single_action_space: Space
    observation_space: Space  # Batched
    action_space: Space       # Batched

    def reset(
        self,
        *,
        seed: int | list[int] | None = None,
        options: dict | None = None,
    ) -> tuple[ObsType, dict]:
        """Reset all sub-environments.

        Args:
            seed: Seed(s) for the environments' PRNGs.
            options: Reset options.

        Returns:
            observations: Batched initial observations (num_envs, *obs_shape).
            infos: Batched info dictionaries.
        """

    def step(
        self, actions: ActType
    ) -> tuple[ObsType, ArrayType, ArrayType, ArrayType, dict]:
        """Step all sub-environments with batched actions.

        Args:
            actions: Batched actions of shape (num_envs, *act_shape).

        Returns:
            observations: (num_envs, *obs_shape)
            rewards: (num_envs,)
            terminateds: (num_envs,)
            truncateds: (num_envs,)
            infos: Batched info dict
        """

Import

import gymnasium as gym

envs = gym.make_vec("CartPole-v1", num_envs=4)
# envs is a VectorEnv instance

I/O Contract

Inputs

Name	Type	Required	Description
actions (step)	ndarray	Yes	Batched actions, shape (num_envs, *act_shape)
seed (reset)	int or list[int]	No	PRNG seed(s) for reproducibility
options (reset)	dict	No	Reset options

Outputs

Name	Type	Description
step()	tuple	(obs, rewards, terminateds, truncateds, infos) all batched
reset()	tuple	(obs, infos) batched

Usage Examples

A2C Data Collection

import gymnasium as gym
import numpy as np

envs = gym.make_vec("CartPole-v1", num_envs=8)
obs, infos = envs.reset(seed=42)

# Collect a rollout of T steps
T = 128
all_obs = np.zeros((T, envs.num_envs, *envs.single_observation_space.shape))
all_rewards = np.zeros((T, envs.num_envs))
all_dones = np.zeros((T, envs.num_envs))

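for t in range(T):
    all_obs[t] = obs  # observation the action at step t is chosen from
    actions = envs.action_space.sample()  # random policy placeholder, shape (8,)
    obs, rewards, terminateds, truncateds, infos = envs.step(actions)
    all_rewards[t] = rewards
    # An episode ends on either termination or truncation
    all_dones[t] = np.logical_or(terminateds, truncateds)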

envs.close()
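
Rollout buffers such as all_rewards and all_dones are commonly consumed by a return computation. The NumPy-only sketch below shows one way the done mask might be used to stop returns from leaking across episode boundaries. The helper discounted_returns is hypothetical (not part of Gymnasium), and it simplifies by assuming a value of zero after the final step rather than bootstrapping from a critic.

```python
import numpy as np

def discounted_returns(rewards, dones, gamma=0.99):
    """Hypothetical helper: per-step discounted returns for (T, num_envs) buffers."""
    returns = np.zeros_like(rewards, dtype=np.float64)
    running = np.zeros(rewards.shape[1], dtype=np.float64)
    for t in reversed(range(rewards.shape[0])):
        # A done flag at step t cuts off the return accumulated from later steps.
        running = rewards[t] + gamma * running * (1.0 - dones[t])
        returns[t] = running
    return returns

# Toy buffers: T=3 steps, 2 sub-environments; env 1 finishes an episode at t=1.
rewards = np.ones((3, 2))
dones = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
returns = discounted_returns(rewards, dones, gamma=0.5)
print(returns)  # [[1.75 1.5], [1.5 1.0], [1.0 1.0]]
```

How the step immediately after a done flag should be treated depends on the autoreset behaviour of the Gymnasium version in use, so this mask logic should be checked against that before relying on it.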

Related Pages

Implements Principle

Principle:Farama_Foundation_Gymnasium_Batched_Environment_Interaction

Requires Environment

Environment:Farama_Foundation_Gymnasium_Python_3_10_Runtime
