Implementation:Farama Foundation Gymnasium VectorEnv Step Reset
| Knowledge Sources | |
|---|---|
| Domains | Reinforcement_Learning, Parallelism |
| Last Updated | 2026-02-15 03:00 GMT |
Overview
A concrete tool for batched step and reset interaction with vectorized environments, provided by the Gymnasium library.
Description
The VectorEnv abstract base class provides batched step and reset methods that operate across all sub-environments simultaneously. Observations are returned as arrays whose first dimension is num_envs. Rewards and the terminated and truncated flags are likewise batched, each as a 1D array of length num_envs.
Usage
Use envs.reset() to initialize all sub-environments and envs.step(actions) with a batched action array to advance all environments. Access envs.single_observation_space and envs.single_action_space for per-environment space information.
Code Reference
Source Location
- Repository: Gymnasium
- File: gymnasium/vector/vector_env.py
- Lines: L40-200
Signature
class VectorEnv:
    num_envs: int
    single_observation_space: Space
    single_action_space: Space
    observation_space: Space  # Batched
    action_space: Space  # Batched

    def reset(
        self,
        *,
        seed: int | list[int] | None = None,
        options: dict | None = None,
    ) -> tuple[ObsType, dict]:
        """Reset all sub-environments.

        Args:
            seed: Seed(s) for the environments' PRNGs.
            options: Reset options.

        Returns:
            observations: Batched initial observations (num_envs, *obs_shape).
            infos: Batched info dictionaries.
        """

    def step(
        self, actions: ActType
    ) -> tuple[ObsType, ArrayType, ArrayType, ArrayType, dict]:
        """Step all sub-environments with batched actions.

        Args:
            actions: Batched actions of shape (num_envs, *act_shape).

        Returns:
            observations: (num_envs, *obs_shape)
            rewards: (num_envs,)
            terminateds: (num_envs,)
            truncateds: (num_envs,)
            infos: Batched info dict
        """
Import
import gymnasium as gym
envs = gym.make_vec("CartPole-v1", num_envs=4)
# envs is a VectorEnv instance
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| actions (step) | ndarray | Yes | Batched actions, shape (num_envs, *act_shape) |
| seed (reset) | int or list[int] | No | PRNG seed(s) for reproducibility |
| options (reset) | dict | No | Reset options |
Outputs
| Name | Type | Description |
|---|---|---|
| step() | tuple | (obs, rewards, terminateds, truncateds, infos) all batched |
| reset() | tuple | (obs, infos) batched |
Usage Examples
A2C Data Collection
import gymnasium as gym
import numpy as np

envs = gym.make_vec("CartPole-v1", num_envs=8)
obs, infos = envs.reset(seed=42)

# Collect a rollout of T steps
T = 128
all_obs = np.zeros((T, envs.num_envs, *envs.single_observation_space.shape))
all_rewards = np.zeros((T, envs.num_envs))
all_dones = np.zeros((T, envs.num_envs))

for t in range(T):
    # Store the observation that the actions at step t are chosen from
    all_obs[t] = obs
    actions = envs.action_space.sample()  # random placeholder policy, shape (8,)
    obs, rewards, terminateds, truncateds, infos = envs.step(actions)
    all_rewards[t] = rewards
    # An episode counts as done if it either terminated or was truncated
    all_dones[t] = np.logical_or(terminateds, truncateds)

envs.close()
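Buffers shaped like all_rewards and all_dones are typically turned into per-step returns before an A2C update. The following is a minimal sketch of one way to do that, not the method prescribed by Gymnasium: the discounted_returns function, the gamma value, and the last_value bootstrap argument are all hypothetical names introduced here, and treating truncation the same as termination is a simplifying assumption.

```python
import numpy as np


def discounted_returns(rewards, dones, gamma=0.99, last_value=None):
    """Compute discounted returns for buffers of shape (T, num_envs).

    Hypothetical helper: the return is cut off wherever dones[t] is set,
    so rewards from the next episode do not leak into the previous one.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    dones = np.asarray(dones, dtype=np.float64)
    T, num_envs = rewards.shape

    # Bootstrap value for the state after the last step (zero if not given).
    running = np.zeros(num_envs) if last_value is None else np.asarray(last_value, dtype=np.float64)

    returns = np.zeros_like(rewards)
    for t in reversed(range(T)):
        # (1 - dones[t]) zeroes the future return at episode boundaries.
        running = rewards[t] + gamma * (1.0 - dones[t]) * running
        returns[t] = running
    return returns


# Tiny worked example: one environment, three steps, episode ends at step 1.
example_rewards = np.ones((3, 1))
example_dones = np.array([[0.0], [1.0], [0.0]])
example_returns = discounted_returns(example_rewards, example_dones, gamma=0.5)
print(example_returns.ravel())  # [1.5 1.  1. ]
```

Depending on the Gymnasium version and its autoreset behavior, the transition immediately following a finished episode may need separate handling before it is used for training.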