

Principle:Farama Foundation Gymnasium Batched Environment Interaction

From Leeroopedia
Knowledge Sources
Domains Reinforcement_Learning, Parallelism
Last Updated 2026-02-15 03:00 GMT

Overview

An extension of the standard environment interaction protocol that operates on batches of observations and actions across multiple parallel environments.

Description

Batched Environment Interaction extends the single-environment step/reset protocol to vector environments. The key differences from single-environment interaction:

  • reset() returns observations of shape (num_envs, *obs_shape) instead of (*obs_shape)
  • step(actions) accepts actions of shape (num_envs, *act_shape) and returns batched observations, rewards, terminateds, truncateds, and infos
  • Autoreset: a sub-environment that terminates or truncates is reset automatically, without a manual reset() call; the first observation of its new episode is returned at the next step

Because observations arrive as a single array of shape (num_envs, *obs_shape), the policy network can compute actions for all sub-environments in one forward pass, which uses a GPU more efficiently than num_envs separate calls.

Usage

Use this protocol when interacting with VectorEnv instances for deep RL training. The batched interface is used by A2C, PPO, and other on-policy algorithms that collect fixed-length rollouts from multiple environments.

Theoretical Basis

Batched MDP interaction:

$$\{o_i, r_i, d_i, t_i\}_{i=1}^{N} = \text{envs.step}\left(\{a_i\}_{i=1}^{N}\right)$$

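Here $o_i$, $r_i$, $d_i$, and $t_i$ are the observation, reward, terminated flag, and truncated flag of environment $i$, and $a_i$ is its action. With automatic reset: when $d_i$ or $t_i$ is True, the next call to step returns, for environment $i$, the first observation of the automatically reset episode.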
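The reset-on-next-step behavior can be made concrete with a toy model. The class below, `ToyVectorEnv`, is hypothetical and written only for this illustration: it is not part of Gymnasium and does not reproduce its implementation. Each sub-environment is a counter that terminates on reaching its horizon. On the following step it is reset, and that step's action is ignored for it.

```python
import numpy as np

class ToyVectorEnv:
    """Hypothetical batch of counters with reset-on-next-step semantics."""

    def __init__(self, horizons):
        self.horizons = np.asarray(horizons, dtype=np.int64)
        self.num_envs = len(self.horizons)
        self.state = np.zeros(self.num_envs, dtype=np.int64)
        self.needs_reset = np.zeros(self.num_envs, dtype=bool)

    def reset(self):
        self.state[:] = 0
        self.needs_reset[:] = False
        return self.state.copy()

    def step(self, actions):
        actions = np.asarray(actions, dtype=np.float64)
        assert actions.shape == (self.num_envs,)
        # Sub-environments that finished on the previous step are reset now.
        resetting = self.needs_reset.copy()
        self.state = np.where(resetting, 0, self.state + 1)
        # A resetting sub-environment ignores its action and yields zero reward.
        rewards = np.where(resetting, 0.0, actions)
        terminated = ~resetting & (self.state >= self.horizons)
        truncated = np.zeros(self.num_envs, dtype=bool)
        self.needs_reset = terminated | truncated
        return self.state.copy(), rewards, terminated, truncated

env = ToyVectorEnv(horizons=[2, 3])
env.reset()
trace = []
for _ in range(4):
    obs, rew, term, trunc = env.step(np.ones(2))
    trace.append((obs.tolist(), term.tolist()))
    print(obs, term)
```

In this run, environment 0 terminates at the second step and returns its reset observation (0) at the third step, while environment 1 keeps running without interruption.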

Related Pages

Implemented By
