Heuristic:Facebookresearch Habitat lab Mini Batch Environment Divisibility

Knowledge Sources	Habitat-Lab Core Team
Domains	Optimization, Reinforcement_Learning
Last Updated	2026-02-15 00:00 GMT

Overview

Training performance can degrade when `num_environments` is not evenly divisible by `num_mini_batch`, because the PPO mini-batches then differ in size. Choose values that divide evenly.

Description

The PPO rollout storage splits environment data into mini-batches by partitioning the environment indices. When `num_environments % num_mini_batch != 0`, the mini-batches cannot all contain the same number of environments. Gradient updates are then computed from differing amounts of data, which can harm training stability and performance. The codebase enforces a hard assertion (`num_environments >= num_mini_batch`) and emits a warning when the values do not divide evenly.

Usage

Apply this heuristic when configuring any PPO-based training run. Before launching training, verify that `habitat_baselines.num_environments` is a multiple of the PPO `num_mini_batch` setting. Common valid combinations: 16 envs / 4 batches, 32 envs / 8 batches, 64 envs / 4 batches.
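A pre-launch check along these lines can catch a bad combination before any environments are created. This is a minimal sketch: `check_mini_batch_config` is a hypothetical helper, not part of Habitat-Lab, and it only mirrors the two conditions quoted in the Reasoning section.

```python
import warnings


def check_mini_batch_config(num_environments: int, num_mini_batch: int) -> bool:
    """Hypothetical pre-launch check (not a Habitat-Lab API).

    Returns True when the values divide evenly, returns False after
    emitting a warning when they do not, and raises ValueError when
    there are fewer environments than mini-batches.
    """
    if num_environments < num_mini_batch:
        raise ValueError(
            f"num_environments ({num_environments}) must be >= "
            f"num_mini_batch ({num_mini_batch})"
        )
    if num_environments % num_mini_batch != 0:
        warnings.warn(
            f"num_environments ({num_environments}) is not a multiple of "
            f"num_mini_batch ({num_mini_batch}); mini-batches will be uneven"
        )
        return False
    return True


print(check_mini_batch_config(16, 4))  # True
print(check_mini_batch_config(64, 4))  # True
```

Returning a boolean instead of raising on non-divisibility keeps the check consistent with the codebase, which treats that case as a warning and not an error.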

The Insight (Rule of Thumb)

Action: Ensure `num_environments % num_mini_batch == 0` in your config.
Value: Any combination where environments divide evenly (e.g., 16/4, 32/8, 64/4).
Trade-off: Constrains the choice of `num_environments` and `num_mini_batch` values.
Hard constraint: `num_environments >= num_mini_batch` (assertion failure if violated).
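To work within this constraint, it can help to enumerate the choices that satisfy the rule. The sketch below uses two hypothetical helpers (neither exists in Habitat-Lab): one lists every `num_mini_batch` value that divides a given environment count, and the other rounds an environment count down to the nearest multiple of a chosen `num_mini_batch`.

```python
from typing import List


def valid_mini_batch_counts(num_environments: int) -> List[int]:
    """All num_mini_batch values that divide num_environments evenly."""
    return [
        k for k in range(1, num_environments + 1)
        if num_environments % k == 0
    ]


def round_down_to_multiple(num_environments: int, num_mini_batch: int) -> int:
    """Largest environment count <= num_environments that is a multiple
    of num_mini_batch."""
    if num_environments < num_mini_batch:
        raise ValueError("num_environments must be >= num_mini_batch")
    return (num_environments // num_mini_batch) * num_mini_batch


print(valid_mini_batch_counts(16))    # [1, 2, 4, 8, 16]
print(round_down_to_multiple(18, 4))  # 16
```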

Reasoning

The rollout storage implementation uses `torch.randperm(num_environments).chunk(num_mini_batch)` to create mini-batches. When the division is uneven, `chunk()` gives every chunk ceil(`num_environments` / `num_mini_batch`) indices except possibly a smaller last one, and it may return fewer than `num_mini_batch` chunks. Some gradient updates therefore use more data than others, which can lead to inconsistent learning dynamics and slower convergence.
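The effect of `chunk()` on mini-batch sizes can be illustrated without PyTorch. The following sketch assumes the chunking rule documented for `torch.chunk` (each piece holds ceil(n / k) items, with a possibly smaller last piece); `chunk_sizes` is a hypothetical helper written for illustration, not code from the repository.

```python
import math
from typing import List


def chunk_sizes(num_environments: int, num_mini_batch: int) -> List[int]:
    """Sizes of the pieces when num_environments indices are split using
    the torch.chunk rule: pieces of ceil(n / k) items, last one may be
    smaller, and fewer than k pieces may result."""
    if num_environments <= 0 or num_mini_batch <= 0:
        raise ValueError("both arguments must be positive")
    piece = math.ceil(num_environments / num_mini_batch)
    sizes = []
    remaining = num_environments
    while remaining > 0:
        take = min(piece, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


print(chunk_sizes(16, 4))  # [4, 4, 4, 4]
print(chunk_sizes(18, 4))  # [5, 5, 5, 3]
print(chunk_sizes(6, 4))   # [2, 2, 2]
```

The last case shows that a non-divisible pair can also reduce the number of mini-batches (three instead of the requested four), not only make their sizes unequal.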

Code evidence from `habitat-baselines/habitat_baselines/common/rollout_storage.py:214-228`:

assert num_environments >= num_mini_batch, (
    "Trainer requires the number of environments ({}) "
    "to be greater than or equal to the number of "
    "trainer mini batches ({}).".format(
        num_environments, num_mini_batch
    )
)
if num_environments % num_mini_batch != 0:
    warnings.warn(
        "Number of environments ({}) is not a multiple of the"
        " number of mini batches ({}).  This results in mini batches"
        " of different sizes, which can harm training performance.".format(
            num_environments, num_mini_batch
        )
    )
