Heuristic:Facebookresearch Habitat lab Mini Batch Environment Divisibility
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Reinforcement_Learning |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Training performance can degrade when `num_environments` is not evenly divisible by `num_mini_batch`, because the resulting mini-batches have different sizes. Choose values that divide evenly.
Description
The PPO rollout storage splits environment data into mini-batches by dividing the environment indices. When `num_environments % num_mini_batch != 0`, some mini-batches will have fewer samples than others. This unevenness causes gradient estimates to vary in quality across batches, harming training stability and performance. The codebase includes both a hard assertion (`num_environments >= num_mini_batch`) and a warning for non-divisibility.
Usage
Apply this heuristic when configuring any PPO-based training run. Before launching training, verify that `habitat_baselines.num_environments` is a multiple of the PPO `num_mini_batch` setting. Common valid combinations: 16 envs / 4 batches, 32 envs / 8 batches, 64 envs / 4 batches.
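The pre-launch check described above can be sketched as a small standalone function. This is an illustrative helper, not part of Habitat-Lab: the name `check_mini_batch_config` and its return convention are assumptions made for this example. It mirrors the two conditions the trainer enforces (the hard lower bound and the divisibility check).

```python
def check_mini_batch_config(num_environments: int, num_mini_batch: int) -> bool:
    """Hypothetical pre-launch check (not a Habitat-Lab API).

    Returns True when the environments split evenly across mini-batches.
    Raises ValueError when there are fewer environments than mini-batches,
    which the trainer would reject with an assertion.
    """
    if num_mini_batch <= 0:
        raise ValueError("num_mini_batch must be a positive integer")
    if num_environments < num_mini_batch:
        raise ValueError(
            f"num_environments ({num_environments}) must be >= "
            f"num_mini_batch ({num_mini_batch})"
        )
    return num_environments % num_mini_batch == 0


if __name__ == "__main__":
    # The first three pairs are the combinations listed above; the last is uneven.
    for envs, batches in [(16, 4), (32, 8), (64, 4), (10, 4)]:
        status = "ok" if check_mini_batch_config(envs, batches) else "uneven"
        print(f"{envs} envs / {batches} batches: {status}")
```

Running the check on the values from your config before launching a long job surfaces an uneven split immediately rather than as a runtime warning.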
The Insight (Rule of Thumb)
- Action: Ensure `num_environments % num_mini_batch == 0` in your config.
- Value: Any combination where environments divide evenly (e.g., 16/4, 32/8, 64/4).
- Trade-off: Constrains the choice of `num_environments` and `num_mini_batch` values.
- Hard constraint: `num_environments >= num_mini_batch` (assertion failure if violated).
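To see which `num_mini_batch` values satisfy both the divisibility rule and the hard constraint for a given environment count, one option is to enumerate the divisors. The helper name `valid_mini_batch_counts` is hypothetical and used here only for illustration.

```python
from typing import List


def valid_mini_batch_counts(num_environments: int) -> List[int]:
    """Hypothetical helper: every num_mini_batch that divides num_environments.

    Each divisor k satisfies k <= num_environments, so the hard constraint
    holds automatically for every returned value.
    """
    if num_environments <= 0:
        raise ValueError("num_environments must be a positive integer")
    return [k for k in range(1, num_environments + 1) if num_environments % k == 0]


if __name__ == "__main__":
    print(valid_mini_batch_counts(16))  # [1, 2, 4, 8, 16]
    print(valid_mini_batch_counts(12))  # [1, 2, 3, 4, 6, 12]
```

Environment counts with many divisors (such as powers of two) leave the most freedom when tuning `num_mini_batch` later.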
Reasoning
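The rollout storage implementation uses `torch.randperm(num_environments).chunk(num_mini_batch)` to create mini-batches. When the division is uneven, each chunk holds `ceil(num_environments / num_mini_batch)` indices except possibly the last, which is shorter, and `chunk()` may return fewer chunks than requested (6 environments with 4 mini-batches yields 3 chunks of 2). As a result, gradient updates can be computed from differing amounts of data, which can lead to inconsistent learning dynamics and slower convergence.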
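The effect of uneven division on chunk sizes can be illustrated without PyTorch. The sketch below is a pure-Python stand-in that reproduces the documented sizing rule of `torch.chunk` (every chunk has `ceil(n / k)` elements except possibly the last, and fewer than `k` chunks may be returned). The function name `chunk_sizes` is an assumption for this example, and it models only the sizes, not the random permutation.

```python
import math
from typing import List


def chunk_sizes(num_environments: int, num_mini_batch: int) -> List[int]:
    """Sizes of the pieces obtained by splitting num_environments indices
    using the same sizing rule as torch.chunk (illustrative stand-in)."""
    size = math.ceil(num_environments / num_mini_batch)
    sizes = []
    remaining = num_environments
    while remaining > 0:
        take = min(size, remaining)  # the last chunk may be shorter
        sizes.append(take)
        remaining -= take
    return sizes


if __name__ == "__main__":
    print(chunk_sizes(16, 4))  # [4, 4, 4, 4]  even split
    print(chunk_sizes(10, 4))  # [3, 3, 3, 1]  last mini-batch is much smaller
    print(chunk_sizes(6, 4))   # [2, 2, 2]     only 3 mini-batches instead of 4
```

In the 10 / 4 case the final update sees a third of the data the others do, which is the imbalance the warning refers to.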
Code evidence from `habitat-baselines/habitat_baselines/common/rollout_storage.py:214-228`:
```python
assert num_environments >= num_mini_batch, (
    "Trainer requires the number of environments ({}) "
    "to be greater than or equal to the number of "
    "trainer mini batches ({}).".format(
        num_environments, num_mini_batch
    )
)
if num_environments % num_mini_batch != 0:
    warnings.warn(
        "Number of environments ({}) is not a multiple of the"
        " number of mini batches ({}). This results in mini batches"
        " of different sizes, which can harm training performance.".format(
            num_environments, num_mini_batch
        )
    )
```