# Heuristic: Force Single-Threaded PyTorch in facebookresearch/habitat-lab
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Reinforcement_Learning |
| Last Updated | 2026-02-15 00:00 GMT |
## Overview
Counter-intuitive performance optimization: forcing PyTorch to single-threaded mode significantly speeds up RL training by avoiding parallel memory copy overhead.
## Description
PyTorch increasingly parallelizes internal memory copy operations across threads. In Habitat-Lab RL training, CPU-side operations are dominated by simple memory copies (rollout buffer management, observation transfers) rather than compute-heavy operations. The parallelization overhead (thread creation, synchronization) dramatically slows down these lightweight operations. Setting `force_torch_single_threaded=True` eliminates this overhead.
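Habitat-Lab applies this flag by capping PyTorch's thread pools at process startup. A minimal sketch of the equivalent calls (assuming the flag maps onto PyTorch's standard threading knobs, `torch.set_num_threads` and `torch.set_num_interop_threads`):

```python
import torch

def force_single_threaded() -> None:
    """Cap PyTorch's CPU thread pools at 1, as the Habitat flag does."""
    # Inter-op pool: parallel execution of independent ops. Must be set
    # before any inter-op parallel work runs, so call this early.
    torch.set_num_interop_threads(1)
    # Intra-op pool: parallelism *inside* one op (memory copies,
    # elementwise kernels) - the overhead source this heuristic targets.
    torch.set_num_threads(1)

force_single_threaded()
print(torch.get_num_threads(), torch.get_num_interop_threads())
```

Call this once, before constructing models or rollout buffers; setting the inter-op pool after parallel work has started raises a `RuntimeError`.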
## Usage
Use this heuristic whenever running RL training in Habitat-Lab (PPO, DD-PPO, VER). The config default is `False` (to match standard PyTorch behavior), but all provided training configs set it to `True`, and custom configs should do the same.
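In a custom training config, the override is a single line (the file name here is hypothetical; the key path follows the structured config shown below):

```yaml
# my_experiment.yaml
habitat_baselines:
  force_torch_single_threaded: True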
## The Insight (Rule of Thumb)
- Action: Set `habitat_baselines.force_torch_single_threaded: True` in your training config.
- Value: `True` (boolean flag).
- Trade-off: None observed in practice. The default is `False` only for compatibility with standard PyTorch behavior, not because `True` has downsides.
- Scope: Affects all CPU-side PyTorch operations in the training process.
## Reasoning
The Habitat-Lab team documented this directly in the config dataclass with an explanatory comment. The insight is that RL training workloads differ fundamentally from typical deep learning training: the CPU is used primarily for environment simulation and data movement, not matrix math. Parallel memory copies add synchronization overhead that exceeds the copy time itself, making single-threaded execution faster.
Code evidence from `habitat-baselines/habitat_baselines/config/default_structured_configs.py:479-487`:
```python
# For our use case, the CPU side things are mainly memory copies
# and nothing of substantive compute. PyTorch has been making
# more and more memory copies parallel, but that just ends up
# slowing those down dramatically and reducing our perf.
# This forces it to be single threaded. The default
# value is left as false as it's different from how
# PyTorch normally behaves, but all configs we provide
# set it to true and yours likely should too
force_torch_single_threaded: bool = False
```
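The claim that parallel memory copies can be slower than serial ones is easy to sanity-check outside Habitat with a micro-benchmark. A hedged sketch (results vary by machine, core count, and tensor size; on a single-core runner the two timings will be identical):

```python
import time
import torch

def time_copies(num_threads: int, n_copies: int = 200,
                shape=(64, 128, 128)) -> float:
    """Time repeated CPU tensor copies, the kind of op that dominates
    rollout-buffer management in RL training."""
    torch.set_num_threads(num_threads)
    src = torch.randn(shape)
    dst = torch.empty_like(src)
    start = time.perf_counter()
    for _ in range(n_copies):
        dst.copy_(src)  # plain memory copy, no substantive compute
    return time.perf_counter() - start

default_threads = torch.get_num_threads()
t_multi = time_copies(default_threads)
t_single = time_copies(1)
print(f"{default_threads} threads: {t_multi:.3f}s | 1 thread: {t_single:.3f}s")
```

Run this on your training node: if the single-threaded timing matches or beats the multi-threaded one for copy-sized tensors, the heuristic applies to your hardware.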