Heuristic: Isaac Sim IsaacGymEnvs Determinism vs. Performance Tradeoff
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Reproducibility |
| Last Updated | 2026-02-15 09:00 GMT |
Overview
Keep `torch_deterministic=False` (the default) for maximum training speed; enable it only when debugging reproducibility issues, and avoid PyTorch 1.9/1.9.1, which have determinism bugs that cause crashes in this mode.
Description
IsaacGymEnvs defaults to non-deterministic training (`torch_deterministic: False`), which enables `cudnn.benchmark=True` so cuDNN can auto-select the fastest convolution algorithms. Enabling determinism (`torch_deterministic: True`) forces `cudnn.benchmark=False` and `cudnn.deterministic=True`, calls `torch.use_deterministic_algorithms(True)`, and sets `CUBLAS_WORKSPACE_CONFIG=':4096:8'`. This trades significant training speed for bit-exact reproducibility across runs. Even with determinism enabled, however, GPU work scheduling during domain randomization can still cause divergence.
Usage
Use this heuristic when deciding between training speed and reproducibility. For regular training and hyperparameter search, keep the default (`False`). Only enable determinism when debugging non-reproducible results or when exact comparison between runs is required. Be aware that PyTorch 1.9 and 1.9.1 have known bugs that cause crashes with `torch_deterministic=True`.
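In practice this comes down to two keys in the training config. A minimal sketch of the relevant fragment (key names match the quoted behavior; the surrounding file layout is an assumption):

```yaml
# config.yaml (fragment) -- relevant keys only
seed: 42                    # fixed seed for consistent initialization
torch_deterministic: False  # keep False for production training speed
```

With IsaacGymEnvs' Hydra-based launcher these can also be overridden on the command line, e.g. `python train.py task=Ant torch_deterministic=True`, assuming the standard config layout.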
The Insight (Rule of Thumb)
- Action: Leave `torch_deterministic: False` in `config.yaml` for production training. Set `seed: 42` for consistent initialization. Use `torch_deterministic: True` only for debugging.
- Value: Default seed is 42. If `torch_deterministic=True` and `seed=-1`, the seed is forced to 42.
- Trade-off: Deterministic mode sets `cudnn.benchmark=False`, which prevents cuDNN from auto-tuning convolution algorithms per input size and so reduces training speed. It also sets `CUBLAS_WORKSPACE_CONFIG`, which fixes the cuBLAS workspace configuration (required for deterministic cuBLAS operations).
- Caveat: Even with full determinism enabled, runtime domain randomization of object scales and masses can still cause non-deterministic behavior due to CPU-to-GPU parameter passing in lower-level APIs.
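The seed fallback rule above can be sketched as a standalone function mirroring the `set_seed` logic quoted below (`pick_seed` is a hypothetical name for illustration):

```python
import random

def pick_seed(seed, torch_deterministic=False, rank=0):
    """Resolve the effective seed using IsaacGymEnvs' fallback rule."""
    if seed == -1 and torch_deterministic:
        # Deterministic runs must be repeatable, so -1 falls back to a
        # fixed base seed (42) offset by the process rank.
        return 42 + rank
    if seed == -1:
        # Non-deterministic runs may pick a fresh random seed.
        return random.randint(0, 9999)
    return seed

print(pick_seed(-1, torch_deterministic=True))  # deterministic fallback: 42
print(pick_seed(7))                             # explicit seeds pass through: 7
```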
Reasoning
GPU parallel execution is inherently non-deterministic at the floating-point level. Operations on thousands of parallel environments are scheduled by the GPU hardware, and small differences in execution order cause least-significant-bit variations that accumulate over thousands of frames. The `cudnn.benchmark=True` setting compounds this by selecting different (potentially non-deterministic) algorithms per input size.
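The accumulation effect is ordinary floating-point non-associativity; a minimal Python illustration of the same mechanism (not GPU code):

```python
# Floating-point addition is not associative: different execution orders
# disagree in the least-significant bits, which a training loop then
# amplifies over thousands of frames.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)      # False: the two orders disagree in the last bit
print(abs(left - right))  # tiny, but nonzero
```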
From `utils.py:87-113`:
```python
def set_seed(seed, torch_deterministic=False, rank=0):
    if seed == -1 and torch_deterministic:
        seed = 42 + rank
    elif seed == -1:
        seed = np.random.randint(0, 10000)
    # ...
    if torch_deterministic:
        os.environ['CUBLAS_WORKSPACE_CONFIG'] = ':4096:8'
        torch.backends.cudnn.benchmark = False
        torch.backends.cudnn.deterministic = True
        torch.use_deterministic_algorithms(True)
    else:
        torch.backends.cudnn.benchmark = True
        torch.backends.cudnn.deterministic = False
```
From `docs/reproducibility.md` on DR limitations:
> Runtime domain randomization of object scales or masses are known to cause both determinacy and simulation issues when running on the GPU due to the way those parameters are passed from CPU to GPU in lower level APIs. By default, we use the `setup_only` flag to only randomize scales and masses once before simulation starts.
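The `setup_only` mitigation is a per-parameter flag in the task's randomization config. A hedged sketch of the shape (the actor name and exact nesting are assumptions; consult the task's YAML for the real layout):

```yaml
task:
  randomization_params:
    actor_params:
      object:                  # hypothetical actor name
        scale:
          range: [0.95, 1.05]
          operation: "scaling"
          distribution: "uniform"
          setup_only: True     # randomize once, before simulation starts
```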
From `docs/reproducibility.md` on PyTorch version bugs:
> In PyTorch version 1.9 and 1.9.1 there appear to be bugs affecting the `torch_deterministic` setting, and using this mode will result in a crash.