Principle:Isaac sim IsaacGymEnvs ADR State Checkpointing

**Metadata**
Knowledge Sources	IsaacGymEnvs DeXtreme
Domains	Training Persistence
Last Updated	2026-02-15 00:00 GMT

Overview

A mechanism for serializing and restoring the complete ADR (Automatic Domain Randomization) training state alongside model checkpoints, so that training can resume seamlessly. The full domain randomization state (parameter ranges, boundary evaluation queues, worker assignments, and per-environment randomized values) is saved and restored without loss of training progress.

Description

ADR training involves complex mutable state beyond model weights that must be preserved across checkpoint/resume cycles:

ADR parameter ranges: The current [lo, hi] range for each ADR parameter, which may have been expanded or contracted from the initial values over potentially millions of training steps.
Boundary evaluation queues: Per-parameter, per-direction deques of boundary worker performance results. These queues take time to fill (typically 256 entries) and losing them resets the ADR algorithm's progress toward evaluating the current boundary.
Worker type assignments: Per-environment worker type (rollout, boundary, test) and ADR mode (which parameter and direction each boundary worker is evaluating).
ADR tensor values: Per-environment values for tensor-based ADR parameters (affine noise coefficients, delay probabilities, etc.).
Environment-specific state: Task-level state such as the action moving average scalar, cube random parameters, and hand random parameters.
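
The boundary evaluation queues are the item most easily lost on resume, so a minimal sketch of persisting them may help. It assumes the queues are bounded deques keyed by (parameter, direction); the helper names, the key layout, and the parameter name cube_mass are hypothetical and are not taken from the IsaacGymEnvs code.

```python
from collections import deque

QUEUE_LEN = 256  # assumed capacity, matching the "typically 256 entries" above


def queues_to_state(queues):
    """Convert {(param, direction): deque} into plain lists for saving."""
    return {key: list(q) for key, q in queues.items()}


def queues_from_state(state, maxlen=QUEUE_LEN):
    """Rebuild bounded deques from the saved lists."""
    return {key: deque(values, maxlen=maxlen) for key, values in state.items()}


# A boundary queue that is only partly filled when the checkpoint is written.
# "cube_mass" is a hypothetical parameter name used for illustration.
queues = {("cube_mass", "hi"): deque([1.0, 0.0, 1.0], maxlen=QUEUE_LEN)}

saved = queues_to_state(queues)
restored = queues_from_state(saved)
```

Restoring the partly filled queue means the boundary evaluation continues from three recorded results rather than starting again from zero.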

The get_env_state() / set_env_state() interface captures this state as a serializable dictionary that is saved alongside model checkpoints. The rl_games training framework calls get_env_state() when saving and set_env_state() when loading.
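
As a rough illustration of this interface, the following toy class exposes the same two method names and round-trips its state through a dictionary. ToyADREnv, its attributes, and the integer worker-type encoding are assumptions made for the sketch; the real environment classes hold GPU tensors and considerably more state.

```python
import copy


class ToyADREnv:
    """Hypothetical stand-in for an ADR environment; not the real class."""

    def __init__(self):
        self.adr_ranges = {"cube_mass": [0.9, 1.1]}  # hypothetical parameter
        self.worker_types = [0, 0, 1, 2]  # assumed encoding: rollout/boundary/test
        self.action_moving_average = 1.0

    def get_env_state(self):
        # Deep-copy so later training steps cannot mutate the saved snapshot.
        return copy.deepcopy({
            "adr_ranges": self.adr_ranges,
            "worker_types": self.worker_types,
            "action_moving_average": self.action_moving_average,
        })

    def set_env_state(self, env_state):
        self.adr_ranges = copy.deepcopy(env_state["adr_ranges"])
        self.worker_types = list(env_state["worker_types"])
        self.action_moving_average = env_state["action_moving_average"]


# Simulate training that widened a range, then resume in a fresh environment.
trained = ToyADREnv()
trained.adr_ranges["cube_mass"] = [0.5, 1.8]
snapshot = trained.get_env_state()

resumed = ToyADREnv()
resumed.set_env_state(snapshot)
```

Copying on the way out is a design choice in this sketch: it keeps the snapshot stable even if the environment keeps training before the checkpoint is written to disk.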

Usage

State checkpointing is transparent to the user: it happens automatically whenever a checkpoint is saved during training. To resume ADR training from a checkpoint:

python train.py task=AllegroHandDextremeADR checkpoint=runs/AllegroHandADR/.../nn/last.pth

The adr_load_from_checkpoint flag controls whether ADR parameters are restored from the checkpoint or re-initialized:

task:
  adr:
    adr_load_from_checkpoint: true   # Restore ADR ranges from checkpoint
    # adr_load_from_checkpoint: false  # Start with fresh init_range values
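
One way to picture the effect of this flag is a small selection function that picks either the checkpointed ranges or the configured initial ranges. This is a sketch under assumptions: select_adr_ranges, the adr_params key, and the parameter name are hypothetical, and the actual gating logic in IsaacGymEnvs may differ.

```python
def select_adr_ranges(init_ranges, env_state, adr_load_from_checkpoint):
    """Return the ADR ranges to start from (hypothetical helper)."""
    use_checkpoint = (
        adr_load_from_checkpoint
        and env_state is not None
        and "adr_params" in env_state
    )
    source = env_state["adr_params"] if use_checkpoint else init_ranges
    # Copy each range so the caller can mutate it without touching the source.
    return {name: list(bounds) for name, bounds in source.items()}


init_ranges = {"cube_mass": [0.9, 1.1]}                # from the task config
env_state = {"adr_params": {"cube_mass": [0.5, 1.8]}}  # from a checkpoint

restored = select_adr_ranges(init_ranges, env_state, True)
fresh = select_adr_ranges(init_ranges, env_state, False)
```

Here restored carries the widened range from the checkpoint, while fresh falls back to the initial range even though a checkpoint is available.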

Theoretical Basis

ADR state checkpointing follows the state serialization pattern: capture all mutable training state in a dictionary that can be saved and loaded via torch.save() / torch.load() alongside the model checkpoint.

CHECKPOINT_SAVE:
    model_state = policy.state_dict()
    env_state = env.get_env_state()     # ADR params, queues, worker types, tensors
    save({model_state, env_state}, path)

CHECKPOINT_LOAD:
    checkpoint = load(path)
    policy.load_state_dict(checkpoint.model_state)
    env.set_env_state(checkpoint.env_state)  # Restore ADR state
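
The pseudocode above can be made concrete with a small runnable sketch. To keep it free of a PyTorch dependency it uses the standard-library pickle module as a stand-in for torch.save() / torch.load(); the dictionary keys "model" and "env_state" and the sample values are assumptions for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, model_state, env_state):
    """Write model and environment state into a single file."""
    with open(path, "wb") as f:
        pickle.dump({"model": model_state, "env_state": env_state}, f)


def load_checkpoint(path):
    """Read back the dictionary written by save_checkpoint."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical stand-ins for policy.state_dict() and env.get_env_state().
model_state = {"layer.weight": [0.1, 0.2]}
env_state = {"adr_params": {"cube_mass": [0.5, 1.5]}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "last.pth")
    save_checkpoint(path, model_state, env_state)
    checkpoint = load_checkpoint(path)
```

Storing both pieces in one file means a checkpoint cannot be loaded with model weights from one training step and ADR state from another.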

Key design considerations:

Completeness: All mutable state that affects future training behavior must be captured. Missing any component (e.g., queues) would cause the ADR algorithm to behave differently after resume.
Selectivity: The adr_load_from_checkpoint flag allows intentional re-initialization of ADR ranges (e.g., when fine-tuning a pre-trained policy with fresh ADR).
Hierarchical state: The state is composed across the class hierarchy. AllegroHandDextreme adds task-specific tensors (cube and hand random parameters) to the state returned by ADRVecTask and VecTaskDextreme.
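
The hierarchical composition can be sketched with three toy classes in which each level extends the dictionary returned by its parent. The class names, the inheritance order, and the fields below are hypothetical; they only mirror the layering described above and are not the real VecTaskDextreme, ADRVecTask, or AllegroHandDextreme classes.

```python
class ToyBaseTask:
    """Lowest layer: task-agnostic state (hypothetical)."""

    def __init__(self):
        self.action_moving_average = 1.0

    def get_env_state(self):
        return {"action_moving_average": self.action_moving_average}

    def set_env_state(self, env_state):
        self.action_moving_average = env_state["action_moving_average"]


class ToyADRTask(ToyBaseTask):
    """Middle layer: adds ADR parameter ranges."""

    def __init__(self):
        super().__init__()
        self.adr_params = {"cube_mass": [0.9, 1.1]}

    def get_env_state(self):
        state = super().get_env_state()
        state["adr_params"] = {k: list(v) for k, v in self.adr_params.items()}
        return state

    def set_env_state(self, env_state):
        super().set_env_state(env_state)
        self.adr_params = {k: list(v) for k, v in env_state["adr_params"].items()}


class ToyHandTask(ToyADRTask):
    """Top layer: adds task-specific random parameters."""

    def __init__(self):
        super().__init__()
        self.cube_random_params = [0.0, 0.0]

    def get_env_state(self):
        state = super().get_env_state()
        state["cube_random_params"] = list(self.cube_random_params)
        return state

    def set_env_state(self, env_state):
        super().set_env_state(env_state)
        self.cube_random_params = list(env_state["cube_random_params"])


source = ToyHandTask()
source.action_moving_average = 0.3
source.cube_random_params = [0.2, -0.1]
state = source.get_env_state()

target = ToyHandTask()
target.set_env_state(state)
```

Because every level calls super() first, adding a new subclass does not require the parent classes to know about the extra keys.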

Related Pages

Implementation:Isaac_sim_IsaacGymEnvs_ADR_Get_Set_Env_State
