
Implementation:Isaac sim IsaacGymEnvs Train Py Task Execution

From Leeroopedia
Knowledge Sources
Type Wrapper Doc
Domains Development, Testing
Last Updated 2026-02-15 00:00 GMT

Overview

Concrete entry points and commands for training, testing, and debugging IsaacGymEnvs tasks through both the CLI and the programmatic API.

Description

IsaacGymEnvs provides two entry points for running tasks: the CLI via train.py (isaacgymenvs/train.py:L71-215) and the programmatic API via isaacgymenvs.make() (isaacgymenvs/__init__.py:L14-55). Both entry points compose Hydra configurations, look up the task in isaacgym_task_map, instantiate the environment, and either launch rl_games training or return the environment for custom use.
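The lookup-and-instantiate flow described above can be sketched with a minimal stand-in registry. This is a generic illustration of the pattern, assuming nothing beyond what is stated here: `CartpoleStub` and `task_map` are hypothetical placeholders, not the real IsaacGymEnvs classes or map.

```python
# Sketch of the registry pattern: a task map keyed by name, resolved at
# runtime and instantiated with the composed configuration.
# CartpoleStub is a hypothetical placeholder for a real VecTask subclass.

class CartpoleStub:
    def __init__(self, cfg, sim_device, rl_device, headless):
        self.cfg = cfg
        self.sim_device = sim_device
        self.rl_device = rl_device
        self.headless = headless

# Analogous to isaacgym_task_map: task name -> task class
task_map = {"Cartpole": CartpoleStub}

def make_env(task_name, cfg, sim_device="cuda:0", rl_device="cuda:0", headless=True):
    # Unknown task names fail fast, mirroring the task-map lookup step.
    try:
        task_cls = task_map[task_name]
    except KeyError:
        raise ValueError(f"Unknown task '{task_name}'. Registered: {sorted(task_map)}")
    return task_cls(cfg, sim_device=sim_device, rl_device=rl_device, headless=headless)

env = make_env("Cartpole", cfg={"numEnvs": 64})
print(type(env).__name__, env.sim_device)
```

Both real entry points follow this shape: compose a config, resolve the task class by name, construct the environment.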

Usage

Use the CLI for standard training and evaluation. Use the programmatic API for custom training loops, integration with other frameworks, or automated experimentation.

Code Reference

Source Location

Entry Point 1: CLI via train.py

Signature

python train.py task=<TaskName> [key=value overrides...]

Core Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| task | string | Required | Task name (must match isaacgym_task_map key and cfg/task/{name}.yaml) |
| num_envs | int | From task YAML | Number of parallel environments |
| headless | bool | False | Disable rendering (True for server training) |
| seed | int | 42 | Random seed for reproducibility |
| max_iterations | int | From train YAML | Maximum training epochs |
| sim_device | string | "cuda:0" | Device for physics simulation |
| rl_device | string | "cuda:0" | Device for RL computations |
| test | bool | False | Run in evaluation mode (no training) |
| checkpoint | string | "" | Path to checkpoint for resuming or testing |
| experiment | string | "" | Experiment name for logging |
| multi_gpu | bool | False | Enable multi-GPU training |
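Any of the parameters above can be combined as Hydra overrides on one command line. An illustrative invocation (the task name and values here are examples, not recommended settings):

```shell
# Example combining several documented overrides in one run;
# any key=value pair from the table can be appended the same way.
python train.py task=Cartpole num_envs=512 headless=True seed=123 \
    max_iterations=100 experiment=cartpole_debug
```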

Entry Point 2: Programmatic API

Signature

import isaacgymenvs

env = isaacgymenvs.make(
    seed=0,
    task="MyTask",
    num_envs=64,
    sim_device="cuda:0",
    rl_device="cuda:0",
    graphics_device_id=0,
    headless=False,
    multi_gpu=False,
    virtual_screen_capture=False,
    force_render=False,
)

API Usage Example

import isaacgymenvs
import torch

# Create environment
env = isaacgymenvs.make(
    seed=42,
    task="Cartpole",
    num_envs=64,
    sim_device="cuda:0",
    rl_device="cuda:0",
)

# Reset and run
obs = env.reset()
for step in range(1000):
    # Random actions for testing
    actions = torch.randn(env.num_envs, env.num_actions, device=env.rl_device)
    obs, rewards, dones, info = env.step(actions)
    print(f"Step {step}: mean_reward={rewards.mean().item():.3f}, "
          f"num_resets={dones.sum().item()}")

Debugging Commands

Stage 1: Visual Verification

# Run with few environments and rendering for visual inspection
python train.py task=MyTask num_envs=4 headless=False max_iterations=5

# What to look for:
# - Assets appear correctly in the viewer
# - Physics is stable (no explosions, no interpenetration)
# - Environments reset properly
# - Use keyboard: V to toggle viewer, R to reset, Esc to quit

Stage 2: Quick Training Sanity Check

# Short training run to verify reward signals
python train.py task=MyTask num_envs=256 max_iterations=50 headless=True

# What to look for in output:
# - rewards/step should not be constant
# - rewards/step should show some variation or improvement
# - No NaN or Inf errors
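The "not constant" and "no NaN or Inf" checks above can be automated over a logged reward series. This helper is a generic sketch using only the standard library, not part of IsaacGymEnvs:

```python
import math

def sanity_check_rewards(rewards, min_std=1e-6):
    """Flag a reward series that is non-finite or suspiciously constant."""
    if any(not math.isfinite(r) for r in rewards):
        return "non-finite reward (NaN/Inf) detected"
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    if std < min_std:
        return "rewards are constant; check reward wiring"
    return "ok"

print(sanity_check_rewards([0.1, 0.2, 0.15]))          # varying, finite -> ok
print(sanity_check_rewards([0.5, 0.5, 0.5]))           # constant -> warning
print(sanity_check_rewards([0.1, float("nan"), 0.2]))  # NaN -> warning
```

Run it over the per-epoch mean rewards from the console output or exported TensorBoard scalars.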

Stage 3: Moderate Training Run

# Medium-scale training to verify learning
python train.py task=MyTask num_envs=1024 max_iterations=200 headless=True

# Monitor with TensorBoard:
# tensorboard --logdir runs/MyTask/summaries

Stage 4: Full-Scale Training

# Full training with default parameters from YAML
python train.py task=MyTask headless=True

# Outputs saved to:
# - runs/MyTask/nn/MyTask.pth       (best checkpoint)
# - runs/MyTask/nn/last_MyTask.pth  (last checkpoint)
# - runs/MyTask/summaries/          (TensorBoard logs)
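Given the output layout above, checkpoint paths for Stage 5 can be located programmatically. This snippet assumes only the runs/{TaskName}/nn/ layout documented here; actual filenames in a given run may differ:

```python
from pathlib import Path

def find_checkpoints(run_root, task_name):
    """Return (best, last) checkpoint paths under runs/{task}/nn/, or None if absent."""
    nn_dir = Path(run_root) / task_name / "nn"
    best = nn_dir / f"{task_name}.pth"          # best checkpoint, per the layout above
    last = nn_dir / f"last_{task_name}.pth"     # last checkpoint, per the layout above
    return (best if best.exists() else None, last if last.exists() else None)

best, last = find_checkpoints("runs", "MyTask")
print("best:", best, "| last:", last)
```

The returned path can be passed directly as the checkpoint= override.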

Stage 5: Evaluation

# Evaluate trained policy with rendering
python train.py task=MyTask test=True \
    checkpoint=runs/MyTask/nn/MyTask.pth \
    num_envs=16 headless=False

# What to look for:
# - Agent exhibits desired behavior
# - No reward exploitation or unexpected strategies

Key Debugging Parameters

| Parameter | Purpose | Recommended Value for Debugging |
|-----------|---------|---------------------------------|
| num_envs | Control parallelism | Start with 4-16 for visual debugging, scale to 256+ for training |
| headless=False | Enable viewer | Use for visual verification and policy evaluation |
| max_iterations | Limit training duration | Use 5-10 for physics testing, 50-100 for reward checking |
| seed | Reproducibility | Fix seed when comparing configurations |
| test=True | Evaluation mode | Use with checkpoint to evaluate trained policy |
| checkpoint | Resume or evaluate | Path to saved .pth checkpoint file |

What to Monitor During Training

| Metric | Location | Healthy Behavior | Warning Signs |
|--------|----------|------------------|---------------|
| Mean reward | TensorBoard: rewards/step | Increasing trend | Flat, decreasing, or NaN |
| Episode length | TensorBoard: episode_lengths/step | Increasing (survival) or decreasing (goal-reaching) | Constant at max_episode_length (reward too sparse) |
| Policy entropy | TensorBoard: entropy/step | Gradual decrease | Rapid collapse to 0 (premature convergence) |
| Value loss | TensorBoard: losses/value_loss | Decreasing | Increasing or oscillating wildly |
| Learning rate | TensorBoard: info/learning_rate | Stable or gradually decreasing (adaptive) | Rapid oscillation |
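The trend-based rules in the table (reward trending down, entropy collapsing to 0) can be approximated with a simple least-squares slope over recent metric values. This monitoring helper is a generic sketch; the thresholds are illustrative, not IsaacGymEnvs defaults:

```python
def trend(values):
    """Least-squares slope of a metric series, using the index as the x-axis."""
    n = len(values)
    xbar = (n - 1) / 2
    ybar = sum(values) / n
    num = sum((i - xbar) * (v - ybar) for i, v in enumerate(values))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

def check_training(reward_hist, entropy_hist, entropy_floor=0.01):
    """Apply the table's warning rules to recent reward and entropy values."""
    warnings = []
    if trend(reward_hist) < 0:
        warnings.append("mean reward trending down")
    if entropy_hist[-1] < entropy_floor:
        warnings.append("policy entropy collapsed (premature convergence)")
    return warnings

print(check_training([1.0, 1.5, 2.1, 2.4], [1.2, 0.9, 0.7, 0.5]))    # healthy
print(check_training([2.0, 1.8, 1.5, 1.1], [0.3, 0.1, 0.02, 0.001])) # both warnings
```

Feed it scalar histories exported from TensorBoard or parsed from the console output.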

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| Registered task class | isaacgym_task_map entry | Yes | Task must be registered in the task map |
| Task YAML config | cfg/task/MyTask.yaml | Yes | Environment parameters |
| Train YAML config | cfg/train/MyTaskPPO.yaml | Yes | Training hyperparameters |
| CLI overrides | key=value pairs | No | Hydra overrides for any configuration parameter |

Outputs

| Name | Type | Description |
|------|------|-------------|
| Checkpoints | .pth files | Saved policy network weights in runs/{TaskName}/nn/ |
| TensorBoard logs | event files | Training metrics in runs/{TaskName}/summaries/ |
| Visual verification | Isaac Gym viewer | Real-time rendering when headless=False |
| Console output | Text | Per-epoch reward, episode length, and timing statistics |

Related Pages

Principle:Isaac_sim_IsaacGymEnvs_Task_Testing_Iteration
