Overview
Concrete entry points and commands for training, testing, and debugging IsaacGymEnvs tasks through both the CLI and the programmatic API.
Description
IsaacGymEnvs provides two entry points for running tasks: the CLI via train.py (isaacgymenvs/train.py:L71-215) and the programmatic API via isaacgymenvs.make() (isaacgymenvs/__init__.py:L14-55). Both entry points compose Hydra configurations, look up the task in isaacgym_task_map, instantiate the environment, and either launch rl_games training or return the environment for custom use.
Usage
Use the CLI for standard training and evaluation. Use the programmatic API for custom training loops, integration with other frameworks, or automated experimentation.
Code Reference
Entry Point 1: CLI via train.py
Signature
```shell
python train.py task=<TaskName> [key=value overrides...]
```
Core Parameters
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| task | string | Required | Task name (must match isaacgym_task_map key and cfg/task/{name}.yaml) |
| num_envs | int | From task YAML | Number of parallel environments |
| headless | bool | False | Disable rendering (True for server training) |
| seed | int | 42 | Random seed for reproducibility |
| max_iterations | int | From train YAML | Maximum training epochs |
| sim_device | string | "cuda:0" | Device for physics simulation |
| rl_device | string | "cuda:0" | Device for RL computations |
| test | bool | False | Run in evaluation mode (no training) |
| checkpoint | string | "" | Path to checkpoint for resuming or testing |
| experiment | string | "" | Experiment name for logging |
| multi_gpu | bool | False | Enable multi-GPU training |
Entry Point 2: Programmatic API
Signature
```python
import isaacgymenvs

env = isaacgymenvs.make(
    seed=0,
    task="MyTask",
    num_envs=64,
    sim_device="cuda:0",
    rl_device="cuda:0",
    graphics_device_id=0,
    headless=False,
    multi_gpu=False,
    virtual_screen_capture=False,
    force_render=False,
)
```
API Usage Example
```python
import isaacgymenvs
import torch

# Create environment
env = isaacgymenvs.make(
    seed=42,
    task="Cartpole",
    num_envs=64,
    sim_device="cuda:0",
    rl_device="cuda:0",
)

# Reset and run
obs = env.reset()
for step in range(1000):
    # Random actions for testing
    actions = torch.randn(env.num_envs, env.num_actions, device=env.rl_device)
    obs, rewards, dones, info = env.step(actions)
    print(f"Step {step}: mean_reward={rewards.mean().item():.3f}, "
          f"num_resets={dones.sum().item()}")
```
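When driving the environment manually like this, finished environments are reset automatically inside step(), so the caller only needs to track statistics across resets. A minimal, torch-free sketch of that bookkeeping (function name and list-based batches are illustrative; a real loop would operate on tensors):

```python
# Illustrative per-env episode-return accumulation under auto-reset
# semantics: when dones[i] is set, env i's episode return is recorded
# and its accumulator cleared (the env itself was already reset by step()).
def accumulate_returns(reward_batches, done_batches, num_envs):
    running = [0.0] * num_envs      # current-episode return per env
    finished = []                   # returns of completed episodes
    for rewards, dones in zip(reward_batches, done_batches):
        for i in range(num_envs):
            running[i] += rewards[i]
            if dones[i]:
                finished.append(running[i])
                running[i] = 0.0
    return finished, running
```

The same masking pattern (zero the accumulator wherever dones is set) is what vectorized implementations express with tensor operations.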
Debugging Commands
Stage 1: Visual Verification
```shell
# Run with few environments and rendering for visual inspection
python train.py task=MyTask num_envs=4 headless=False max_iterations=5

# What to look for:
# - Assets appear correctly in the viewer
# - Physics is stable (no explosions, no interpenetration)
# - Environments reset properly
# - Use keyboard: V to toggle viewer, R to reset, Esc to quit
```
Stage 2: Quick Training Sanity Check
```shell
# Short training run to verify reward signals
python train.py task=MyTask num_envs=256 max_iterations=50 headless=True

# What to look for in output:
# - rewards/step should not be constant
# - rewards/step should show some variation or improvement
# - No NaN or Inf errors
```
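The NaN/Inf check can also be automated when stepping the environment yourself. A minimal sketch using stdlib math.isfinite (the function name is illustrative; with torch tensors the equivalent one-liner is torch.isfinite(rewards).all()):

```python
import math

# Return the indices of non-finite rewards (NaN or Inf).
# An empty result means the batch is numerically healthy.
def nonfinite_indices(rewards):
    return [i for i, r in enumerate(rewards) if not math.isfinite(r)]
```

Calling this on each reward batch during a sanity run pinpoints which environments are producing bad values.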
Stage 3: Moderate Training Run
```shell
# Medium-scale training to verify learning
python train.py task=MyTask num_envs=1024 max_iterations=200 headless=True

# Monitor with TensorBoard:
# tensorboard --logdir runs/MyTask/summaries
```
Stage 4: Full-Scale Training
```shell
# Full training with default parameters from YAML
python train.py task=MyTask headless=True

# Outputs saved to:
# - runs/MyTask/nn/MyTask.pth (best checkpoint)
# - runs/MyTask/nn/last_MyTask.pth (last checkpoint)
# - runs/MyTask/summaries/ (TensorBoard logs)
```
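The checkpoint layout above can be resolved programmatically, e.g. when scripting evaluation runs. A small sketch (hypothetical helper, assuming the default runs/ directory and the best/last naming shown above):

```python
from pathlib import Path

# Resolve a checkpoint path under the output layout described above:
# runs/{TaskName}/nn/{TaskName}.pth       (best checkpoint)
# runs/{TaskName}/nn/last_{TaskName}.pth  (last checkpoint)
def checkpoint_path(task: str, last: bool = False, root: str = "runs") -> Path:
    name = f"last_{task}.pth" if last else f"{task}.pth"
    return Path(root) / task / "nn" / name
```

The returned path can be passed directly as the checkpoint= override.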
Stage 5: Evaluation
```shell
# Evaluate trained policy with rendering
python train.py task=MyTask test=True \
    checkpoint=runs/MyTask/nn/MyTask.pth \
    num_envs=16 headless=False

# What to look for:
# - Agent exhibits desired behavior
# - No reward exploitation or unexpected strategies
```
Key Debugging Parameters
| Parameter |
Purpose |
Recommended Value for Debugging
|
num_envs |
Control parallelism |
Start with 4-16 for visual debugging, scale to 256+ for training
|
headless=False |
Enable viewer |
Use for visual verification and policy evaluation
|
max_iterations |
Limit training duration |
Use 5-10 for physics testing, 50-100 for reward checking
|
seed |
Reproducibility |
Fix seed when comparing configurations
|
test=True |
Evaluation mode |
Use with checkpoint to evaluate trained policy
|
checkpoint |
Resume or evaluate |
Path to saved .pth checkpoint file
|
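These recommendations can be bundled into reusable override sets. A hypothetical helper that renders Hydra-style key=value overrides for each debugging stage (stage names and values mirror the stages described above; nothing here is part of the IsaacGymEnvs API):

```python
# Map each debugging stage to its suggested CLI overrides and render
# them as Hydra-style key=value strings for train.py.
STAGES = {
    "visual":   {"num_envs": 4,    "headless": False, "max_iterations": 5},
    "sanity":   {"num_envs": 256,  "headless": True,  "max_iterations": 50},
    "moderate": {"num_envs": 1024, "headless": True,  "max_iterations": 200},
}

def build_overrides(task, stage):
    opts = {"task": task, **STAGES[stage]}
    return [f"{key}={value}" for key, value in opts.items()]
```

For example, build_overrides("MyTask", "visual") yields the arguments used in Stage 1, ready to append to a python train.py invocation.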
What to Monitor During Training
| Metric | Location | Healthy Behavior | Warning Signs |
|--------|----------|------------------|---------------|
| Mean reward | TensorBoard: rewards/step | Increasing trend | Flat, decreasing, or NaN |
| Episode length | TensorBoard: episode_lengths/step | Increasing (survival tasks) or decreasing (goal-reaching tasks) | Constant at max_episode_length (reward too sparse) |
| Policy entropy | TensorBoard: entropy/step | Gradual decrease | Rapid collapse to 0 (premature convergence) |
| Value loss | TensorBoard: losses/value_loss | Decreasing | Increasing or oscillating wildly |
| Learning rate | TensorBoard: info/learning_rate | Stable or gradually decreasing (adaptive) | Rapid oscillation |
I/O Contract
Inputs
| Name | Type | Required | Description |
|------|------|----------|-------------|
| Registered task class | isaacgym_task_map entry | Yes | Task must be registered in the task map |
| Task YAML config | cfg/task/MyTask.yaml | Yes | Environment parameters |
| Train YAML config | cfg/train/MyTaskPPO.yaml | Yes | Training hyperparameters |
| CLI overrides | key=value pairs | No | Hydra overrides for any configuration parameter |
Outputs
| Name | Type | Description |
|------|------|-------------|
| Checkpoints | .pth files | Saved policy network weights in runs/{TaskName}/nn/ |
| TensorBoard logs | event files | Training metrics in runs/{TaskName}/summaries/ |
| Visual verification | Isaac Gym viewer | Real-time rendering when headless=False |
| Console output | text | Per-epoch reward, episode length, and timing statistics |
Related Pages
Principle:Isaac_sim_IsaacGymEnvs_Task_Testing_Iteration