Implementation:Isaac sim IsaacGymEnvs PBT NGC Backend

Knowledge Sources
Domains Cloud_Computing, Distributed_Training
Last Updated 2026-02-15 11:00 GMT

Overview

PBT_NGC_Backend provides the NVIDIA GPU Cloud (NGC) backend for the PBT experiment launcher, submitting experiment jobs to NGC via command-line templates.

Description

This module implements NGC-based distributed experiment execution through two functions. The add_ngc_args() function extends the argument parser with NGC-specific options: --ngc_job_template (path to an NGC command-line template file) and --ngc_print_only (a boolean flag to preview commands without submitting). It also sets the default pause_between to 0, since NGC job submissions do not require pauses between them.
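The argument setup described above can be sketched roughly as follows. This is a minimal illustration, not the module's actual source: the `str2bool` helper, the help strings, the `add_ngc_args_sketch` name, the returned parser, and the stand-in `--pause_between` default of 1 are all assumptions made for the example.

```python
import argparse


def str2bool(value):
    """Parse 'true'/'false'-style strings into booleans (hypothetical helper)."""
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {value!r}")


def add_ngc_args_sketch(parser):
    """Register the two NGC options and override the pause default (sketch)."""
    parser.add_argument(
        "--ngc_job_template",
        default=None,
        type=str,
        help="Path to the NGC command-line template file",
    )
    parser.add_argument(
        "--ngc_print_only",
        default=False,
        type=str2bool,
        help="Print the generated commands without submitting them",
    )
    # NGC submissions need no pause between them, so the default becomes 0.
    parser.set_defaults(pause_between=0)
    return parser


# Stand-in for the launcher's parser; the real one defines more options,
# and the default of 1 here is assumed purely for illustration.
parser = argparse.ArgumentParser()
parser.add_argument("--pause_between", default=1, type=int)
add_ngc_args_sketch(parser)

args = parser.parse_args(
    ["--ngc_job_template", "ngc_template.txt", "--ngc_print_only", "true"]
)
print(args.ngc_job_template, args.ngc_print_only, args.pause_between)
```

Because `set_defaults` changes only the default, a user can still pass `--pause_between` explicitly to force sequential submission.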

The run_ngc() function performs the actual job submission. It reads the NGC template file, normalizes whitespace, and generates all experiment configurations from the RunDescription. For each experiment, it substitutes the job name for the {{ name }} placeholder and the experiment command for the {{ experiment_cmd }} placeholder in the template. Unless ngc_print_only is set, each job is submitted by spawning a shell subprocess with the fully resolved NGC command. Jobs are launched via a ThreadPool whose size depends on the pause setting: sequential execution (pool size 1) when pauses are enabled, or parallel submission (up to 10 threads) when pauses are disabled.
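The submission flow described above might look roughly like the sketch below. It is an illustration under stated assumptions rather than the module's code: the function names, the `(name, command)` tuples standing in for generated experiments, and the `python train.py seed=…` commands are hypothetical. It runs in print-only mode, so nothing is submitted.

```python
from multiprocessing.pool import ThreadPool
from subprocess import PIPE, Popen


def normalize_template(text):
    """Collapse newlines and repeated spaces into single spaces."""
    return " ".join(text.split())


def render_job_command(template, name, experiment_cmd):
    """Fill the two placeholders of an already-normalized template."""
    return template.replace("{{ name }}", name).replace(
        "{{ experiment_cmd }}", experiment_cmd
    )


def choose_pool_size(pause_between, num_experiments):
    """Sequential when pauses are enabled, otherwise up to 10 threads."""
    if pause_between > 0:
        return 1
    return max(1, min(10, num_experiments))


def submit_all(template, experiments, pause_between=0, print_only=True):
    """Render one command per experiment and optionally run it in a shell."""
    template = normalize_template(template)
    commands = [render_job_command(template, name, cmd) for name, cmd in experiments]

    def launch(job_cmd):
        print(f"Executing {job_cmd}")
        if not print_only:
            # Submit the job by running the resolved command in a shell.
            process = Popen(job_cmd, stdout=PIPE, shell=True)
            output, _ = process.communicate()
            print(output.decode(errors="replace"))
        return job_cmd

    with ThreadPool(choose_pool_size(pause_between, len(commands))) as pool:
        return pool.map(launch, commands)


template = """ngc batch run --name "{{ name }}"
    --commandline "{{ experiment_cmd }}" """
# Hypothetical experiments; the launcher derives these from the RunDescription.
experiments = [
    ("exp_00", "python train.py seed=0"),
    ("exp_01", "python train.py seed=1"),
]
rendered = submit_all(template, experiments, print_only=True)
```

Normalizing the template before substitution keeps the placeholders intact, since `{{ name }}` and `{{ experiment_cmd }}` contain only single spaces, while letting the template file span several lines.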

This backend enables researchers to scale PBT experiments to cloud GPU instances without modifying their experiment definitions. The NGC template approach provides full flexibility over instance types, Docker containers, workspace mounts, and other NGC-specific configuration.

Usage

Use this backend when deploying PBT experiments to NVIDIA GPU Cloud. Prepare an NGC command-line template file containing the {{ name }} and {{ experiment_cmd }} placeholders, then invoke the launcher with --backend ngc --ngc_job_template path/to/template.txt. Use --ngc_print_only true to preview the generated commands before submitting.

Code Reference

Source Location

Signature

def add_ngc_args(parser):
def run_ngc(run_description, args):

Import

from isaacgymenvs.pbt.launcher.run_ngc import run_ngc, add_ngc_args

I/O Contract

Inputs

Name | Type | Required | Description
run_description | RunDescription | Yes | The experiment run description containing all experiments and parameter combinations
args | argparse.Namespace | Yes | Parsed arguments including ngc_job_template, ngc_print_only, train_dir, and pause_between
--ngc_job_template | str | Yes | Path to a text file containing the NGC CLI command template with the {{ name }} and {{ experiment_cmd }} placeholders
--ngc_print_only | bool | No | If True, only print the generated commands without executing them (default: False)

Outputs

Name | Type | Description
return code | int | 0 on successful completion of all job submissions
stdout | str | Printed NGC commands and job submission output for each experiment

Usage Examples

# NGC template file (ngc_template.txt):
# ngc batch run --name "{{ name }}" --instance dgxa100.80g.1.norm
#   --image "nvcr.io/nvidia/isaac-sim:latest"
#   --commandline "{{ experiment_cmd }}"
#   --workspace my_workspace:/workspace:RW

# Launch with NGC backend:
# python -m isaacgymenvs.pbt.launcher.run \
#     --run isaacgymenvs.pbt.experiments.my_experiment \
#     --backend ngc \
#     --ngc_job_template ngc_template.txt \
#     --train_dir /workspace/train_dir

# Preview commands without submitting:
# python -m isaacgymenvs.pbt.launcher.run \
#     --run isaacgymenvs.pbt.experiments.my_experiment \
#     --backend ngc \
#     --ngc_job_template ngc_template.txt \
#     --ngc_print_only true

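# Programmatic usage (run_description is a RunDescription and args is the
# parsed argparse.Namespace; both must be built by the caller beforehand):
from isaacgymenvs.pbt.launcher.run_ngc import run_ngc, add_ngc_args
run_ngc(run_description, args)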
