Workflow:Isaac sim IsaacGymEnvs Factory Assembly Training

Knowledge Sources	IsaacGymEnvs; Factory (RSS 2022); IndustReal (RSS 2023)
Domains	Robotic_Assembly, Reinforcement_Learning, Sim_to_Real
Last Updated	2026-02-15 09:00 GMT

Overview

End-to-end process for training contact-rich robotic assembly policies using Factory and IndustReal environments with Franka robot manipulation in Isaac Gym.

Description

This workflow covers training RL policies for precision assembly tasks, such as the nut-bolt pick, place, and screw sequence and peg or gear insertion, using the Factory and IndustReal environment hierarchies. These environments feature a Franka Panda robot interacting with small parts on a tabletop, using SDF-based collisions for accurate contact modeling. The Factory system uses a three-tier class hierarchy (Base → Env → Task) and provides multiple low-level controller types (for example, joint-space IK and task-space impedance). IndustReal extends Factory with three novel sim-to-real algorithms: Simulation-Aware Policy Update (SAPU), SDF-Based Reward, and Sampling-Based Curriculum (SBC).

Key capabilities:

SDF collision detection for accurate contact-rich manipulation
Hierarchical Base → Env → Task architecture for modular environment design
Seven controller types including task-space impedance control
IndustReal SAPU, SDF-Based Reward, and SBC for sim-to-real transfer
Sequential sub-policy training (Pick → Place → Screw)
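
The three-tier layering can be pictured with a minimal Python sketch. The class and method names below (SketchBase, SketchEnvNutBolt, SketchTaskNutBoltPick, create_scene, compute_reward) are hypothetical stand-ins chosen for illustration, not the actual IsaacGymEnvs classes; the sketch only shows how each tier adds one concern on top of the previous one.

```python
class SketchBase:
    """Tier 1 (hypothetical): shared robot and table setup."""

    def __init__(self):
        self.actors = []
        self.create_scene()

    def create_scene(self):
        # Every environment in this sketch starts from the robot and table.
        self.actors += ["franka", "table"]


class SketchEnvNutBolt(SketchBase):
    """Tier 2 (hypothetical): adds the assembly-specific assets."""

    def create_scene(self):
        super().create_scene()
        self.actors += ["nut", "bolt"]


class SketchTaskNutBoltPick(SketchEnvNutBolt):
    """Tier 3 (hypothetical): adds the RL-specific logic for one sub-task."""

    def compute_reward(self, fingertip_to_nut_dist):
        # Illustrative dense reward: a smaller distance gives a higher reward.
        return -fingertip_to_nut_dist


task = SketchTaskNutBoltPick()
print(task.actors)
```

With this split, a new sub-task can reuse the scene from its Env tier and only supply its own reward and reset logic.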

Usage

Use this workflow to train policies for precision robotic assembly tasks that involve contact-rich manipulation: nut-bolt assembly (pick, place, and screw sub-tasks), peg insertion, and gear insertion. The GPU must have enough memory for SDF generation and contact resolution across parallel environments (128 environments recommended).
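
As a rough illustration of how a run might be launched, the sketch below assembles a command line in Python. It assumes the IsaacGymEnvs convention of calling train.py with Hydra-style overrides (task=, num_envs=, headless=) and uses FactoryTaskNutBoltPick as an example task name; the helper build_train_command is hypothetical and only builds the argument list, it does not start training.

```python
import shlex


def build_train_command(task, num_envs=128, headless=True, extra_overrides=()):
    """Assemble an argument list for train.py (hypothetical helper)."""
    cmd = [
        "python",
        "train.py",
        f"task={task}",
        f"num_envs={num_envs}",
        f"headless={headless}",
    ]
    # Any further overrides are appended unchanged.
    cmd.extend(extra_overrides)
    return cmd


cmd = build_train_command("FactoryTaskNutBoltPick")
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run from the directory that contains train.py.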

Execution Steps

Step 1: Asset Preparation and SDF Generation

The first time Factory or IndustReal tasks are run, Isaac Gym generates Signed Distance Fields (SDFs) for all assembly assets (nuts, bolts, pegs, gears, sockets). These SDFs are cached for subsequent runs. Asset dimensions and properties are defined in YAML files under assets/factory/yaml/ and assets/industreal/yaml/.

Key considerations:

SDF generation can be time-consuming on first run but is cached afterward
SDF resolution (typically 256-512 for small parts) is set in the URDF <sdf> element
If Isaac Gym is terminated during SDF generation, the cache may become corrupted and need to be cleared
Asset YAML files define precise dimensions used for reward computation and initialization
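
To check which resolution an asset requests, the URDF can be inspected directly. The sketch below parses an inline URDF fragment with the Python standard library; the fragment itself (robot, link, and mesh names) is made up for illustration, and placing the <sdf> element inside <collision> is an assumption about the asset layout rather than a copy of a real Factory file.

```python
import xml.etree.ElementTree as ET

# Hypothetical URDF fragment; names and file paths are placeholders.
URDF_SNIPPET = """
<robot name="example_nut">
  <link name="nut">
    <collision>
      <geometry>
        <mesh filename="example_nut.obj"/>
      </geometry>
      <sdf resolution="256"/>
    </collision>
  </link>
</robot>
"""


def sdf_resolutions(urdf_text):
    """Map each link name to the SDF resolution found under that link."""
    root = ET.fromstring(urdf_text)
    result = {}
    for link in root.iter("link"):
        for sdf in link.iter("sdf"):
            result[link.get("name")] = int(sdf.get("resolution"))
    return result


print(sdf_resolutions(URDF_SNIPPET))
```

Links without an <sdf> element are simply absent from the returned dictionary, which makes it easy to spot assets that would fall back to ordinary mesh collisions.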

Step 2: Configuration of Base, Environment, and Task

The hierarchical config system loads three levels of YAML configuration: FactoryBase.yaml (or IndustRealBase.yaml) for physics parameters and Franka setup, the environment config for specific asset definitions, and the task config for RL-specific settings including controller type and reward formulation.

What happens:

Base config sets simulation parameters (timestep, substeps, solver iterations, contact offsets)
Environment config specifies which assembly assets to load and their randomization ranges
Task config defines controller type, observation composition, reward terms, and success criteria
Training config (the task's *PPO.yaml file, for example FactoryTaskNutBoltPickPPO.yaml) specifies network architecture and PPO hyperparameters
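
The division of responsibilities between the config levels can be sketched as a small container that holds each level separately. All keys and values below are placeholders invented for this example, not the real contents of the Factory YAML files, and LayeredConfig is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class LayeredConfig:
    """Hypothetical holder for the base, env, task, and train configs."""

    base: dict   # physics parameters and Franka setup
    env: dict    # assembly assets and randomization ranges
    task: dict   # controller, observations, rewards, success criteria
    train: dict  # network architecture and PPO hyperparameters

    def summary(self):
        # Pull one representative setting from each level.
        return {
            "dt": self.base["sim"]["dt"],
            "assets": self.env["assets"],
            "controller": self.task["ctrl"]["ctrl_type"],
            "learning_rate": self.train["ppo"]["learning_rate"],
        }


cfg = LayeredConfig(
    base={"sim": {"dt": 0.0166, "substeps": 2}},       # placeholder values
    env={"assets": ["nut", "bolt"]},                   # placeholder values
    task={"ctrl": {"ctrl_type": "task_space_impedance"}},
    train={"ppo": {"learning_rate": 1e-4}},            # placeholder values
)
print(cfg.summary())
```

Keeping the levels separate mirrors the idea that a physics change belongs in the base config while a reward change belongs in the task config.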

Step 3: Scene Initialization with Contact Setup

The Factory base class initializes the Franka robot and table, then the environment class adds the assembly-specific assets (nuts/bolts, pegs/holes, or gears/shafts). Collision filtering is configured to enable SDF-mesh contacts between small parts while avoiding unnecessary contact computations.

What happens:

Franka and table actors are created with appropriate collision groups
Assembly asset actors are created per the environment config
SDF-mesh collision pairs are established based on URDF <sdf> tags
GPU contact buffers are allocated with sufficient capacity for dense contacts
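
The effect of collision groups and filters can be reasoned about with a small predicate. The sketch assumes the commonly documented semantics of the collision_group and collision_filter arguments passed when an actor is created: bodies interact only within the same group, and a shared filter bit masks the pair off. It ignores any special group values, and may_collide is a hypothetical helper, not an Isaac Gym function.

```python
def may_collide(group_a, filter_a, group_b, filter_b):
    """Return True if two bodies would be considered for contact.

    Assumed rule: same collision group AND no common bit in the filters.
    """
    same_group = group_a == group_b
    filters_allow = (filter_a & filter_b) == 0
    return same_group and filters_allow


# Robot and nut in the same environment (group 3), no shared filter bits.
print(may_collide(3, 0b00, 3, 0b00))
# Two parts that share filter bit 0 are masked off from each other.
print(may_collide(3, 0b01, 3, 0b01))
# Actors from different environments never interact.
print(may_collide(3, 0b00, 4, 0b00))
```

Using one group per environment keeps parallel environments independent, while filter bits can suppress pairs that never need contact resolution.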

Step 4: Controller Configuration

The task-level config specifies a high-level controller type which is parsed into low-level controller parameters. The factory_control module converts controller targets (from RL actions) into joint torques using Jacobian computation, with options for joint-space or task-space control with or without inertial compensation.

Key considerations:

Controller types range from simple joint PD to full operational-space impedance
Controller gains significantly affect training stability and sim-to-real transfer
IndustReal removes artificial dissipative terms from the Franka URDF for better transfer
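
As a concrete reference point, a task-space impedance law without inertial or gravity compensation maps a pose error to joint torques through the Jacobian transpose: tau = J^T (Kp * e - Kd * v). The sketch below is a generic textbook form written in plain Python, not the factory_control implementation; the function name and the diagonal-gain representation are assumptions made for this example.

```python
def task_space_impedance_torques(jacobian, pose_error, ee_velocity, kp, kd):
    """Compute tau = J^T (kp * error - kd * velocity) with diagonal gains.

    jacobian: list of task-dimension rows, each with one entry per joint.
    pose_error, ee_velocity, kp, kd: lists with one entry per task dimension.
    """
    # Desired task-space wrench from a spring-damper law.
    wrench = [
        kp_i * e_i - kd_i * v_i
        for kp_i, kd_i, e_i, v_i in zip(kp, kd, pose_error, ee_velocity)
    ]
    n_joints = len(jacobian[0])
    # Jacobian transpose maps the task-space wrench to joint torques.
    return [
        sum(jacobian[row][joint] * wrench[row] for row in range(len(wrench)))
        for joint in range(n_joints)
    ]


# Toy 2-DOF example with made-up numbers.
tau = task_space_impedance_torques(
    jacobian=[[1.0, 0.0], [0.0, 2.0]],
    pose_error=[0.1, 0.2],
    ee_velocity=[0.0, 0.0],
    kp=[10.0, 10.0],
    kd=[1.0, 1.0],
)
print(tau)
```

Raising kp stiffens the response to pose error, while kd damps end-effector motion; both are the kind of gains the considerations above refer to.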

Step 5: Sub-policy Training

For the nut-bolt pipeline, policies are trained sequentially for each sub-task (Pick, Place, Screw). Each sub-task defines its own observation space, reward function, and reset conditions. For IndustReal tasks, SAPU down-weights or excludes experience collected where simulated parts interpenetrate (that is, where simulation predictions are unreliable), SDF-Based Reward provides dense geometric feedback on part alignment, and SBC gradually increases task difficulty.

What happens:

The RL agent trains via PPO on the configured sub-task
Actions are applied as controller targets, converted to joint torques each timestep
Rewards are computed from task-specific criteria (e.g., distance to grasp pose, insertion depth)
Environments are reset upon success or timeout with randomized initial conditions
IndustReal tasks use SAPU to down-weight or discard returns from episodes with excessive part interpenetration
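
The following sketch illustrates one plausible reading of two IndustReal ideas: a SAPU-style weight that shrinks as the maximum interpenetration between parts grows and drops the sample beyond a threshold, and an SBC-style curriculum that keeps the lower bound of the initial-height range fixed while moving the upper bound with the success rate. The function names, the tanh form, and every threshold and step size here are assumptions for illustration, not values taken from the IndustReal code.

```python
import math
import random


def sapu_reward_scale(max_interpen_dist, interpen_thresh):
    """Return None to drop the sample, else a weight in (0, 1]."""
    if max_interpen_dist > interpen_thresh:
        return None  # simulation judged unreliable for this episode
    return 1.0 - math.tanh(max_interpen_dist / interpen_thresh)


def sbc_update_upper_bound(upper, success_rate, lower, ceiling,
                           step=0.01, promote_at=0.8, demote_at=0.1):
    """Raise or lower the upper bound of the initial-state range."""
    if success_rate > promote_at:
        upper = min(upper + step, ceiling)
    elif success_rate < demote_at:
        upper = max(upper - step, lower)
    return upper


def sbc_sample_initial_height(lower, upper, rng=random):
    """Sample from the whole range, so easy states remain in the mix."""
    return rng.uniform(lower, upper)


print(sapu_reward_scale(0.0, 0.001))
print(sapu_reward_scale(0.002, 0.001))
print(sbc_update_upper_bound(0.01, 0.9, lower=0.005, ceiling=0.02))
```

Sampling from the full range, rather than only at the current upper bound, is what distinguishes this style of curriculum from one that trains on the hardest stage alone.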

Step 6: Evaluation and Sim-to-Real Preparation

After training, policies are evaluated for success rate and behavioral quality. For sim-to-real transfer, the trained checkpoint is tested under domain randomization to verify robustness. The IndustReal corrections to the Franka URDF help the simulated dynamics match the real robot more closely.

Key considerations:

Pick and Place sub-policies may need an hour of training for high success rates
Screw sub-policies converge quickly without initial state randomization
IndustReal peg insertion needs 8-10 hours; gear insertion needs 18-20 hours
Testing with 5 random seeds and selecting the best is recommended for IndustReal
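
Comparing seeds comes down to computing a success rate per seed and keeping the best one. The sketch below shows that bookkeeping in plain Python; the per-episode outcomes are fabricated placeholders used only to exercise the functions, and success_rate and best_seed are hypothetical helper names.

```python
def success_rate(outcomes):
    """Fraction of evaluation episodes that ended in success."""
    return sum(outcomes) / len(outcomes)


def best_seed(results_by_seed):
    """Return the seed with the highest success rate and all rates."""
    rates = {seed: success_rate(o) for seed, o in results_by_seed.items()}
    best = max(rates, key=rates.get)
    return best, rates


# Placeholder outcomes (True = success), not real experimental results.
results = {
    0: [True, False, True, True],
    1: [True, True, True, True],
    2: [False, False, True, True],
}
seed, rates = best_seed(results)
print(seed, rates[seed])
```

In practice each list would come from an evaluation run of the corresponding checkpoint, ideally with the same number of episodes per seed so the rates are comparable.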
