Workflow:Isaac sim IsaacGymEnvs Factory Assembly Training

From Leeroopedia
Knowledge Sources
Domains: Robotic_Assembly, Reinforcement_Learning, Sim_to_Real
Last Updated: 2026-02-15 09:00 GMT

Overview

End-to-end process for training contact-rich robotic assembly policies using Factory and IndustReal environments with Franka robot manipulation in Isaac Gym.

Description

This workflow covers training RL policies for precision assembly tasks such as nut-bolt pick-place-screw and peg/gear insertion using the Factory and IndustReal environment hierarchies. These environments feature a Franka Panda robot interacting with small parts on a tabletop, using SDF-based collisions for accurate contact modeling. The Factory system uses a three-tier class hierarchy (Base → Env → Task) and provides multiple low-level controller types (for example, joint-space IK and task-space impedance). IndustReal extends Factory with three sim-to-real algorithms: Simulation-Aware Policy Update (SAPU), SDF-Based Reward, and Sampling-Based Curriculum (SBC).

Key capabilities:

  • SDF collision detection for accurate contact-rich manipulation
  • Hierarchical Base → Env → Task architecture for modular environment design
  • Seven controller types including task-space impedance control
  • IndustReal SAPU, SDF-Based Reward, and SBC for sim-to-real transfer
  • Sequential sub-policy training (Pick → Place → Screw)

Usage

Execute this workflow when you need to train policies for precision robotic assembly tasks involving contact-rich manipulation. This applies to nut-bolt assembly (pick, place, screw sub-tasks), peg insertion, and gear insertion. You need sufficient GPU memory for SDF generation and contact resolution across parallel environments (128 environments recommended).
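As a minimal sketch of how the sequential nut-bolt runs might be launched, the snippet below only assembles command lines; it does not start training. It assumes the IsaacGymEnvs `train.py` entry point with Hydra-style `key=value` overrides and the task names `FactoryTaskNutBoltPick`, `FactoryTaskNutBoltPlace`, and `FactoryTaskNutBoltScrew`. The helper `build_train_command` is hypothetical, and the override values are illustrative defaults.

```python
from typing import List

# Sub-tasks of the nut-bolt pipeline, in training order (task names assumed
# to match the IsaacGymEnvs task registry).
NUT_BOLT_SUBTASKS = [
    "FactoryTaskNutBoltPick",
    "FactoryTaskNutBoltPlace",
    "FactoryTaskNutBoltScrew",
]


def build_train_command(task: str, num_envs: int = 128, headless: bool = True) -> List[str]:
    """Hypothetical helper: build one training command as an argument list."""
    return [
        "python",
        "train.py",
        f"task={task}",
        f"num_envs={num_envs}",
        f"headless={headless}",
    ]


commands = [build_train_command(task) for task in NUT_BOLT_SUBTASKS]
for cmd in commands:
    # Print instead of executing, so the sketch runs without Isaac Gym installed.
    print(" ".join(cmd))
```

Each printed line can be run from the directory that contains `train.py`; lowering `num_envs` is the usual first adjustment when GPU memory is tight.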

Execution Steps

Step 1: Asset Preparation and SDF Generation

The first time Factory or IndustReal tasks are run, Isaac Gym generates Signed Distance Fields (SDFs) for all assembly assets (nuts, bolts, pegs, gears, sockets). These SDFs are cached for subsequent runs. Asset dimensions and properties are defined in YAML files under assets/factory/yaml/ and assets/industreal/yaml/.

Key considerations:

  • SDF generation can be time-consuming on first run but is cached afterward
  • SDF resolution (typically 256-512 for small parts) is set in the URDF <sdf> element
  • If Isaac Gym is terminated during SDF generation, the cache may become corrupted and need to be cleared
  • Asset YAML files define precise dimensions used for reward computation and initialization
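To check which resolution an asset requests before triggering a long SDF build, the URDF can be inspected directly. The sketch below assumes the `<sdf>` element sits inside each `<collision>` element and carries a `resolution` attribute; the URDF snippet, the link name, and the helper `sdf_resolutions` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Illustrative URDF fragment; real Factory assets may be organized differently.
URDF_SNIPPET = """
<robot name="example_nut">
  <link name="nut">
    <collision>
      <geometry>
        <mesh filename="nut.obj"/>
      </geometry>
      <sdf resolution="256"/>
    </collision>
  </link>
</robot>
"""


def sdf_resolutions(urdf_text: str) -> Dict[str, int]:
    """Map each link name to the SDF resolution requested in its collision block."""
    root = ET.fromstring(urdf_text.strip())
    resolutions: Dict[str, int] = {}
    for link in root.iter("link"):
        for sdf in link.iter("sdf"):
            resolutions[link.get("name")] = int(sdf.get("resolution"))
    return resolutions


resolutions = sdf_resolutions(URDF_SNIPPET)
print(resolutions)
```

Links without an `<sdf>` element simply do not appear in the result, which makes it easy to spot assets that would fall back to ordinary mesh collisions.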

Step 2: Configuration of Base, Environment, and Task

The hierarchical config system loads three levels of YAML configuration: FactoryBase.yaml (or IndustRealBase.yaml) for physics parameters and Franka setup, the environment config for specific asset definitions, and the task config for RL-specific settings including controller type and reward formulation.

What happens:

  • Base config sets simulation parameters (timestep, substeps, solver iterations, contact offsets)
  • Environment config specifies which assembly assets to load and their randomization ranges
  • Task config defines controller type, observation composition, reward terms, and success criteria
  • Training config (the task's PPO YAML, e.g. FactoryTaskNutBoltPickPPO.yaml) specifies network architecture and PPO hyperparameters
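The division of responsibility between the config levels can be pictured as a small container with one dictionary per level. This is only an illustrative sketch: the class `LayeredConfig`, its method, and every key and value below are hypothetical and are not taken from the actual YAML files, which are loaded through the repository's own config system.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class LayeredConfig:
    """Hypothetical container holding one dictionary per config level."""
    base: Dict[str, Any] = field(default_factory=dict)   # physics and Franka setup
    env: Dict[str, Any] = field(default_factory=dict)    # assets and randomization
    task: Dict[str, Any] = field(default_factory=dict)   # controller, rewards, success
    train: Dict[str, Any] = field(default_factory=dict)  # network and PPO settings

    def owner_of(self, key: str) -> str:
        """Return the name of the first level that defines `key`."""
        for level in ("base", "env", "task", "train"):
            if key in getattr(self, level):
                return level
        raise KeyError(key)


# Illustrative values only.
cfg = LayeredConfig(
    base={"dt": 1.0 / 60.0, "num_substeps": 2},
    env={"assets": ["nut", "bolt"]},
    task={"ctrl_type": "task_space_impedance", "max_episode_length": 100},
    train={"learning_rate": 1e-4},
)
print(cfg.owner_of("ctrl_type"))
```

Keeping the levels separate rather than flattening them mirrors the Base → Env → Task split: a physics change touches only the base level, while switching controllers touches only the task level.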

Step 3: Scene Initialization with Contact Setup

The Factory base class initializes the Franka robot and table, then the environment class adds the assembly-specific assets (nuts/bolts, pegs/holes, or gears/shafts). Collision filtering is configured to enable SDF-mesh contacts between small parts while avoiding unnecessary contact computations.

What happens:

  • Franka and table actors are created with appropriate collision groups
  • Assembly asset actors are created per the environment config
  • SDF-mesh collision pairs are established based on URDF <sdf> tags
  • GPU contact buffers are allocated with sufficient capacity for dense contacts
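The collision-filtering idea can be sketched without the simulator. The rule modeled here is the common group-plus-bitmask scheme: two actors are candidates for contact only if they share a collision group and their filter bitmasks have no bit in common. Treat this as an assumption about the scheme rather than a statement of Isaac Gym's exact semantics (special group values are ignored); `may_collide` and the group and filter values are hypothetical.

```python
def may_collide(group_a: int, filter_a: int, group_b: int, filter_b: int) -> bool:
    """Assumed rule: same group, and no shared bit in the filter masks."""
    same_group = group_a == group_b
    masks_disjoint = (filter_a & filter_b) == 0
    return same_group and masks_disjoint


# Illustrative actors as (collision group, filter mask), one group per environment.
franka_env0 = (0, 0b01)
table_env0 = (0, 0b01)  # shares a filter bit with the Franka, so the pair is skipped
nut_env0 = (0, 0b00)
nut_env1 = (1, 0b00)

print(may_collide(*franka_env0, *nut_env0))   # robot vs. part in the same environment
print(may_collide(*franka_env0, *table_env0)) # pair masked off by the shared bit
print(may_collide(*nut_env0, *nut_env1))      # parts in different environments
```

Giving each parallel environment its own group keeps parts in one environment from generating contacts with parts in another, which matters when dense SDF contacts already strain the GPU contact buffers.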

Step 4: Controller Configuration

The task-level config specifies a high-level controller type which is parsed into low-level controller parameters. The factory_control module converts controller targets (from RL actions) into joint torques using Jacobian computation, with options for joint-space or task-space control with or without inertial compensation.

Key considerations:

  • Controller types range from simple joint PD to full operational-space impedance
  • Controller gains significantly affect training stability and sim-to-real transfer
  • IndustReal removes artificial dissipative terms from the Franka URDF for better transfer
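For intuition about how a task-space controller turns a target into joint torques, a textbook impedance law is tau = Jᵀ (Kp · e − Kd · v), where e is the end-effector pose error, v its velocity, and J the Jacobian. The pure-Python sketch below implements that law with diagonal gains and no inertial compensation. It is not the factory_control implementation; the function name and all numeric values are illustrative.

```python
from typing import List, Sequence


def task_space_impedance_torques(
    jacobian: Sequence[Sequence[float]],  # 6 x num_joints end-effector Jacobian
    pose_error: Sequence[float],          # 6-vector: position and orientation error
    ee_velocity: Sequence[float],         # 6-vector: linear and angular velocity
    kp: Sequence[float],                  # 6 diagonal stiffness gains
    kd: Sequence[float],                  # 6 diagonal damping gains
) -> List[float]:
    """Compute tau = J^T (Kp * e - Kd * v) with diagonal gains."""
    # Desired task-space wrench from the spring-damper law.
    wrench = [p * e - d * v for p, e, d, v in zip(kp, pose_error, kd, ee_velocity)]
    num_joints = len(jacobian[0])
    # Map the wrench to joint torques through the Jacobian transpose.
    return [
        sum(jacobian[row][joint] * wrench[row] for row in range(len(wrench)))
        for joint in range(num_joints)
    ]


# Illustrative 6x7 Jacobian in which only joint 0 moves the end effector along x.
jacobian = [[0.0] * 7 for _ in range(6)]
jacobian[0][0] = 1.0

torques = task_space_impedance_torques(
    jacobian,
    pose_error=[0.01, 0.0, 0.0, 0.0, 0.0, 0.0],
    ee_velocity=[0.0] * 6,
    kp=[100.0] * 6,
    kd=[20.0] * 6,
)
print(torques)
```

Raising `kp` makes the arm track targets more stiffly but amplifies contact forces, which is one reason gain choice affects both training stability and transfer.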

Step 5: Sub-policy Training

For the nut-bolt pipeline, policies are trained sequentially for each sub-task (Pick, Place, Screw). Each sub-task defines its own observation space, reward function, and reset conditions. For IndustReal tasks, Simulation-Aware Policy Update (SAPU) down-weights or discards returns from episodes in which the simulated parts interpenetrate beyond a tolerance, so that the policy is not updated on unreliable contact simulation; SDF-Based Reward provides dense geometric feedback; and SBC gradually increases task difficulty.

What happens:

  • The RL agent trains via PPO on the configured sub-task
  • Actions are applied as controller targets, converted to joint torques each timestep
  • Rewards are computed from task-specific criteria (e.g., distance to grasp pose, insertion depth)
  • Environments are reset upon success or timeout with randomized initial conditions
  • IndustReal tasks use SAPU to weight policy updates by how much the parts interpenetrate in simulation, discounting episodes where contact simulation is unreliable
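The curriculum idea, making the task harder as the policy improves, can be sketched as a success-rate-driven update of a single difficulty parameter. The example below is a generic illustration and not the IndustReal SBC code; the parameter (a maximum initial displacement in metres), the thresholds, the step size, and the bounds are all hypothetical.

```python
def update_max_displacement(
    current: float,
    success_rate: float,
    *,
    success_thresh: float = 0.75,  # harden the task above this success rate
    failure_thresh: float = 0.50,  # ease the task below this success rate
    step: float = 0.01,            # change in displacement per update (m)
    lower: float = 0.01,           # easiest setting (m)
    upper: float = 0.05,           # hardest setting (m)
) -> float:
    """Return the next maximum initial displacement for episode resets."""
    if success_rate > success_thresh:
        return min(current + step, upper)
    if success_rate < failure_thresh:
        return max(current - step, lower)
    return current


# Walk the curriculum through a few illustrative success rates.
max_disp = 0.01
for rate in (0.9, 0.9, 0.6, 0.3):
    max_disp = update_max_displacement(max_disp, rate)
    print(f"success={rate:.1f} -> max displacement {max_disp:.2f} m")
```

Initial states would then be sampled from within the current displacement bound at each reset, so difficulty tracks the policy's measured competence.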

Step 6: Evaluation and Sim-to-Real Preparation

After training, policies are evaluated for success rate and behavioral quality. For sim-to-real transfer, the trained checkpoint is tested with domain randomization to verify robustness. The IndustReal Franka URDF corrections are intended to bring the simulated dynamics closer to those of the real robot.

Key considerations:

  • Pick and Place sub-policies may need an hour of training for high success rates
  • Screw sub-policies converge quickly without initial state randomization
  • IndustReal peg insertion needs 8-10 hours; gear insertion needs 18-20 hours
  • Testing with 5 random seeds and selecting the best is recommended for IndustReal
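Selecting the best of several seeds amounts to computing a success rate per seed and taking the maximum. The sketch below shows that bookkeeping. The per-episode outcomes are made-up placeholders rather than measured results, and the helper names are hypothetical.

```python
from typing import Dict, List


def success_rate(outcomes: List[bool]) -> float:
    """Fraction of evaluation episodes that ended in success."""
    return sum(outcomes) / len(outcomes)


def best_seed(results: Dict[int, List[bool]]) -> int:
    """Return the seed whose evaluation episodes have the highest success rate."""
    return max(results, key=lambda seed: success_rate(results[seed]))


# Placeholder outcomes for 5 seeds (True = successful episode); not real data.
evaluation = {
    0: [True, False, True, True],
    1: [True, True, True, False],
    2: [False, False, True, True],
    3: [True, True, True, True],
    4: [True, False, False, True],
}

chosen = best_seed(evaluation)
print(f"best seed: {chosen}, success rate: {success_rate(evaluation[chosen]):.2f}")
```

In practice each list would hold many more episodes, ideally collected with domain randomization enabled so the ranking reflects robustness and not a single favorable configuration.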
