Implementation: Isaac Sim IsaacGymEnvs IndustReal Sim-to-Real
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Sim_to_Real |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Evaluation wrapper and control signal methods for running trained IndustReal assembly policies in simulation with real-robot-compatible control interfaces.
Description
This implementation covers two layers: the top-level evaluation entry point (train.py in play/test mode) and the IndustRealBase methods that generate hardware-compatible control signals and step the simulation. The generate_ctrl_signals() method converts task-space targets from the RL policy into joint-space commands using the same controller interface that would run on a real Franka robot. The simulate_and_refresh() method steps the physics simulation and refreshes all GPU tensor handles for the next observation.
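The actual task-space-to-joint-space math lives in factory_control and depends on the configured controller type. As a rough illustration only, the sketch below shows one common choice, a damped least-squares IK step; this is an assumption for exposition, not necessarily the controller IndustReal uses:

```python
import numpy as np

def ik_delta(jacobian: np.ndarray, pose_error: np.ndarray,
             damping: float = 0.05) -> np.ndarray:
    """One damped least-squares IK step: dq = J^T (J J^T + lambda^2 I)^-1 err.

    jacobian:   [6, num_dofs] fingertip Jacobian
    pose_error: [6] concatenated position and orientation error
    damping:    regularizer that keeps the solve stable near singularities
    """
    lam2 = damping ** 2
    A = jacobian @ jacobian.T + lam2 * np.eye(6)
    return jacobian.T @ np.linalg.solve(A, pose_error)

# Toy 7-DOF example: the joint-space step reduces the task-space error.
J = np.random.default_rng(0).standard_normal((6, 7))
err = np.array([0.01, 0.0, -0.02, 0.0, 0.0, 0.0])
dq = ik_delta(J, err)
dof_pos_target = np.zeros(7) + dq  # current dof_pos assumed zero here
```

In the real pipeline the per-environment Jacobians come from the refreshed GPU tensors, and the solve is batched in torch rather than looped in numpy.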
Usage
Use this implementation to evaluate a trained checkpoint in simulation before deploying to a real robot. The evaluation is launched via the standard train.py entry point with test=True and a checkpoint path.
Code Reference
Source Location
- Repository: IsaacGymEnvs
- File: isaacgymenvs/train.py, Lines 210-215
- File: isaacgymenvs/tasks/industreal/industreal_base.py, Lines 353-410
Key Methods
Evaluation Entry Point
# In train.py (Lines 210-215)
# When test=True, the runner loads the checkpoint and runs in play mode:
runner.run({"train": False, "play": True, "checkpoint": checkpoint_path})
# This triggers:
# 1. Policy loaded from checkpoint (deterministic mode)
# 2. Environment stepped with policy actions
# 3. Metrics collected (success rate, insertion depth)
Control Signal Generation
class IndustRealBase(FactoryBase, FactoryABCBase):

    def generate_ctrl_signals(self):  # L353
        """Generate hardware-compatible control signals from RL actions.

        Converts task-space fingertip targets to joint-space commands
        using the configured controller type (IK, impedance, etc.).

        Updates:
            self.dof_pos_target or self.dof_torque, depending on controller type.
            self.ctrl_target_gripper_dof_pos for gripper commands.
        """

    def simulate_and_refresh(self):  # L410
        """Step the physics simulation and refresh GPU tensor handles.

        Performs:
            1. gym.simulate(sim) - advance physics
            2. gym.fetch_results(sim) - synchronize GPU
            3. gym.refresh_*_tensor() - update all state tensors
               (dof_state, rigid_body_state, contact_force, jacobian, mass_matrix)
        """
Import
# Evaluation is launched via the command line, not a direct import:
python train.py task=IndustRealTaskPegsInsert test=True \
checkpoint=runs/IndustRealTaskPegsInsert/nn/last.pth
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| checkpoint | str (file path) | Yes | Path to trained policy checkpoint (.pth file) |
| task config | YAML | Yes | Task configuration (same as training; may override num_envs) |
| test | bool | Yes | Must be True to enable evaluation mode |
| self.ctrl_target_fingertip_midpoint_pos | torch.Tensor | Yes | Target fingertip position from policy [num_envs, 3] |
| self.ctrl_target_fingertip_midpoint_quat | torch.Tensor | Yes | Target fingertip quaternion from policy [num_envs, 4] |
Outputs
| Name | Type | Description |
|---|---|---|
| Success rate | float | Fraction of environments achieving successful engagement/insertion |
| Insertion depth | float | Average depth of plug insertion into the socket (meters) |
| self.dof_pos_target | torch.Tensor | Joint position targets sent to simulation [num_envs, num_dofs] |
| self.dof_torque | torch.Tensor | Joint torques sent to simulation [num_envs, num_dofs] |
| Updated state tensors | torch.Tensor | All GPU state tensors refreshed after the simulation step |
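The two scalar metrics are aggregates over the per-environment results. The sketch below shows that aggregation; the 0.01 m success threshold is a hypothetical value chosen for illustration, not the criterion IndustReal actually uses:

```python
import numpy as np

def summarize_eval(insertion_depth: np.ndarray, depth_thresh: float = 0.01):
    """Aggregate per-env insertion depths into the two scalar metrics.

    depth_thresh is a hypothetical success criterion (meters), used here
    only to demonstrate the reduction from per-env values to scalars.
    """
    success_rate = float((insertion_depth >= depth_thresh).mean())
    mean_depth = float(insertion_depth.mean())
    return success_rate, mean_depth

depths = np.array([0.012, 0.002, 0.015, 0.011])  # per-env depths (meters)
rate, mean_depth = summarize_eval(depths)
```

With many environments (e.g. num_envs=4096 in the headless example below) the success-rate estimate has a correspondingly tighter confidence interval, which is why evaluation runs often use far more environments than interactive debugging.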
Usage Examples
Running Evaluation
# Evaluate a trained IndustReal insertion policy
python train.py task=IndustRealTaskPegsInsert \
test=True \
checkpoint=runs/IndustRealTaskPegsInsert/nn/last.pth \
num_envs=64 \
headless=False
# Headless evaluation with more environments for statistics
python train.py task=IndustRealTaskPegsInsert \
test=True \
checkpoint=runs/IndustRealTaskPegsInsert/nn/last.pth \
num_envs=4096 \
headless=True
Control Signal Flow in Evaluation
# The evaluation loop (inside the runner) calls these methods each step:
# 1. Policy forward pass (deterministic)
actions = policy.act(observations) # [num_envs, 6]
# 2. pre_physics_step converts actions to controller targets
task.pre_physics_step(actions)
# -> Scales actions by pos_action_scale, rot_action_scale
# -> Sets self.ctrl_target_fingertip_midpoint_pos/quat
# 3. generate_ctrl_signals converts targets to joint commands
task.generate_ctrl_signals()
# -> Calls factory_control.compute_dof_pos_target() or compute_dof_torque()
# -> Sets self.dof_pos_target or self.dof_torque
# 4. simulate_and_refresh steps physics and updates tensors
task.simulate_and_refresh()
# -> gym.simulate(sim)
# -> gym.refresh_dof_state_tensor(sim)
# -> gym.refresh_rigid_body_state_tensor(sim)
# -> gym.refresh_net_contact_force_tensor(sim)
# 5. compute_observations builds next observation
task.compute_observations()
# 6. compute_reward tracks success metrics
task.compute_reward()
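The six steps above can be wired into a single per-step function. The stub classes below are stand-ins for the rl_games runner's policy and the IndustReal task, not the real API; they exist only to show the call sequence end to end:

```python
class StubPolicy:
    """Stand-in for the deterministic play-mode policy; returns a zero 6-DOF action."""
    def act(self, obs):
        return [0.0] * 6

class StubTask:
    """Stand-in for the IndustReal task; records the order of per-step calls."""
    def __init__(self):
        self.log = []
    def pre_physics_step(self, actions):
        self.log.append("pre_physics_step")
    def generate_ctrl_signals(self):
        self.log.append("generate_ctrl_signals")
    def simulate_and_refresh(self):
        self.log.append("simulate_and_refresh")
    def compute_observations(self):
        self.log.append("compute_observations")
        return [0.0]
    def compute_reward(self):
        self.log.append("compute_reward")

def eval_step(policy, task, obs):
    # Steps 1-6 from the flow above, in order.
    actions = policy.act(obs)                 # 1. deterministic forward pass
    task.pre_physics_step(actions)            # 2. actions -> controller targets
    task.generate_ctrl_signals()              # 3. targets -> joint commands
    task.simulate_and_refresh()               # 4. step physics, refresh tensors
    obs = task.compute_observations()         # 5. build next observation
    task.compute_reward()                     # 6. track success metrics
    return obs

policy, task = StubPolicy(), StubTask()
obs = eval_step(policy, task, obs=[0.0])
```

In the real system this loop lives inside the rl_games runner; the point of the sketch is that the task never sees raw policy actions after step 2, only controller targets.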