Implementation: Isaac Sim IsaacGymEnvs IndustReal Sim-to-Real
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Sim_to_Real |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Evaluation wrapper and control signal methods for running trained IndustReal assembly policies in simulation with real-robot-compatible control interfaces.
Description
This implementation covers two layers: the top-level evaluation entry point (train.py in play/test mode) and the IndustRealBase methods that generate hardware-compatible control signals and step the simulation. The generate_ctrl_signals() method converts task-space targets from the RL policy into joint-space commands using the same controller interface that would run on a real Franka robot. The simulate_and_refresh() method steps the physics simulation and refreshes all GPU tensor handles for the next observation.
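The actual task-space-to-joint-space math lives in factory_control and depends on the configured controller type. As a rough illustration only, the sketch below shows one common choice, a damped least-squares IK step; this is an assumption for exposition, not necessarily the controller IndustReal uses:

```python
import numpy as np

def ik_delta(jacobian: np.ndarray, pose_error: np.ndarray,
             damping: float = 0.05) -> np.ndarray:
    """One damped least-squares IK step: dq = J^T (J J^T + lambda^2 I)^-1 err.

    jacobian:   [6, num_dofs] fingertip Jacobian
    pose_error: [6] concatenated position and orientation error
    damping:    regularizer that keeps the solve stable near singularities
    """
    lam2 = damping ** 2
    A = jacobian @ jacobian.T + lam2 * np.eye(6)
    return jacobian.T @ np.linalg.solve(A, pose_error)

# Toy 7-DOF example: the joint-space step reduces the task-space error.
J = np.random.default_rng(0).standard_normal((6, 7))
err = np.array([0.01, 0.0, -0.02, 0.0, 0.0, 0.0])
dq = ik_delta(J, err)
dof_pos_target = np.zeros(7) + dq  # current dof_pos assumed zero here
```

In the real pipeline the per-environment Jacobians come from the refreshed GPU tensors, and the solve is batched in torch rather than looped in numpy.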
Usage
Use this implementation to evaluate a trained checkpoint in simulation before deploying to a real robot. The evaluation is launched via the standard train.py entry point with test=True and a checkpoint path.
Code Reference
Source Location
- Repository: IsaacGymEnvs
- File: isaacgymenvs/train.py, Lines 210-215
- File: isaacgymenvs/tasks/industreal/industreal_base.py, Lines 353-410
Key Methods
Evaluation Entry Point
# In train.py (Lines 210-215)
# When test=True, the runner loads the checkpoint and runs in play mode:
runner.run({"train": False, "play": True, "checkpoint": checkpoint_path})
# This triggers:
# 1. Policy loaded from checkpoint (deterministic mode)
# 2. Environment stepped with policy actions
# 3. Metrics collected (success rate, insertion depth)
Control Signal Generation
class IndustRealBase(FactoryBase, FactoryABCBase):

    def generate_ctrl_signals(self):  # L353
        """Generate hardware-compatible control signals from RL actions.

        Converts task-space fingertip targets to joint-space commands
        using the configured controller type (IK, impedance, etc.).

        Updates:
            self.dof_pos_target or self.dof_torque, depending on controller type.
            self.ctrl_target_gripper_dof_pos for gripper commands.
        """

    def simulate_and_refresh(self):  # L410
        """Step the physics simulation and refresh GPU tensor handles.

        Performs:
            1. gym.simulate(sim) - advance physics
            2. gym.fetch_results(sim) - synchronize GPU
            3. gym.refresh_*_tensor() - update all state tensors
               (dof_state, rigid_body_state, contact_force, jacobian, mass_matrix)
        """
Import
# Evaluation is launched via the command line, not a direct import:
python train.py task=IndustRealTaskPegsInsert test=True \
checkpoint=runs/IndustRealTaskPegsInsert/nn/last.pth
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| checkpoint | str (file path) | Yes | Path to trained policy checkpoint (.pth file) |
| task config | YAML | Yes | Task configuration (same as training; may override num_envs) |
| test | bool | Yes | Must be True to enable evaluation mode |
| self.ctrl_target_fingertip_midpoint_pos | torch.Tensor | Yes | Target fingertip position from policy [num_envs, 3] |
| self.ctrl_target_fingertip_midpoint_quat | torch.Tensor | Yes | Target fingertip quaternion from policy [num_envs, 4] |
Outputs
| Name | Type | Description |
|---|---|---|
| Success rate | float | Fraction of environments achieving successful engagement/insertion |
| Insertion depth | float | Average depth of plug insertion into the socket (meters) |
| self.dof_pos_target | torch.Tensor | Joint position targets sent to simulation [num_envs, num_dofs] |
| self.dof_torque | torch.Tensor | Joint torques sent to simulation [num_envs, num_dofs] |
| Updated state tensors | torch.Tensor | All GPU state tensors refreshed after the simulation step |
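The two scalar metrics are aggregates over the per-environment results. The sketch below shows that aggregation; the 0.01 m success threshold is a hypothetical value chosen for illustration, not the criterion IndustReal actually uses:

```python
import numpy as np

def summarize_eval(insertion_depth: np.ndarray, depth_thresh: float = 0.01):
    """Aggregate per-env insertion depths into the two scalar metrics.

    depth_thresh is a hypothetical success criterion (meters), used here
    only to demonstrate the reduction from per-env values to scalars.
    """
    success_rate = float((insertion_depth >= depth_thresh).mean())
    mean_depth = float(insertion_depth.mean())
    return success_rate, mean_depth

depths = np.array([0.012, 0.002, 0.015, 0.011])  # per-env depths (meters)
rate, mean_depth = summarize_eval(depths)
```

With many environments (e.g. num_envs=4096 in the headless example below) the success-rate estimate has a correspondingly tighter confidence interval, which is why evaluation runs often use far more environments than interactive debugging.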
Usage Examples
Running Evaluation
# Evaluate a trained IndustReal insertion policy
python train.py task=IndustRealTaskPegsInsert \
test=True \
checkpoint=runs/IndustRealTaskPegsInsert/nn/last.pth \
num_envs=64 \
headless=False
# Headless evaluation with more environments for statistics
python train.py task=IndustRealTaskPegsInsert \
test=True \
checkpoint=runs/IndustRealTaskPegsInsert/nn/last.pth \
num_envs=4096 \
headless=True
Control Signal Flow in Evaluation
# The evaluation loop (inside the runner) calls these methods each step:
# 1. Policy forward pass (deterministic)
actions = policy.act(observations) # [num_envs, 6]
# 2. pre_physics_step converts actions to controller targets
task.pre_physics_step(actions)
# -> Scales actions by pos_action_scale, rot_action_scale
# -> Sets self.ctrl_target_fingertip_midpoint_pos/quat
# 3. generate_ctrl_signals converts targets to joint commands
task.generate_ctrl_signals()
# -> Calls factory_control.compute_dof_pos_target() or compute_dof_torque()
# -> Sets self.dof_pos_target or self.dof_torque
# 4. simulate_and_refresh steps physics and updates tensors
task.simulate_and_refresh()
# -> gym.simulate(sim)
# -> gym.refresh_dof_state_tensor(sim)
# -> gym.refresh_rigid_body_state_tensor(sim)
# -> gym.refresh_net_contact_force_tensor(sim)
# 5. compute_observations builds next observation
task.compute_observations()
# 6. compute_reward tracks success metrics
task.compute_reward()
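The six steps above can be wired into a single per-step function. The stub classes below are stand-ins for the rl_games runner's policy and the IndustReal task, not the real API; they exist only to show the call sequence end to end:

```python
class StubPolicy:
    """Stand-in for the deterministic play-mode policy; returns a zero 6-DOF action."""
    def act(self, obs):
        return [0.0] * 6

class StubTask:
    """Stand-in for the IndustReal task; records the order of per-step calls."""
    def __init__(self):
        self.log = []
    def pre_physics_step(self, actions):
        self.log.append("pre_physics_step")
    def generate_ctrl_signals(self):
        self.log.append("generate_ctrl_signals")
    def simulate_and_refresh(self):
        self.log.append("simulate_and_refresh")
    def compute_observations(self):
        self.log.append("compute_observations")
        return [0.0]
    def compute_reward(self):
        self.log.append("compute_reward")

def eval_step(policy, task, obs):
    # Steps 1-6 from the flow above, in order.
    actions = policy.act(obs)                 # 1. deterministic forward pass
    task.pre_physics_step(actions)            # 2. actions -> controller targets
    task.generate_ctrl_signals()              # 3. targets -> joint commands
    task.simulate_and_refresh()               # 4. step physics, refresh tensors
    obs = task.compute_observations()         # 5. build next observation
    task.compute_reward()                     # 6. track success metrics
    return obs

policy, task = StubPolicy(), StubTask()
obs = eval_step(policy, task, obs=[0.0])
```

In the real system this loop lives inside the rl_games runner; the point of the sketch is that the task never sees raw policy actions after step 2, only controller targets.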