Implementation:Haosulab ManiSkill Sim2Real Training Wrapper
| Field | Value |
|---|---|
| Implementation Name | Sim2Real Training Wrapper |
| Type | Wrapper Doc |
| Domain | Sim2Real |
| Date | 2026-02-15 |
| Repository | Haosulab/ManiSkill |
Overview
The Sim2Real Training Wrapper describes the pattern for using ManiSkill's digital twin environments with standard training pipelines (PPO, behavioral cloning, Diffusion Policy) to produce policies suitable for real-world deployment. Rather than being a single class, this is a composition pattern that combines existing components.
Description
The sim2real training pattern differs from standard ManiSkill training in three ways:
- Environment selection: Instead of using a standard environment (e.g., `PickCube-v1`), a digital twin variant is used (e.g., `GraspCubeSO100Digital-v1`). These environments inherit from `BaseDigitalTwinEnv` and include greenscreen compositing and domain randomization.
- Observation mode: The observation mode must include visual data (`rgb`) because real-world deployment relies on camera observations. Privileged state information (object poses, contact forces) is not available on real hardware and should not be used.
- Training algorithm: Any standard algorithm can be used. The digital twin environment is a drop-in replacement for the standard environment in existing training scripts.
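To make the greenscreen compositing mentioned above concrete, here is a minimal NumPy sketch of the idea: background pixels in the simulated render are replaced with pixels from a photo of the real workspace, so the policy trains on images that resemble the deployment camera. The function name and mask convention are illustrative, not ManiSkill's internal API.

```python
import numpy as np

def greenscreen_composite(sim_rgb: np.ndarray,
                          background_mask: np.ndarray,
                          real_background: np.ndarray) -> np.ndarray:
    """Replace simulated background pixels with a real camera image.

    sim_rgb:         (H, W, 3) rendered frame from the simulator
    background_mask: (H, W) boolean, True where the pixel is background
    real_background: (H, W, 3) photo captured from the real camera
    """
    out = sim_rgb.copy()
    out[background_mask] = real_background[background_mask]
    return out

# Toy 2x2 example: the one masked pixel takes the real image's color
sim = np.zeros((2, 2, 3), dtype=np.uint8)          # all-black render
mask = np.array([[True, False], [False, False]])   # top-left is background
real = np.full((2, 2, 3), 200, dtype=np.uint8)     # gray real photo
composited = greenscreen_composite(sim, mask, real)
print(composited[0, 0])  # -> [200 200 200]
print(composited[0, 1])  # -> [0 0 0]
```

In the actual digital twin environments this happens inside rendering, driven by segmentation of the robot and task objects; the sketch only shows the pixel-level operation.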
Usage
The training script is identical to standard training, with only the environment ID changed:
```python
import gymnasium as gym

# Standard RL training setup, but with a digital twin environment
env = gym.make(
    "GraspCubeSO100Digital-v1",  # Digital twin variant
    obs_mode="rgb",              # Visual observations for real-world transfer
    control_mode="pd_joint_pos",
    num_envs=256,
    sim_backend="gpu",
)

# Apply standard wrappers
from mani_skill.utils.wrappers import CPUGymWrapper
# ... additional wrappers as needed ...

# Train with PPO, BC, Diffusion Policy, etc.
# The training loop is unchanged from standard training.
```
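Because the training loop is unchanged, the only contract a policy must respect is that it consumes pixel observations. The sketch below demonstrates this with a stub environment that mimics the Gymnasium API surface a rollout loop relies on; `StubRGBEnv`, its shapes, and the 7-dim action are illustrative placeholders, not ManiSkill classes.

```python
import numpy as np

class StubRGBEnv:
    """Stand-in for a digital twin env with obs_mode='rgb' (illustrative only).

    Returns random pixels where the real environment would return
    greenscreen-composited camera renders.
    """
    def __init__(self, height=64, width=64, horizon=10):
        self.h, self.w, self.horizon = height, width, horizon
        self.t = 0

    def _obs(self):
        return {"rgb": np.random.randint(0, 256, (self.h, self.w, 3), dtype=np.uint8)}

    def reset(self):
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon   # episode ends at the horizon
        return self._obs(), 0.0, False, truncated, {}

def collect_rollout(env, policy):
    """Run one episode; the policy sees pixels only, never privileged state."""
    obs, _ = env.reset()
    steps = 0
    while True:
        action = policy(obs["rgb"])          # pixels in, action out
        obs, reward, terminated, truncated, _ = env.step(action)
        steps += 1
        if terminated or truncated:
            return steps

random_policy = lambda rgb: np.zeros(7)      # placeholder 7-DoF action
print(collect_rollout(StubRGBEnv(), random_policy))  # -> 10
```

Any loop written this way works identically against the stub and against the real digital twin environment, which is what makes the swap a drop-in replacement.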
Code Reference
Environment Configuration Pattern
The key configuration choices for sim2real training:
```python
import gymnasium as gym

# For RL training (PPO):
env = gym.make(
    "GraspCubeSO100Digital-v1",
    obs_mode="rgb",
    control_mode="pd_joint_pos",
    num_envs=1024,       # Large batch for GPU-parallel PPO
    sim_backend="gpu",
    render_mode="sensors",
)

# For imitation learning (BC/Diffusion Policy):
# First, collect demonstrations in the digital twin environment
env = gym.make(
    "GraspCubeSO100Digital-v1",
    obs_mode="rgb",
    control_mode="pd_joint_pos",
    sim_backend="cpu",   # CPU for motion planning demo collection
    render_mode="rgb_array",
)
```
Cross-References to Other Workflows
The sim2real training pattern reuses these existing workflows:
| Workflow | Usage in Sim2Real |
|---|---|
| RL Training with PPO | Train visual policy on digital twin environment with domain randomization |
| Imitation Learning Pipeline | Collect demonstrations via motion planning or teleoperation in digital twin, then train BC/Diffusion Policy |
| Motion Planning Demo Generation | Generate expert demonstrations in the digital twin environment for imitation learning |
I/O Contract
| Direction | Data | Format |
|---|---|---|
| Input (training) | Digital twin environment | Gymnasium environment with obs_mode="rgb" |
| Input (IL) | Demonstration trajectories | HDF5 files from digital twin environment |
| Output | Trained policy checkpoint | PyTorch model weights (format depends on training framework) |
| Deployment | Policy + Sim2RealEnv | See Implementation:Haosulab_ManiSkill_Sim2RealEnv |
Key Differences from Standard Training
| Aspect | Standard Training | Sim2Real Training |
|---|---|---|
| Environment ID | PickCube-v1 | GraspCubeSO100Digital-v1 |
| Observation mode | state (privileged) | rgb (visual only) |
| Background | Simulated | Greenscreened real image |
| Domain randomization | Optional | Strongly recommended |
| Target deployment | Simulation evaluation | Real robot hardware |
Usage Examples
```python
# Complete sim2real training workflow example
import subprocess

import gymnasium as gym

# Step 1: Generate demonstrations in digital twin
subprocess.run([
    "python", "-m", "mani_skill.examples.motionplanning.panda.run",
    "-e", "GraspCubeSO100Digital-v1",
    "-n", "100",
    "--record-dir", "demos",
])

# Step 2: Train imitation learning policy
# (Using your preferred IL framework with the generated demonstrations)

# Step 3: Deploy on real robot
from mani_skill.envs.sim2real_env import Sim2RealEnv

sim_env = gym.make("GraspCubeSO100Digital-v1", obs_mode="rgb",
                   control_mode="pd_joint_pos")
real_env = Sim2RealEnv(sim_env=sim_env, agent=real_agent)  # real_agent: connected hardware interface

obs, info = real_env.reset()
while True:
    action = trained_policy(obs)  # trained_policy: the policy produced in Step 2
    obs, reward, terminated, truncated, info = real_env.step(action)
    if terminated or truncated:
        break
real_env.close()
```
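One practical detail the deployment loop above glosses over: real hardware expects actions at a steady control frequency, while policy inference time varies step to step. A simple rate limiter that sleeps away the remainder of each control period is a common fix; `RateLimiter` and the 50 Hz figure below are illustrative assumptions, not ManiSkill components.

```python
import time

class RateLimiter:
    """Hold a deployment loop to a fixed control frequency (hypothetical helper)."""
    def __init__(self, hz: float):
        self.period = 1.0 / hz
        self.next_tick = time.monotonic()

    def sleep(self):
        """Sleep until the next tick, absorbing variable inference time."""
        self.next_tick += self.period
        remaining = self.next_tick - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)

limiter = RateLimiter(hz=50)          # assumed 50 Hz control rate
start = time.monotonic()
for _ in range(10):
    # policy inference + real_env.step(action) would run here
    limiter.sleep()
elapsed = time.monotonic() - start    # roughly 0.2 s for 10 ticks at 50 Hz
```

Scheduling against an absolute `next_tick` (rather than sleeping a fixed amount after each step) keeps the average rate correct even when individual inference calls run long.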