Implementation:Haosulab ManiSkill Sim2Real Training Wrapper
| Field | Value |
|---|---|
| Implementation Name | Sim2Real Training Wrapper |
| Type | Wrapper Doc |
| Domain | Sim2Real |
| Date | 2026-02-15 |
| Repository | Haosulab/ManiSkill |
Overview
The Sim2Real Training Wrapper describes the pattern for using ManiSkill's digital twin environments with standard training pipelines (PPO, behavioral cloning, Diffusion Policy) to produce policies suitable for real-world deployment. Rather than being a single class, this is a composition pattern that combines existing components.
Description
The sim2real training pattern differs from standard ManiSkill training in three ways:
- Environment selection: Instead of using a standard environment (e.g., `PickCube-v1`), a digital twin variant is used (e.g., `GraspCubeSO100Digital-v1`). These environments inherit from `BaseDigitalTwinEnv` and include greenscreen compositing and domain randomization.
- Observation mode: The observation mode must include visual data (`rgb`) because real-world deployment relies on camera observations. Privileged state information (object poses, contact forces) is not available on real hardware and should not be used.
- Training algorithm: Any standard algorithm can be used. The digital twin environment is a drop-in replacement for the standard environment in existing training scripts.
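To make the greenscreen compositing mentioned above concrete, here is a minimal NumPy sketch of the idea: background pixels in the simulated render are replaced with pixels from a photo of the real workspace, so the policy trains on images that resemble the deployment camera. The function name and mask convention are illustrative, not ManiSkill's internal API.

```python
import numpy as np

def greenscreen_composite(sim_rgb: np.ndarray,
                          background_mask: np.ndarray,
                          real_background: np.ndarray) -> np.ndarray:
    """Replace simulated background pixels with a real camera image.

    sim_rgb:         (H, W, 3) rendered frame from the simulator
    background_mask: (H, W) boolean, True where the pixel is background
    real_background: (H, W, 3) photo captured from the real camera
    """
    out = sim_rgb.copy()
    out[background_mask] = real_background[background_mask]
    return out

# Toy 2x2 example: the one masked pixel takes the real image's color
sim = np.zeros((2, 2, 3), dtype=np.uint8)          # all-black render
mask = np.array([[True, False], [False, False]])   # top-left is background
real = np.full((2, 2, 3), 200, dtype=np.uint8)     # gray real photo
composited = greenscreen_composite(sim, mask, real)
print(composited[0, 0])  # -> [200 200 200]
print(composited[0, 1])  # -> [0 0 0]
```

In the actual digital twin environments this happens inside rendering, driven by segmentation of the robot and task objects; the sketch only shows the pixel-level operation.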
Usage
The training script is identical to standard training, with only the environment ID changed:
```python
import gymnasium as gym

# Standard RL training setup, but with a digital twin environment
env = gym.make(
    "GraspCubeSO100Digital-v1",  # Digital twin variant
    obs_mode="rgb",              # Visual observations for real-world transfer
    control_mode="pd_joint_pos",
    num_envs=256,
    sim_backend="gpu",
)

# Apply standard wrappers
from mani_skill.utils.wrappers import CPUGymWrapper
# ... additional wrappers as needed ...

# Train with PPO, BC, Diffusion Policy, etc.
# The training loop is unchanged from standard training.
```
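Because the training loop is unchanged, the only contract a policy must respect is that it consumes pixel observations. The sketch below demonstrates this with a stub environment that mimics the Gymnasium API surface a rollout loop relies on; `StubRGBEnv`, its shapes, and the 7-dim action are illustrative placeholders, not ManiSkill classes.

```python
import numpy as np

class StubRGBEnv:
    """Stand-in for a digital twin env with obs_mode='rgb' (illustrative only).

    Returns random pixels where the real environment would return
    greenscreen-composited camera renders.
    """
    def __init__(self, height=64, width=64, horizon=10):
        self.h, self.w, self.horizon = height, width, horizon
        self.t = 0

    def _obs(self):
        return {"rgb": np.random.randint(0, 256, (self.h, self.w, 3), dtype=np.uint8)}

    def reset(self):
        self.t = 0
        return self._obs(), {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon   # episode ends at the horizon
        return self._obs(), 0.0, False, truncated, {}

def collect_rollout(env, policy):
    """Run one episode; the policy sees pixels only, never privileged state."""
    obs, _ = env.reset()
    steps = 0
    while True:
        action = policy(obs["rgb"])          # pixels in, action out
        obs, reward, terminated, truncated, _ = env.step(action)
        steps += 1
        if terminated or truncated:
            return steps

random_policy = lambda rgb: np.zeros(7)      # placeholder 7-DoF action
print(collect_rollout(StubRGBEnv(), random_policy))  # -> 10
```

Any loop written this way works identically against the stub and against the real digital twin environment, which is what makes the swap a drop-in replacement.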
Code Reference
Environment Configuration Pattern
The key configuration choices for sim2real training:
```python
import gymnasium as gym

# For RL training (PPO):
env = gym.make(
    "GraspCubeSO100Digital-v1",
    obs_mode="rgb",
    control_mode="pd_joint_pos",
    num_envs=1024,       # Large batch for GPU-parallel PPO
    sim_backend="gpu",
    render_mode="sensors",
)

# For imitation learning (BC/Diffusion Policy):
# First, collect demonstrations in the digital twin environment
env = gym.make(
    "GraspCubeSO100Digital-v1",
    obs_mode="rgb",
    control_mode="pd_joint_pos",
    sim_backend="cpu",   # CPU for motion planning demo collection
    render_mode="rgb_array",
)
```
Cross-References to Other Workflows
The sim2real training pattern reuses these existing workflows:
| Workflow | Usage in Sim2Real |
|---|---|
| RL Training with PPO | Train visual policy on digital twin environment with domain randomization |
| Imitation Learning Pipeline | Collect demonstrations via motion planning or teleoperation in digital twin, then train BC/Diffusion Policy |
| Motion Planning Demo Generation | Generate expert demonstrations in the digital twin environment for imitation learning |
I/O Contract
| Direction | Data | Format |
|---|---|---|
| Input (training) | Digital twin environment | Gymnasium environment with obs_mode="rgb" |
| Input (IL) | Demonstration trajectories | HDF5 files from digital twin environment |
| Output | Trained policy checkpoint | PyTorch model weights (format depends on training framework) |
| Deployment | Policy + Sim2RealEnv | See Implementation:Haosulab_ManiSkill_Sim2RealEnv |
Key Differences from Standard Training
| Aspect | Standard Training | Sim2Real Training |
|---|---|---|
| Environment ID | PickCube-v1 | GraspCubeSO100Digital-v1 |
| Observation mode | state (privileged) | rgb (visual only) |
| Background | Simulated | Greenscreened real image |
| Domain randomization | Optional | Strongly recommended |
| Target deployment | Simulation evaluation | Real robot hardware |
Usage Examples
```python
# Complete sim2real training workflow example
import subprocess

import gymnasium as gym

# Step 1: Generate demonstrations in digital twin
subprocess.run([
    "python", "-m", "mani_skill.examples.motionplanning.panda.run",
    "-e", "GraspCubeSO100Digital-v1",
    "-n", "100",
    "--record-dir", "demos",
])

# Step 2: Train imitation learning policy
# (Using your preferred IL framework with the generated demonstrations)

# Step 3: Deploy on real robot
from mani_skill.envs.sim2real_env import Sim2RealEnv

sim_env = gym.make("GraspCubeSO100Digital-v1", obs_mode="rgb",
                   control_mode="pd_joint_pos")
real_env = Sim2RealEnv(sim_env=sim_env, agent=real_agent)  # real_agent: connected hardware interface

obs, info = real_env.reset()
while True:
    action = trained_policy(obs)  # trained_policy: the policy produced in Step 2
    obs, reward, terminated, truncated, info = real_env.step(action)
    if terminated or truncated:
        break
real_env.close()
```
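One practical detail the deployment loop above glosses over: real hardware expects actions at a steady control frequency, while policy inference time varies step to step. A simple rate limiter that sleeps away the remainder of each control period is a common fix; `RateLimiter` and the 50 Hz figure below are illustrative assumptions, not ManiSkill components.

```python
import time

class RateLimiter:
    """Hold a deployment loop to a fixed control frequency (hypothetical helper)."""
    def __init__(self, hz: float):
        self.period = 1.0 / hz
        self.next_tick = time.monotonic()

    def sleep(self):
        """Sleep until the next tick, absorbing variable inference time."""
        self.next_tick += self.period
        remaining = self.next_tick - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)

limiter = RateLimiter(hz=50)          # assumed 50 Hz control rate
start = time.monotonic()
for _ in range(10):
    # policy inference + real_env.step(action) would run here
    limiter.sleep()
elapsed = time.monotonic() - start    # roughly 0.2 s for 10 ticks at 50 Hz
```

Scheduling against an absolute `next_tick` (rather than sleeping a fixed amount after each step) keeps the average rate correct even when individual inference calls run long.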