Principle:Haosulab ManiSkill Sim2Real Bridge

Field	Value
Principle Name	Sim2Real Bridge
Domain	Sim2Real
Overview	Bridging simulation and physical robot through a unified environment interface
Date	2026-02-15
Repository	Haosulab/ManiSkill

Overview

The Sim2Real Bridge principle describes how ManiSkill provides a single Gymnasium-compatible environment interface that can be backed either by simulation physics or by real hardware I/O. With this abstraction, a trained policy can be deployed on a physical robot without modifying the inference code: the same env.step(action) and env.reset() calls are used whether the environment is driven by a physics simulator or by real motors and cameras.

Description

The bridge operates by wrapping a simulation environment and a real robot agent together into a single Sim2RealEnv object. The key design decisions are:

Observation Space Alignment

The bridge ensures that observations returned in the real environment have the same structure, shape, and data type as those from simulation:

Proprioception: Joint positions and velocities are read from the real robot's motors and formatted identically to the simulated robot's qpos and qvel.
Visual data: Camera images from real sensors are center-cropped and resized to match the simulation camera resolution, then structured in the same dictionary format (rgb, depth).
Extra observations: If the simulation environment computes additional features (e.g., TCP pose), the bridge delegates to the simulation's _get_obs_extra method, which uses forward kinematics from the (synced) simulated robot.
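
The visual preprocessing step can be sketched as follows. This is a minimal illustration using NumPy only; the function name center_crop_and_resize and the nearest-neighbor resampling are assumptions made for the example, not ManiSkill's actual implementation, which may use a different interpolation method.

```python
import numpy as np


def center_crop_and_resize(image: np.ndarray, target_h: int, target_w: int) -> np.ndarray:
    """Center-crop an (H, W, C) image to the target aspect ratio, then resize.

    Hypothetical helper: nearest-neighbor sampling keeps the sketch dependency-light.
    """
    h, w = image.shape[:2]
    target_aspect = target_w / target_h
    if w / h > target_aspect:
        # Image is too wide for the target: crop columns around the center.
        crop_w = int(round(h * target_aspect))
        x0 = (w - crop_w) // 2
        image = image[:, x0:x0 + crop_w]
    else:
        # Image is too tall (or already matches): crop rows around the center.
        crop_h = int(round(w / target_aspect))
        y0 = (h - crop_h) // 2
        image = image[y0:y0 + crop_h, :]
    h, w = image.shape[:2]
    # Nearest-neighbor resize: pick one source row/column per output pixel.
    rows = (np.arange(target_h) * h) // target_h
    cols = (np.arange(target_w) * w) // target_w
    return image[rows[:, None], cols]


real_frame = np.zeros((480, 640, 3), dtype=np.uint8)  # e.g. a 640x480 camera frame
sim_like = center_crop_and_resize(real_frame, 128, 128)  # match a 128x128 sim camera
```

Cropping before resizing preserves the aspect ratio of the scene, so objects are not stretched relative to what the policy saw in simulation.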

Action Space Alignment

The bridge reuses the simulation's controller to compute joint targets from high-level actions:

The action is passed to the simulation agent's controller (e.g., pd_joint_pos).
The controller computes the corresponding target joint positions or velocities.
These targets are sent to the real robot's motors.

This ensures that the same action semantics (e.g., delta joint positions) are respected in both domains.
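
The three steps above can be sketched with stand-in classes. DeltaJointPosController and RecordingMotorBus are hypothetical names invented for this illustration; they are not ManiSkill classes, and the real controller and hardware interfaces will differ.

```python
from typing import List, Optional


class DeltaJointPosController:
    """Hypothetical stand-in for a simulation controller with delta joint position semantics."""

    def __init__(self, initial_qpos: List[float], max_delta: float):
        self.targets = list(initial_qpos)
        self.max_delta = max_delta

    def set_action(self, action: List[float]) -> List[float]:
        # Actions are treated as normalized to [-1, 1] and scaled to a bounded joint delta.
        for i, a in enumerate(action):
            clipped = max(-1.0, min(1.0, a))
            self.targets[i] += clipped * self.max_delta
        return list(self.targets)


class RecordingMotorBus:
    """Hypothetical real-robot interface; here it only records the commanded targets."""

    def __init__(self) -> None:
        self.last_command: Optional[List[float]] = None

    def set_target_qpos(self, qpos: List[float]) -> None:
        self.last_command = list(qpos)


controller = DeltaJointPosController(initial_qpos=[0.0, 0.5], max_delta=0.1)
motors = RecordingMotorBus()

# The action is interpreted exactly as in simulation; only the consumer of the targets differs.
targets = controller.set_action([1.0, -2.0])
motors.set_target_qpos(targets)
```

Because the controller, not the policy, owns clipping and scaling, the policy's output never has to be reinterpreted for the real robot.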

Control Frequency Enforcement

The bridge enforces the simulation's control frequency on the real robot by measuring elapsed wall-clock time between control steps and sleeping if the step completes faster than the control period. If a step takes longer than the control period, a warning is logged.
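
A minimal sketch of this timing logic, assuming a hypothetical ControlRateLimiter helper (the name and structure are illustrative and not taken from ManiSkill):

```python
import logging
import time

logger = logging.getLogger("sim2real_sketch")


class ControlRateLimiter:
    """Hypothetical helper that pads each control step to a fixed period."""

    def __init__(self, control_freq: float):
        self.period = 1.0 / control_freq
        self.last_step_time = None

    def wait(self) -> float:
        """Sleep out the rest of the control period; return the time the step itself took."""
        now = time.perf_counter()
        if self.last_step_time is None:
            # First call: nothing to compare against yet.
            self.last_step_time = now
            return 0.0
        elapsed = now - self.last_step_time
        if elapsed < self.period:
            time.sleep(self.period - elapsed)
        else:
            logger.warning(
                "Control step took %.4f s, longer than the %.4f s period",
                elapsed, self.period,
            )
        self.last_step_time = time.perf_counter()
        return elapsed


limiter = ControlRateLimiter(control_freq=30)
start = time.perf_counter()
for _ in range(4):
    # In a real loop, the action is sent and sensors are read before this call.
    limiter.wait()
total = time.perf_counter() - start
```

Sleeping only for the remainder of the period, rather than for a fixed duration, keeps the loop close to the target rate even when the per-step work varies.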

Wrapper Compatibility

Any Gymnasium wrappers applied to the simulation environment (e.g., observation normalization, action scaling) are automatically applied in the real environment too. The bridge achieves this by temporarily swapping the innermost wrapper's environment reference to point to a RealEnvStepReset object that delegates to real hardware.
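
The swap can be illustrated with plain Python objects. All class and function names below are hypothetical and stand in for Gymnasium wrappers and the RealEnvStepReset object; the sketch shows the general pattern only, not ManiSkill's code.

```python
class SimBackend:
    """Hypothetical innermost simulation environment."""

    def step(self, action):
        return {"source": "sim", "value": action}


class RealBackend:
    """Hypothetical stand-in for an object that delegates step() to real hardware."""

    def step(self, action):
        return {"source": "real", "value": action}


class ScaleActionWrapper:
    """Hypothetical wrapper that scales actions before forwarding them."""

    def __init__(self, env, scale):
        self.env = env
        self.scale = scale

    def step(self, action):
        return self.env.step(action * self.scale)


def step_with_backend(wrapped_env, backend, action):
    """Run one step through the full wrapper stack, but against `backend`."""
    # Walk down to the innermost wrapper (the one whose .env is not itself a wrapper).
    innermost = wrapped_env
    while hasattr(innermost.env, "env"):
        innermost = innermost.env
    original = innermost.env
    innermost.env = backend
    try:
        return wrapped_env.step(action)
    finally:
        # Always restore the simulation reference, even if the step raises.
        innermost.env = original


env = ScaleActionWrapper(ScaleActionWrapper(SimBackend(), scale=2.0), scale=5.0)
real_result = step_with_backend(env, RealBackend(), 1.0)
sim_result = env.step(1.0)  # the stack points at the simulation again
```

Both calls pass through the same two scaling wrappers; only the object at the bottom of the stack changes.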

Data Validation

On initialization (unless skip_data_checks=True), the bridge runs both the simulation and real environments once and compares the observations recursively to verify matching shapes and data types.
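
A recursive comparison of this kind might look like the sketch below. The function find_obs_mismatches and the observation keys are assumptions for illustration; the checks ManiSkill actually performs may be structured differently.

```python
import numpy as np


def find_obs_mismatches(sim_obs, real_obs, path="obs"):
    """Recursively collect shape/dtype differences between two nested observations."""
    if isinstance(sim_obs, dict) != isinstance(real_obs, dict):
        return [f"{path}: one side is a dict and the other is not"]
    mismatches = []
    if isinstance(sim_obs, dict):
        for key in sorted(set(sim_obs) | set(real_obs)):
            if key not in sim_obs or key not in real_obs:
                mismatches.append(f"{path}.{key}: present in only one observation")
                continue
            mismatches += find_obs_mismatches(sim_obs[key], real_obs[key], f"{path}.{key}")
        return mismatches
    # Leaf values: compare array metadata only, not the contents.
    sim_arr, real_arr = np.asarray(sim_obs), np.asarray(real_obs)
    if sim_arr.shape != real_arr.shape:
        mismatches.append(f"{path}: shape {sim_arr.shape} vs {real_arr.shape}")
    if sim_arr.dtype != real_arr.dtype:
        mismatches.append(f"{path}: dtype {sim_arr.dtype} vs {real_arr.dtype}")
    return mismatches


sim_obs = {
    "agent": {"qpos": np.zeros(6, dtype=np.float32)},
    "sensor_data": {"cam": {"rgb": np.zeros((128, 128, 3), dtype=np.uint8)}},
}
real_obs = {
    "agent": {"qpos": np.zeros(6, dtype=np.float32)},
    # Deliberate error: the real camera pipeline returns floats instead of uint8.
    "sensor_data": {"cam": {"rgb": np.zeros((128, 128, 3), dtype=np.float32)}},
}
mismatches = find_obs_mismatches(sim_obs, real_obs)
```

Reporting the full key path for every mismatch makes it easier to locate the sensor or preprocessing step responsible.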

Usage

from mani_skill.envs.sim2real_env import Sim2RealEnv

# sim_env, real_agent, and policy are assumed to be created elsewhere.
sim2real_env = Sim2RealEnv(
    sim_env=sim_env,    # The wrapped simulation environment
    agent=real_agent,   # A BaseRealAgent instance controlling real hardware
    control_freq=30,    # Control frequency in Hz
)
obs, info = sim2real_env.reset()
for _ in range(100):
    action = policy(obs)  # Same inference code as in simulation
    obs, reward, terminated, truncated, info = sim2real_env.step(action)
sim2real_env.close()

Theoretical Basis

Sim-to-real transfer: The bridge is the runtime component that enables zero-shot sim-to-real deployment. Because the observation and action spaces are identical in both domains, the trained policy runs on the real robot unchanged and perceives the real world through the same "lens" it was trained with.

Environment abstraction: The Gymnasium interface (reset, step, observation_space, action_space) provides a clean abstraction boundary. The bridge implements this interface for real hardware, making the execution backend (simulation vs. real) an implementation detail invisible to the policy.

Sensor preprocessing: Center-cropping and resizing real camera images to match simulation resolution is a standard practice in visual sim2real, minimizing distribution shift from resolution differences while preserving the spatial layout of the scene.

Controller reuse: By reusing the simulation's PD controller to compute joint targets, the bridge avoids reimplementing control logic for the real robot and ensures that control dynamics (gains, limits, interpolation) are consistent.

Real-time control: Enforcing the control frequency is critical for real robot safety and for matching the temporal dynamics that the policy was trained with. Deviations in control frequency can lead to different effective gains and motion behaviors.
