Implementation:Haosulab ManiSkill CPUGymWrapper

Knowledge Sources	Haosulab_ManiSkill
Domains	Robotics, Simulation, Gymnasium Integration
Last Updated	2026-02-15 08:00 GMT

Overview

Concrete tool for converting ManiSkill's batched torch tensor outputs to unbatched numpy arrays conforming to the standard Gymnasium API.

Description

The CPUGymWrapper is a Gymnasium wrapper that ensures ManiSkill environment outputs (from step(), reset(), and render()) fully conform to the standard Gymnasium API by converting batched torch tensors to unbatched numpy arrays.

Key behavior:

step(): Converts action to numpy, runs the underlying step, converts outputs to numpy, and unbatches (removes the leading batch dimension of size 1).
reset(): Runs reset, converts to numpy, and unbatches.
render(): Returns unbatched numpy images for rgb_array/sensors/all render modes.
Sets observation_space and action_space to the single (unbatched) versions.
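
The unbatching step described above can be illustrated with a minimal sketch. It assumes the data has already been converted from torch tensors to numpy arrays, and the helper name unbatch_sketch is hypothetical; it is not ManiSkill's own utility, only an illustration of stripping a leading batch dimension of size 1 from arrays and nested dict observations.

```python
import numpy as np


def unbatch_sketch(data):
    """Recursively drop a leading batch dimension of size 1 (hypothetical helper)."""
    if isinstance(data, dict):
        # Dict observations are handled key by key.
        return {key: unbatch_sketch(value) for key, value in data.items()}
    if isinstance(data, np.ndarray):
        if data.ndim >= 1 and data.shape[0] == 1:
            data = data[0]
        # A 0-d result becomes a plain Python scalar (e.g. float or bool).
        return data.item() if data.ndim == 0 else data
    return data


# Batched outputs as a single-environment setup would produce them.
batched_obs = {"state": np.zeros((1, 4), dtype=np.float32)}
batched_reward = np.array([0.5], dtype=np.float32)

obs = unbatch_sketch(batched_obs)        # obs["state"] now has shape (4,)
reward = unbatch_sketch(batched_reward)  # reward is now a Python float
```

The sketch only removes a batch dimension when its size is exactly 1, which matches the wrapper's single-environment restriction.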

Assertions: The wrapper asserts that the wrapped environment is a single-environment setup (num_envs == 1) running on the CPU backend. It is not intended for GPU-parallelized environments.

Optional metric recording (when record_metrics=True):

Tracks cumulative return, episode length, and average reward per step.
Tracks success_once and fail_once flags (set once and kept True).
When ignore_terminations=True, also records success_at_end and fail_at_end.
Metrics are stored in info["episode"].
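
The bookkeeping behind these metrics can be sketched as follows. This is an illustration, not the wrapper's actual code: the class name EpisodeMetricsSketch, the info keys "success" and "fail", and the output keys "episode_len" and "reward" are assumptions made for the example. Only "return", success_once, and fail_once are named on this page.

```python
class EpisodeMetricsSketch:
    """Hypothetical per-episode metric tracker; not part of ManiSkill."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Call at the start of every episode.
        self.rewards = []
        self.success_once = False
        self.fail_once = False

    def update(self, reward, info):
        self.rewards.append(float(reward))
        # Sticky flags: once True, they stay True until reset().
        # The "success" and "fail" info keys are assumed for this sketch.
        self.success_once = self.success_once or bool(info.get("success", False))
        self.fail_once = self.fail_once or bool(info.get("fail", False))
        episode_return = sum(self.rewards)
        episode_len = len(self.rewards)
        return {
            "return": episode_return,
            "episode_len": episode_len,               # assumed key name
            "reward": episode_return / episode_len,   # average reward per step
            "success_once": self.success_once,
            "fail_once": self.fail_once,
        }


tracker = EpisodeMetricsSketch()
tracker.update(1.0, {"success": False})
tracker.update(2.0, {"success": True})
metrics = tracker.update(3.0, {"success": False})
```

After the third step, success_once remains True even though the last step reported no success, which is the "set once and kept True" behavior described above.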

ignore_terminations: When True, the wrapper suppresses termination signals, allowing the episode to continue until truncation. Useful for evaluating success rates at the end of fixed-length episodes.
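
One way to use this for evaluation is sketched below, assuming the environment is wrapped with ignore_terminations=True and record_metrics=True and that every episode eventually ends by truncation. The function success_rate_at_end and the FixedLengthStubEnv class are hypothetical; the stub merely stands in for a wrapped environment so the example runs without a simulator.

```python
def success_rate_at_end(env, policy, num_episodes):
    """Fraction of episodes whose final step reports success (hypothetical helper)."""
    successes = 0
    for _ in range(num_episodes):
        obs, info = env.reset()
        terminated = truncated = False
        # With terminations suppressed, the loop ends only on truncation.
        while not (terminated or truncated):
            obs, reward, terminated, truncated, info = env.step(policy(obs))
        successes += int(info["episode"]["success_at_end"])
    return successes / num_episodes


class FixedLengthStubEnv:
    """Hypothetical stand-in for a wrapped environment; not part of ManiSkill."""

    def __init__(self, horizon, outcomes):
        self.horizon = horizon
        self.outcomes = list(outcomes)  # success flag reported for each episode
        self.episode = -1
        self.t = 0

    def reset(self):
        self.episode += 1
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon
        info = {"episode": {"success_at_end": self.outcomes[self.episode]}}
        # terminated is always False, mimicking suppressed terminations.
        return 0.0, 0.0, False, truncated, info


env = FixedLengthStubEnv(horizon=5, outcomes=[True, False, True, True])
rate = success_rate_at_end(env, policy=lambda obs: 0, num_episodes=4)
```

Reading success_at_end from the final step's info keeps the measurement tied to the state at the end of the fixed-length episode rather than to any earlier moment of success.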

Usage

Apply CPUGymWrapper as the final (outermost) wrapper, after all other ManiSkill-specific wrappers, when using ManiSkill with standard RL frameworks that expect Gymnasium-compliant numpy outputs.

Code Reference

Source Location

Repository: Haosulab_ManiSkill
File: mani_skill/utils/wrappers/gymnasium.py

Signature

class CPUGymWrapper(gym.Wrapper):
    def __init__(
        self,
        env: gym.Env,
        ignore_terminations: bool = False,
        record_metrics: bool = False,
    ): ...
    def step(self, action) -> tuple: ...
    def reset(self, *, seed=None, options=None) -> tuple: ...
    def render(self) -> np.ndarray: ...

Import

from mani_skill.utils.wrappers.gymnasium import CPUGymWrapper

I/O Contract

Inputs

Name	Type	Required	Description
env	gym.Env	Yes	A ManiSkill environment (num_envs=1, CPU backend)
ignore_terminations	bool	No	Suppress termination signals (default: False)
record_metrics	bool	No	Track episode metrics in info (default: False)

Outputs

Name	Type	Description
obs	np.ndarray or dict	Unbatched numpy observation
reward	float	Scalar reward
terminated	bool	Whether episode terminated
truncated	bool	Whether episode was truncated
info	dict	Info dict (with optional "episode" metrics)

Usage Examples

Basic Usage

import gymnasium as gym
import mani_skill.envs  # registers ManiSkill environments with Gymnasium
from mani_skill.utils.wrappers.gymnasium import CPUGymWrapper

env = gym.make("PickCube-v1", num_envs=1)
env = CPUGymWrapper(env, record_metrics=True)

obs, info = env.reset()
# obs is an unbatched numpy array (not a batched torch tensor)

action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)
# reward is a float, terminated/truncated are bools
# info["episode"]["return"] is the cumulative return
