Implementation:Haosulab ManiSkill CPUGymWrapper

From Leeroopedia
Knowledge Sources
Domains: Robotics, Simulation, Gymnasium Integration
Last Updated: 2026-02-15 08:00 GMT

Overview

Concrete tool for converting ManiSkill's batched torch tensor outputs to unbatched numpy arrays conforming to the standard Gymnasium API.

Description

The CPUGymWrapper is a Gymnasium wrapper that ensures ManiSkill environment outputs (from step(), reset(), and render()) fully conform to the standard Gymnasium API by converting batched torch tensors to unbatched numpy arrays.

Key behavior:

  • step(): Converts action to numpy, runs the underlying step, converts outputs to numpy, and unbatches (removes the leading batch dimension of size 1).
  • reset(): Runs reset, converts to numpy, and unbatches.
  • render(): Returns unbatched numpy images for rgb_array/sensors/all render modes.
  • Sets observation_space and action_space to the single (unbatched) versions.
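
The unbatching step described above can be sketched with plain numpy. This is a minimal illustration, not the wrapper's actual code: the helper name `unbatch` and the example observation keys are assumptions, and the torch-to-numpy conversion that the real wrapper performs first is omitted so the example runs with numpy alone.

```python
import numpy as np


def unbatch(data):
    """Illustrative helper: recursively strip a leading batch dimension of size 1."""
    if isinstance(data, dict):
        return {key: unbatch(value) for key, value in data.items()}
    array = np.asarray(data)
    if array.ndim > 0 and array.shape[0] == 1:
        array = array[0]
    # Zero-dimensional results become plain Python scalars (float, bool, ...).
    return array.item() if array.ndim == 0 else array


# Stand-ins for batched outputs of a single-environment step (batch size 1).
batched_obs = {"state": np.zeros((1, 4)), "extra": {"tcp_pose": np.ones((1, 7))}}
obs = unbatch(batched_obs)                  # arrays of shape (4,) and (7,)
reward = unbatch(np.array([0.5]))           # Python float
terminated = unbatch(np.array([False]))     # Python bool
```

Returning Python scalars for reward and the done flags matches the output types listed in the I/O contract (float and bool), which is what most Gymnasium-based training loops expect.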

Assertions: The wrapper only works with a single environment (num_envs == 1) on the CPU backend, and it asserts both conditions.

Optional metric recording (when record_metrics=True):

  • Tracks cumulative return, episode length, average reward per step.
  • Tracks success_once and fail_once flags (set once and kept True).
  • When ignore_terminations=True, also records success_at_end and fail_at_end.
  • Metrics are stored in info["episode"].
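
The metric bookkeeping listed above can be sketched in pure Python. The class `EpisodeMetrics` is hypothetical, and so are the dictionary keys "episode_len" and "reward" and the input keys "success" and "fail"; only "return", "success_once", and "fail_once" are named on this page. Treat it as a sketch of the idea rather than the wrapper's implementation.

```python
class EpisodeMetrics:
    """Hypothetical tracker for the per-episode metrics listed above."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Call at the start of every episode.
        self.rewards = []
        self.success_once = False
        self.fail_once = False

    def update(self, reward, info):
        self.rewards.append(float(reward))
        # "Once" flags latch: after turning True they stay True.
        self.success_once = self.success_once or bool(info.get("success", False))
        self.fail_once = self.fail_once or bool(info.get("fail", False))
        episode_return = sum(self.rewards)
        episode_len = len(self.rewards)
        return {
            "return": episode_return,                 # cumulative return
            "episode_len": episode_len,               # assumed key name
            "reward": episode_return / episode_len,   # assumed key name
            "success_once": self.success_once,
            "fail_once": self.fail_once,
        }


metrics = EpisodeMetrics()
steps = [(0.0, {"success": False}), (1.0, {"success": True}), (0.5, {"success": False})]
for reward, info in steps:
    info["episode"] = metrics.update(reward, info)
```

After the loop, `info["episode"]` holds the metrics for the whole three-step episode, and `success_once` remains True even though the last step was not a success.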

ignore_terminations: When True, the wrapper suppresses termination signals, so the episode continues until it is truncated. This is useful for evaluating success rates at the end of fixed-length episodes.
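
A minimal sketch of this behavior, assuming a hypothetical helper `filter_step` and hypothetical "success" and "fail" keys in the step info. It shows how hiding `terminated` lets a scripted episode run to truncation, and how the end-of-episode flags can differ from what happened mid-episode.

```python
def filter_step(terminated, truncated, info, ignore_terminations):
    """Hypothetical helper: optionally hide `terminated` from the caller."""
    if not ignore_terminations:
        return terminated, truncated, {}
    extra = {}
    if truncated:
        # The episode only ends here, so record the final success/fail state.
        extra["success_at_end"] = bool(info.get("success", False))
        extra["fail_at_end"] = bool(info.get("fail", False))
    return False, truncated, extra


# Scripted episode: success occurs at step 2 but is lost by the final step.
script = [
    (False, False, {"success": False}),
    (True, False, {"success": True}),
    (False, True, {"success": False}),
]

steps_taken = 0
end_metrics = {}
for terminated, truncated, info in script:
    terminated, truncated, end_metrics = filter_step(
        terminated, truncated, info, ignore_terminations=True
    )
    steps_taken += 1
    if terminated or truncated:
        break
```

With `ignore_terminations=True` the loop runs all three steps and reports `success_at_end` as False. With the flag off, the same loop would stop at step 2, when the task first succeeds.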

Usage

Apply CPUGymWrapper as the final wrapper when using ManiSkill with standard RL frameworks that expect Gymnasium-compliant numpy outputs. It should be applied after all other ManiSkill-specific wrappers, so that those wrappers still receive the batched torch tensors they operate on.

Code Reference

Source Location

Signature

class CPUGymWrapper(gym.Wrapper):
    def __init__(
        self,
        env: gym.Env,
        ignore_terminations: bool = False,
        record_metrics: bool = False,
    ): ...
    def step(self, action) -> tuple: ...
    def reset(self, *, seed=None, options=None) -> tuple: ...
    def render(self) -> np.ndarray: ...

Import

from mani_skill.utils.wrappers.gymnasium import CPUGymWrapper

I/O Contract

Inputs

Name | Type | Required | Description
env | gym.Env | Yes | A ManiSkill environment (num_envs=1, CPU backend)
ignore_terminations | bool | No | Suppress termination signals (default: False)
record_metrics | bool | No | Track episode metrics in info (default: False)

Outputs

Name | Type | Description
obs | np.ndarray or dict | Unbatched numpy observation
reward | float | Scalar reward
terminated | bool | Whether the episode terminated
truncated | bool | Whether the episode was truncated
info | dict | Info dict (with optional "episode" metrics)

Usage Examples

Basic Usage

import gymnasium as gym
import mani_skill.envs  # registers the ManiSkill environments with Gymnasium
from mani_skill.utils.wrappers.gymnasium import CPUGymWrapper

# The wrapper requires a single environment on the CPU backend.
env = gym.make("PickCube-v1", num_envs=1)
env = CPUGymWrapper(env, record_metrics=True)

obs, info = env.reset()
# obs is a numpy array (not batched, not torch)

action = env.action_space.sample()
obs, reward, terminated, truncated, info = env.step(action)
# reward is a float, terminated/truncated are bools
# info["episode"]["return"] is the cumulative return

env.close()
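
Evaluation Loop (sketch)

The following sketch shows one way to combine ignore_terminations and record_metrics for fixed-length evaluation. The functions `evaluate` and `success_rate` are hypothetical helpers, not part of ManiSkill, and the "success_at_end" key in info["episode"] is assumed to match the flag name used in the description. The code relies only on the Gymnasium reset/step interface, so it works with any environment wrapped this way.

```python
def evaluate(env, num_episodes):
    """Run full episodes and collect the final info["episode"] dict of each."""
    results = []
    for _ in range(num_episodes):
        obs, info = env.reset()
        done = False
        while not done:
            action = env.action_space.sample()  # stand-in for a trained policy
            obs, reward, terminated, truncated, info = env.step(action)
            done = terminated or truncated
        results.append(dict(info.get("episode", {})))
    return results


def success_rate(results, key="success_at_end"):
    """Fraction of episodes whose final metrics have `key` set to True."""
    if not results:
        return 0.0
    return sum(bool(r.get(key, False)) for r in results) / len(results)


# Intended use with ManiSkill (not executed here):
#   env = CPUGymWrapper(gym.make("PickCube-v1", num_envs=1),
#                       ignore_terminations=True, record_metrics=True)
#   print(success_rate(evaluate(env, num_episodes=10)))
```

Because terminations are suppressed, every episode runs to truncation, so the rate reflects the state at the end of a fixed-length episode rather than the first moment of success.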
