
Implementation:Haosulab ManiSkill ManiSkillVectorEnv

From Leeroopedia
Field Value
implementation_name Haosulab_ManiSkill_ManiSkillVectorEnv
overview Concrete tool for wrapping ManiSkill environments as vectorized Gymnasium environments with auto-reset, metric tracking, and action space flattening
type Library API
domains Simulation, Reinforcement_Learning
last_updated 2026-02-15
related_pages Principle:Haosulab_ManiSkill_Vectorized_Environment_Wrapping

Overview

Description

ManiSkillVectorEnv is a Gymnasium VectorEnv implementation designed for ManiSkill GPU-parallel environments. It wraps a BaseEnv (or any wrapper around one) and provides automatic episode resets, optional ignoring of termination signals for infinite-horizon formulations, and episode metric recording (return, episode length, success rate).

Additionally, FlattenActionSpaceWrapper is a companion wrapper that converts Dict action spaces into flat Box action spaces, which is required by most RL algorithm implementations that expect continuous vector actions.

Usage

Apply these wrappers after creating the base environment with gym.make() and before passing to the RL algorithm. FlattenActionSpaceWrapper is applied first (if needed), then ManiSkillVectorEnv is applied as the outermost wrapper.

Code Reference

Field Value
Repository https://github.com/haosulab/ManiSkill
File (ManiSkillVectorEnv) mani_skill/vector/wrappers/gymnasium.py (lines 18-200)
File (FlattenActionSpaceWrapper) mani_skill/utils/wrappers/flatten.py (lines 98-136)

ManiSkillVectorEnv Signature:

class ManiSkillVectorEnv(VectorEnv):
    """
    Gymnasium Vector Env implementation for ManiSkill environments running on
    the GPU for parallel simulation and optionally parallel rendering.
    """

    def __init__(
        self,
        env: Union[Env, str],
        num_envs: int = 1,
        auto_reset: bool = True,
        ignore_terminations: bool = False,
        record_metrics: bool = False,
        **kwargs,
    ):
        ...

    def reset(
        self,
        *,
        seed: Optional[Union[int, list[int]]] = None,
        options: Optional[dict] = None,
    ) -> Tuple[Array, dict]:
        ...

    def step(
        self, actions: Union[Array, dict]
    ) -> Tuple[Array, Array, Array, Array, dict]:
        ...

FlattenActionSpaceWrapper Signature:

class FlattenActionSpaceWrapper(gym.ActionWrapper):
    """Flattens the action space. The original action space must be spaces.Dict."""

    def __init__(self, env) -> None:
        super().__init__(env)
        self._orig_single_action_space = copy.deepcopy(self.base_env.single_action_space)
        self.single_action_space = gymnasium.spaces.utils.flatten_space(
            self.base_env.single_action_space
        )
        if self.base_env.num_envs > 1:
            self.action_space = batch_space(self.single_action_space, n=self.base_env.num_envs)
        else:
            self.action_space = self.single_action_space

    def action(self, action):
        # Unflattens the continuous vector back into a Dict for the inner env
        unflattened_action = dict()
        start, end = 0, 0
        for k, space in self._orig_single_action_space.items():
            end += space.shape[0]
            unflattened_action[k] = action[:, start:end]
            start += space.shape[0]
        return unflattened_action
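The slicing in action() can be illustrated standalone. A minimal numpy sketch of the same unflattening logic (the key names "arm" and "gripper" and their dimensions are hypothetical, not taken from ManiSkill):

```python
import numpy as np

# Hypothetical per-key action dimensions, mimicking a Dict action space
# such as {"arm": Box(shape=(7,)), "gripper": Box(shape=(1,))}
dims = {"arm": 7, "gripper": 1}
num_envs = 4
flat_dim = sum(dims.values())  # 8

# A batch of flat actions, as an RL algorithm would produce them
flat_actions = np.arange(num_envs * flat_dim, dtype=np.float32).reshape(num_envs, flat_dim)

# Same start/end slicing as FlattenActionSpaceWrapper.action()
unflattened = {}
start, end = 0, 0
for k, d in dims.items():
    end += d
    unflattened[k] = flat_actions[:, start:end]
    start += d

print(unflattened["arm"].shape)      # (4, 7)
print(unflattened["gripper"].shape)  # (4, 1)
```

Concatenating the per-key slices along the last axis recovers the original flat batch, which is why the wrapper can hand the RL algorithm a single Box space without losing information.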

Imports:

from mani_skill.vector.wrappers.gymnasium import ManiSkillVectorEnv
from mani_skill.utils.wrappers.flatten import FlattenActionSpaceWrapper
from mani_skill.utils.wrappers.record import RecordEpisode

I/O Contract

ManiSkillVectorEnv:

Direction Name Type Description
Input env Union[Env, str] A ManiSkill environment (from gym.make) or a string environment ID
Input num_envs int Number of parallel environments (only used when env is a string)
Input auto_reset bool Whether to auto-reset environments on episode completion (default: True)
Input ignore_terminations bool If True, overrides termination signals to False for infinite-horizon training (default: False)
Input record_metrics bool If True, tracks return, episode length, success_once, fail_once in info dicts (default: False)
Output (from step) Tuple[obs, rew, terminated, truncated, info] Standard Gymnasium VectorEnv step return with auto-reset semantics

Auto-reset info structure (when an episode ends):

Key Type Description
info["final_observation"] Tensor The true last observation of the completed episode (before reset)
info["final_info"] dict The info dict from the final step of the completed episode
info["_final_info"] Tensor[bool] Boolean mask indicating which environments completed an episode
info["_final_observation"] Tensor[bool] Same mask as _final_info
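With auto-reset, the observation returned by step() for a finished environment is already the first observation of the new episode; the true terminal observation lives in the info dict. A minimal numpy sketch of extracting it via the boolean mask (the array contents are fabricated for illustration):

```python
import numpy as np

num_envs, obs_dim = 4, 3
# Observations returned by step() after auto-reset: done envs already show
# their post-reset observation here
obs = np.zeros((num_envs, obs_dim))
# Illustrative info payload mimicking the auto-reset structure
info = {
    "final_observation": np.ones((num_envs, obs_dim)),    # true last obs of finished episodes
    "_final_info": np.array([False, True, False, True]),  # which envs just finished
}

# For bootstrapping value targets, select the terminal observations of done envs
done_mask = info["_final_info"]
terminal_obs = info["final_observation"][done_mask]
print(terminal_obs.shape)  # (2, 3)
```

Value-based algorithms that bootstrap from the next observation must use final_observation for the done environments, otherwise they would bootstrap from the freshly reset state.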

FlattenActionSpaceWrapper:

Direction Name Type Description
Input env Env Environment with a Dict action space
Output wrapped env Env Environment with a flattened Box action space
Transform action (in step) Tensor (num_envs, flat_dim) -> Dict[str, Tensor] Unflattens the flat action vector back into a Dict

Usage Examples

Example 1: Standard PPO training wrapper pipeline

import gymnasium as gym
import mani_skill.envs
from mani_skill.utils.wrappers.flatten import FlattenActionSpaceWrapper
from mani_skill.vector.wrappers.gymnasium import ManiSkillVectorEnv

# Step 1: Create base environment
envs = gym.make(
    "PickCube-v1",
    num_envs=512,
    obs_mode="state",
    control_mode="pd_joint_delta_pos",
    render_mode="rgb_array",
    sim_backend="physx_cuda",
)

# Step 2: Flatten Dict action space if needed
if isinstance(envs.action_space, gym.spaces.Dict):
    envs = FlattenActionSpaceWrapper(envs)

# Step 3: Wrap as vectorized env with auto-reset and metrics
envs = ManiSkillVectorEnv(
    envs,
    num_envs=512,
    ignore_terminations=False,  # allow partial resets on success/failure
    record_metrics=True,
)

# Now envs has a flat Box action space and auto-resets
assert isinstance(envs.single_action_space, gym.spaces.Box)

Example 2: Evaluation environment with recording

from mani_skill.utils.wrappers.record import RecordEpisode

eval_envs = gym.make("PickCube-v1", num_envs=8, obs_mode="state",
                      render_mode="rgb_array", sim_backend="physx_cuda",
                      reconfiguration_freq=1)

if isinstance(eval_envs.action_space, gym.spaces.Dict):
    eval_envs = FlattenActionSpaceWrapper(eval_envs)

# Add video recording
eval_envs = RecordEpisode(
    eval_envs,
    output_dir="runs/eval_videos",
    save_trajectory=False,
    max_steps_per_video=50,
    video_fps=30,
)

# Wrap for vectorized interface
eval_envs = ManiSkillVectorEnv(
    eval_envs,
    num_envs=8,
    ignore_terminations=False,
    record_metrics=True,
)

Example 3: Infinite-horizon training (ignore terminations)

envs = gym.make("PegInsertionSide-v1", num_envs=256, obs_mode="state",
                render_mode="rgb_array", sim_backend="physx_cuda")

envs = ManiSkillVectorEnv(
    envs,
    num_envs=256,
    ignore_terminations=True,   # episodes only end at time limit
    record_metrics=True,        # still track success_at_end metrics
)
# terminated will always be False; only truncated triggers auto-reset
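record_metrics accumulates per-environment returns and lengths and reports them when an episode ends. A minimal numpy sketch of that bookkeeping under the ignore_terminations regime, where only the time limit ends an episode (the reward stream is fabricated for illustration):

```python
import numpy as np

num_envs, max_steps = 3, 5
returns = np.zeros(num_envs)
lengths = np.zeros(num_envs, dtype=int)
episode_stats = []

rng = np.random.default_rng(0)
for t in range(1, max_steps + 1):
    reward = rng.random(num_envs)  # stand-in for env rewards
    returns += reward
    lengths += 1
    # With ignore_terminations=True, terminated is always False;
    # only hitting the time limit truncates the episode
    truncated = np.full(num_envs, t == max_steps)
    if truncated.any():
        episode_stats.append({"return": returns[truncated].copy(),
                              "episode_len": lengths[truncated].copy()})
        # Reset accumulators for the envs that were auto-reset
        returns[truncated] = 0.0
        lengths[truncated] = 0

print(episode_stats[0]["episode_len"])  # [5 5 5]
```

Because every environment truncates at the same step here, all episodes report the full length; with partial resets (ignore_terminations=False) the mask would vary per step and accumulators would reset asynchronously.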

Related Pages

Principle:Haosulab_ManiSkill_Vectorized_Environment_Wrapping