
Implementation:Haosulab ManiSkill ManiSkillVectorEnv

From Leeroopedia
Field Value
implementation_name Haosulab_ManiSkill_ManiSkillVectorEnv
overview Concrete tool for wrapping ManiSkill environments as vectorized Gymnasium environments with auto-reset, metric tracking, and action space flattening
type Library API
domains Simulation, Reinforcement_Learning
last_updated 2026-02-15
related_pages Principle:Haosulab_ManiSkill_Vectorized_Environment_Wrapping

Overview

Description

ManiSkillVectorEnv is a Gymnasium VectorEnv implementation designed for ManiSkill GPU-parallel environments. It wraps a BaseEnv (or any wrapper around one) and provides automatic episode resets, optional ignoring of termination signals for infinite-horizon formulations, and episode metric recording (return, episode length, success rate).

Additionally, FlattenActionSpaceWrapper is a companion wrapper that converts Dict action spaces into flat Box action spaces, which is required by most RL algorithm implementations that expect continuous vector actions.

Usage

Apply these wrappers after creating the base environment with gym.make() and before passing to the RL algorithm. FlattenActionSpaceWrapper is applied first (if needed), then ManiSkillVectorEnv is applied as the outermost wrapper.

Code Reference

Field Value
Repository https://github.com/haosulab/ManiSkill
File (ManiSkillVectorEnv) mani_skill/vector/wrappers/gymnasium.py (lines 18-200)
File (FlattenActionSpaceWrapper) mani_skill/utils/wrappers/flatten.py (lines 98-136)

ManiSkillVectorEnv Signature:

class ManiSkillVectorEnv(VectorEnv):
    """
    Gymnasium Vector Env implementation for ManiSkill environments running on
    the GPU for parallel simulation and optionally parallel rendering.
    """

    def __init__(
        self,
        env: Union[Env, str],
        num_envs: int = 1,
        auto_reset: bool = True,
        ignore_terminations: bool = False,
        record_metrics: bool = False,
        **kwargs,
    ):
        ...

    def reset(
        self,
        *,
        seed: Optional[Union[int, list[int]]] = None,
        options: Optional[dict] = None,
    ) -> Tuple[Array, dict]:
        ...

    def step(
        self, actions: Union[Array, dict]
    ) -> Tuple[Array, Array, Array, Array, dict]:
        ...

FlattenActionSpaceWrapper Signature:

class FlattenActionSpaceWrapper(gym.ActionWrapper):
    """Flattens the action space. The original action space must be spaces.Dict."""

    def __init__(self, env) -> None:
        super().__init__(env)
        self._orig_single_action_space = copy.deepcopy(self.base_env.single_action_space)
        self.single_action_space = gymnasium.spaces.utils.flatten_space(
            self.base_env.single_action_space
        )
        if self.base_env.num_envs > 1:
            self.action_space = batch_space(self.single_action_space, n=self.base_env.num_envs)
        else:
            self.action_space = self.single_action_space

    def action(self, action):
        # Unflattens the continuous vector back into a Dict for the inner env
        unflattened_action = dict()
        start, end = 0, 0
        for k, space in self._orig_single_action_space.items():
            end += space.shape[0]
            unflattened_action[k] = action[:, start:end]
            start += space.shape[0]
        return unflattened_action
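The slicing in action() can be illustrated standalone. A minimal numpy sketch of the same unflattening logic (the key names "arm" and "gripper" and their dimensions are hypothetical, not taken from ManiSkill):

```python
import numpy as np

# Hypothetical per-key action dimensions, mimicking a Dict action space
# such as {"arm": Box(shape=(7,)), "gripper": Box(shape=(1,))}
dims = {"arm": 7, "gripper": 1}
num_envs = 4
flat_dim = sum(dims.values())  # 8

# A batch of flat actions, as an RL algorithm would produce them
flat_actions = np.arange(num_envs * flat_dim, dtype=np.float32).reshape(num_envs, flat_dim)

# Same start/end slicing as FlattenActionSpaceWrapper.action()
unflattened = {}
start, end = 0, 0
for k, d in dims.items():
    end += d
    unflattened[k] = flat_actions[:, start:end]
    start += d

print(unflattened["arm"].shape)      # (4, 7)
print(unflattened["gripper"].shape)  # (4, 1)
```

Concatenating the per-key slices along the last axis recovers the original flat batch, which is why the wrapper can hand the RL algorithm a single Box space without losing information.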

Imports:

from mani_skill.vector.wrappers.gymnasium import ManiSkillVectorEnv
from mani_skill.utils.wrappers.flatten import FlattenActionSpaceWrapper
from mani_skill.utils.wrappers.record import RecordEpisode

I/O Contract

ManiSkillVectorEnv:

Direction Name Type Description
Input env Union[Env, str] A ManiSkill environment (from gym.make) or a string environment ID
Input num_envs int Number of parallel environments (only used when env is a string)
Input auto_reset bool Whether to auto-reset environments on episode completion (default: True)
Input ignore_terminations bool If True, overrides termination signals to False for infinite-horizon training (default: False)
Input record_metrics bool If True, tracks return, episode length, success_once, fail_once in info dicts (default: False)
Output (from step) Tuple[obs, rew, terminated, truncated, info] Standard Gymnasium VectorEnv step return with auto-reset semantics

Auto-reset info structure (when an episode ends):

Key Type Description
info["final_observation"] Tensor The true last observation of the completed episode (before reset)
info["final_info"] dict The info dict from the final step of the completed episode
info["_final_info"] Tensor[bool] Boolean mask indicating which environments completed an episode
info["_final_observation"] Tensor[bool] Same mask as _final_info
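With auto-reset, the observation returned by step() for a finished environment is already the first observation of the new episode; the true terminal observation lives in the info dict. A minimal numpy sketch of extracting it via the boolean mask (the array contents are fabricated for illustration):

```python
import numpy as np

num_envs, obs_dim = 4, 3
# Observations returned by step() after auto-reset: done envs already show
# their post-reset observation here
obs = np.zeros((num_envs, obs_dim))
# Illustrative info payload mimicking the auto-reset structure
info = {
    "final_observation": np.ones((num_envs, obs_dim)),    # true last obs of finished episodes
    "_final_info": np.array([False, True, False, True]),  # which envs just finished
}

# For bootstrapping value targets, select the terminal observations of done envs
done_mask = info["_final_info"]
terminal_obs = info["final_observation"][done_mask]
print(terminal_obs.shape)  # (2, 3)
```

Value-based algorithms that bootstrap from the next observation must use final_observation for the done environments, otherwise they would bootstrap from the freshly reset state.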

FlattenActionSpaceWrapper:

Direction Name Type Description
Input env Env Environment with a Dict action space
Output wrapped env Env Environment with a flattened Box action space
Transform action (in step) Tensor (num_envs, flat_dim) -> Dict[str, Tensor] Unflattens the flat action vector back into a Dict

Usage Examples

Example 1: Standard PPO training wrapper pipeline

import gymnasium as gym
import mani_skill.envs
from mani_skill.utils.wrappers.flatten import FlattenActionSpaceWrapper
from mani_skill.vector.wrappers.gymnasium import ManiSkillVectorEnv

# Step 1: Create base environment
envs = gym.make(
    "PickCube-v1",
    num_envs=512,
    obs_mode="state",
    control_mode="pd_joint_delta_pos",
    render_mode="rgb_array",
    sim_backend="physx_cuda",
)

# Step 2: Flatten Dict action space if needed
if isinstance(envs.action_space, gym.spaces.Dict):
    envs = FlattenActionSpaceWrapper(envs)

# Step 3: Wrap as vectorized env with auto-reset and metrics
envs = ManiSkillVectorEnv(
    envs,
    num_envs=512,
    ignore_terminations=False,  # allow partial resets on success/failure
    record_metrics=True,
)

# Now envs has a flat Box action space and auto-resets
assert isinstance(envs.single_action_space, gym.spaces.Box)

Example 2: Evaluation environment with recording

from mani_skill.utils.wrappers.record import RecordEpisode

eval_envs = gym.make("PickCube-v1", num_envs=8, obs_mode="state",
                      render_mode="rgb_array", sim_backend="physx_cuda",
                      reconfiguration_freq=1)

if isinstance(eval_envs.action_space, gym.spaces.Dict):
    eval_envs = FlattenActionSpaceWrapper(eval_envs)

# Add video recording
eval_envs = RecordEpisode(
    eval_envs,
    output_dir="runs/eval_videos",
    save_trajectory=False,
    max_steps_per_video=50,
    video_fps=30,
)

# Wrap for vectorized interface
eval_envs = ManiSkillVectorEnv(
    eval_envs,
    num_envs=8,
    ignore_terminations=False,
    record_metrics=True,
)

Example 3: Infinite-horizon training (ignore terminations)

envs = gym.make("PegInsertionSide-v1", num_envs=256, obs_mode="state",
                render_mode="rgb_array", sim_backend="physx_cuda")

envs = ManiSkillVectorEnv(
    envs,
    num_envs=256,
    ignore_terminations=True,   # episodes only end at time limit
    record_metrics=True,        # still track success_at_end metrics
)
# terminated will always be False; only truncated triggers auto-reset
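record_metrics accumulates per-environment returns and lengths and reports them when an episode ends. A minimal numpy sketch of that bookkeeping under the ignore_terminations regime, where only the time limit ends an episode (the reward stream is fabricated for illustration):

```python
import numpy as np

num_envs, max_steps = 3, 5
returns = np.zeros(num_envs)
lengths = np.zeros(num_envs, dtype=int)
episode_stats = []

rng = np.random.default_rng(0)
for t in range(1, max_steps + 1):
    reward = rng.random(num_envs)  # stand-in for env rewards
    returns += reward
    lengths += 1
    # With ignore_terminations=True, terminated is always False;
    # only hitting the time limit truncates the episode
    truncated = np.full(num_envs, t == max_steps)
    if truncated.any():
        episode_stats.append({"return": returns[truncated].copy(),
                              "episode_len": lengths[truncated].copy()})
        # Reset accumulators for the envs that were auto-reset
        returns[truncated] = 0.0
        lengths[truncated] = 0

print(episode_stats[0]["episode_len"])  # [5 5 5]
```

Because every environment truncates at the same step here, all episodes report the full length; with partial resets (ignore_terminations=False) the mask would vary per step and accumulators would reset asynchronously.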

Related Pages

Principle:Haosulab_ManiSkill_Vectorized_Environment_Wrapping