Implementation:Haosulab ManiSkill Initialize Episode Pattern

From Leeroopedia
Field | Value
Page Type | Implementation (Pattern Doc)
Title | ManiSkill Initialize Episode Pattern
Domain | Simulation, Robotics, Environment_Design, Reinforcement_Learning
Related Principle | Principle:Haosulab_ManiSkill_Episode_Initialization
Source Files | mani_skill/envs/sapien_env.py (L1018-1021), mani_skill/envs/utils/randomization/samplers.py (L13-96)
Date | 2026-02-15
Repository | Haosulab/ManiSkill

Overview

Description

This pattern describes how to implement _initialize_episode() for a custom ManiSkill task and how to use the UniformPlacementSampler for collision-free random object placement. The _initialize_episode() method is a hook called by BaseEnv.reset() to set up per-episode randomized initial conditions. The UniformPlacementSampler is a batched rejection sampler that generates positions within bounds while enforcing minimum distance constraints between objects.

Usage

Override _initialize_episode() in your BaseEnv subclass. Use UniformPlacementSampler when placing multiple objects that must not overlap.

from mani_skill.envs.utils.randomization.samplers import UniformPlacementSampler

Code Reference

_initialize_episode Interface (sapien_env.py L1018-1021)

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    """Initialize the episode, e.g., poses of actors and articulations,
    as well as task relevant data like randomizing goal positions.

    Args:
        env_idx (torch.Tensor): Indices of environments being reset.
            For partial reset, this may be a subset of all environments.
            Use len(env_idx) as the batch size.
        options (dict): Options dict from env.reset(). May contain
            task-specific configuration keys.
    """

UniformPlacementSampler (samplers.py L13-96)

class UniformPlacementSampler:
    """Uniform placement sampler that lets you sequentially sample data
    such that the data is within given bounds and not too close to
    previously sampled data. Batched for GPU-simulated tasks.

    Args:
        bounds: ((low1, low2, ...), (high1, high2, ...))
            The bounding box for sampling.
        batch_size (int): Number of positions to sample per call
            (typically len(env_idx)).
        device: torch device for tensor operations.
    """

    def __init__(
        self,
        bounds: Tuple[list[float], list[float]],
        batch_size: int,
        device=None,
    ) -> None:
        ...

    def sample(
        self,
        radius: float,
        max_trials: int,
        append: bool = True,
        verbose: bool = False,
    ) -> torch.Tensor:
        """Sample a batch of positions with collision avoidance.

        Args:
            radius (float): Collision radius for the new object.
            max_trials (int): Maximum rejection sampling iterations.
            append (bool): Whether to track this sample for future
                collision checks. Default True.
            verbose (bool): Print sampling diagnostics. Default False.

        Returns:
            torch.Tensor: Sampled positions of shape (batch_size, dim).
        """
        ...

I/O Contract

_initialize_episode

Parameter | Type | Description
env_idx | torch.Tensor | Integer tensor of environment indices being reset. Shape (B,) where B <= num_envs.
options | dict | Options dictionary from env.reset(). May be empty.

Returns: None. State is modified in-place by calling actor.set_pose(), articulation.set_qpos(), etc.

Context: When this method executes:

  • The scene has been fully built (all actors and articulations exist).
  • Simulation state (velocities) has been cleared.
  • The scene's internal reset mask is set so that set_pose() and similar calls only affect environments in env_idx.
  • The torch RNG is forked with torch.random.fork_rng() and seeded for reproducibility, so random draws made inside the hook do not alter the global RNG state.
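The reset-mask behaviour described above is why tensors built inside the hook use len(env_idx) as their batch size rather than num_envs. The following is a conceptual sketch only, in plain Python lists instead of tensors; write_rows is a hypothetical helper, not part of ManiSkill, and the real scene applies its mask internally.

```python
def write_rows(state, env_idx, values):
    """Write values[i] into state[env_idx[i]], leaving all other rows untouched.

    Hypothetical stand-in for what a masked set_pose() call does conceptually.
    """
    if len(values) != len(env_idx):
        raise ValueError("expected one value per environment being reset")
    for row, value in zip(env_idx, values):
        state[row] = value
    return state


# Four parallel environments, but only environments 1 and 3 are being reset.
num_envs = 4
cube_x = [0.0] * num_envs
env_idx = [1, 3]

# The new values have batch size len(env_idx) == 2, not num_envs == 4.
write_rows(cube_x, env_idx, [0.05, -0.02])
```

Passing a batch sized for all environments would trip the length check here, which mirrors the shape mismatch such a mistake would cause during a partial reset.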

UniformPlacementSampler.sample()

Parameter | Type | Description
radius | float | Collision radius for the object being placed
max_trials | int | Maximum number of rejection sampling attempts
append | bool | If True, the sampled position is stored for future collision checks
verbose | bool | If True, prints diagnostic messages

Returns: torch.Tensor of shape (batch_size, dim) containing sampled positions.

Note: If max_trials is exhausted without finding valid placements for all environments, the sampler returns the best available positions (some environments may have invalid placements).
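To make the rejection-sampling idea and the max_trials caveat concrete, here is a minimal single-environment sketch in plain Python. It is not the library's implementation: sample_with_min_distance and its return convention are assumptions made for illustration, and the real sampler is batched over environments in torch.

```python
import math
import random


def sample_with_min_distance(bounds, radius, placed, max_trials, rng):
    """Rejection-sample one 2D point inside bounds that keeps clear of placed.

    placed is a list of ((x, y), radius) tuples for earlier objects.
    Returns (point, ok). ok is False when every trial collided, in which
    case the last candidate is returned anyway and may overlap.
    """
    if max_trials < 1:
        raise ValueError("max_trials must be at least 1")
    (low_x, low_y), (high_x, high_y) = bounds
    candidate = None
    for _ in range(max_trials):
        candidate = (rng.uniform(low_x, high_x), rng.uniform(low_y, high_y))
        # Accept only if the candidate clears every earlier object.
        if all(math.dist(candidate, p) > radius + r for p, r in placed):
            return candidate, True
    return candidate, False


# Place two objects one after another, as in sequential sampler.sample() calls.
rng = random.Random(0)
bounds = ((-0.2, -0.2), (0.2, 0.2))
placed = []
for radius in (0.03, 0.04):
    point, ok = sample_with_min_distance(bounds, radius, placed, 100, rng)
    if ok:
        placed.append((point, radius))
```

The explicit ok flag is a design choice of this sketch. With the real sampler, a task that cannot tolerate overlaps may want to check pairwise distances itself after sampling, since exhausting max_trials can leave invalid placements.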

Usage Examples

Basic Episode Initialization (PushCube Pattern)

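import numpy as np
import torch
from transforms3d.euler import euler2quat

from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):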
    with torch.device(self.device):
        b = len(env_idx)

        # Initialize the pre-built table scene (sets robot poses)
        self.table_scene.initialize(env_idx)

        # Randomize cube position on the table surface
        xyz = torch.zeros((b, 3))
        xyz[..., :2] = torch.rand((b, 2)) * 0.2 - 0.1  # [-0.1, 0.1] range
        xyz[..., 2] = self.cube_half_size               # sitting on table

        obj_pose = Pose.create_from_pq(p=xyz, q=[1, 0, 0, 0])
        self.obj.set_pose(obj_pose)

        # Set goal position relative to the cube
        target_xyz = xyz + torch.tensor([0.2, 0, 0])
        target_xyz[..., 2] = 1e-3
        self.goal_region.set_pose(
            Pose.create_from_pq(p=target_xyz, q=euler2quat(0, np.pi / 2, 0))
        )

Collision-Free Multi-Object Placement

import torch

from mani_skill.envs.utils.randomization.samplers import UniformPlacementSampler
from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    with torch.device(self.device):
        b = len(env_idx)
        self.table_scene.initialize(env_idx)

        # Create a sampler for the table surface region
        sampler = UniformPlacementSampler(
            bounds=[[-0.2, -0.2], [0.2, 0.2]],
            batch_size=b,
            device=self.device,
        )

        # Sample positions for object A (radius=0.03)
        pos_a = sampler.sample(radius=0.03, max_trials=100)
        xyz_a = torch.zeros((b, 3))
        xyz_a[..., :2] = pos_a
        xyz_a[..., 2] = 0.03
        self.obj_a.set_pose(Pose.create_from_pq(p=xyz_a, q=[1, 0, 0, 0]))

        # Sample positions for object B, avoiding object A
        pos_b = sampler.sample(radius=0.04, max_trials=100)
        xyz_b = torch.zeros((b, 3))
        xyz_b[..., :2] = pos_b
        xyz_b[..., 2] = 0.04
        self.obj_b.set_pose(Pose.create_from_pq(p=xyz_b, q=[1, 0, 0, 0]))

Randomizing Orientations

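import torch

from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    with torch.device(self.device):
        b = len(env_idx)
        self.table_scene.initialize(env_idx)

        # Random XY position in [-0.15, 0.15] at a fixed height
        xyz = torch.zeros((b, 3))
        xyz[..., :2] = torch.rand((b, 2)) * 0.3 - 0.15
        xyz[..., 2] = 0.05

        # Random Z-axis rotation, built for the whole batch at once.
        # A yaw of theta is the (w, x, y, z) quaternion
        # (cos(theta / 2), 0, 0, sin(theta / 2)), the same value that
        # euler2quat(0, 0, theta) returns, so no per-environment loop is needed.
        random_angles = torch.rand((b,)) * 2 * torch.pi
        quats = torch.zeros((b, 4))
        quats[..., 0] = torch.cos(random_angles / 2)
        quats[..., 3] = torch.sin(random_angles / 2)

        self.obj.set_pose(Pose.create_from_pq(p=xyz, q=quats))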
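A rotation about the Z axis alone has a simple closed-form quaternion in the (w, x, y, z) convention used by the examples here, where [1, 0, 0, 0] is the identity. The sketch below checks that form with the standard library only; yaw_to_quat_wxyz and rotate_xy are hypothetical helper names, not ManiSkill or transforms3d functions.

```python
import math


def yaw_to_quat_wxyz(theta):
    """Unit quaternion (w, x, y, z) for a rotation of theta radians about Z."""
    half = 0.5 * theta
    return (math.cos(half), 0.0, 0.0, math.sin(half))


def rotate_xy(quat, x, y):
    """Rotate the planar vector (x, y) by a yaw-only unit quaternion."""
    w, _, _, z = quat
    # For a yaw-only quaternion: cos(theta) = w^2 - z^2, sin(theta) = 2 * w * z.
    c = w * w - z * z
    s = 2.0 * w * z
    return (c * x - s * y, s * x + c * y)


# A quarter turn should carry the +X axis onto the +Y axis.
quarter_turn = yaw_to_quat_wxyz(math.pi / 2)
rotated = rotate_xy(quarter_turn, 1.0, 0.0)
```

Because the formula uses only cos and sin of half the angle, it applies elementwise to a whole batch of angles, which is what makes a loop-free tensor version possible.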
