Implementation:Haosulab ManiSkill Initialize Episode Pattern

Field	Value
Page Type	Implementation (Pattern Doc)
Title	ManiSkill Initialize Episode Pattern
Domain	Simulation, Robotics, Environment_Design, Reinforcement_Learning
Related Principle	Principle:Haosulab_ManiSkill_Episode_Initialization
Source Files	`mani_skill/envs/sapien_env.py` (L1018-1021), `mani_skill/envs/utils/randomization/samplers.py` (L13-96)
Date	2026-02-15
Repository	Haosulab/ManiSkill

Overview

Description

This pattern describes how to implement _initialize_episode() for a custom ManiSkill task and how to use the UniformPlacementSampler for collision-free random object placement. The _initialize_episode() method is a hook called by BaseEnv.reset() to set up per-episode randomized initial conditions. The UniformPlacementSampler is a batched rejection sampler that generates positions within bounds while enforcing minimum distance constraints between objects.

Usage

Override _initialize_episode() in your BaseEnv subclass. Use UniformPlacementSampler when placing multiple objects that must not overlap.

from mani_skill.envs.utils.randomization.samplers import UniformPlacementSampler

Code Reference

_initialize_episode Interface (sapien_env.py L1018-1021)

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    """Initialize the episode, e.g., poses of actors and articulations,
    as well as task relevant data like randomizing goal positions.

    Args:
        env_idx (torch.Tensor): Indices of environments being reset.
            For partial reset, this may be a subset of all environments.
            Use len(env_idx) as the batch size.
        options (dict): Options dict from env.reset(). May contain
            task-specific configuration keys.
    """

UniformPlacementSampler (samplers.py L13-96)

class UniformPlacementSampler:
    """Uniform placement sampler that lets you sequentially sample data
    such that the data is within given bounds and not too close to
    previously sampled data. Batched for GPU-simulated tasks.

    Args:
        bounds: ((low1, low2, ...), (high1, high2, ...))
            The bounding box for sampling.
        batch_size (int): Number of positions to sample per call
            (typically len(env_idx)).
        device: torch device for tensor operations.
    """

    def __init__(
        self,
        bounds: Tuple[list[float], list[float]],
        batch_size: int,
        device=None,
    ) -> None:
        ...

    def sample(
        self,
        radius: float,
        max_trials: int,
        append: bool = True,
        verbose: bool = False,
    ) -> torch.Tensor:
        """Sample a batch of positions with collision avoidance.

        Args:
            radius (float): Collision radius for the new object.
            max_trials (int): Maximum rejection sampling iterations.
            append (bool): Whether to track this sample for future
                collision checks. Default True.
            verbose (bool): Print sampling diagnostics. Default False.

        Returns:
            torch.Tensor: Sampled positions of shape (batch_size, dim).
        """
        ...

I/O Contract

_initialize_episode

Parameter	Type	Description
`env_idx`	`torch.Tensor`	Integer tensor of environment indices being reset. Shape `(B,)` where `B <= num_envs`.
`options`	`dict`	Options dictionary from `env.reset()`. May be empty.

Returns: None. State is modified in-place by calling actor.set_pose(), articulation.set_qpos(), etc.
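
Because options may be empty, task code usually reads task-specific keys with a fallback. The following is a minimal sketch of that pattern. The key name spawn_half_extent and the helper read_spawn_half_extent are hypothetical and are not defined by ManiSkill.

```python
def read_spawn_half_extent(options: dict, default: float = 0.1) -> float:
    """Return the XY half-extent used when spawning an object.

    "spawn_half_extent" is a hypothetical, task-defined key used only for
    illustration; ManiSkill itself does not define it.
    """
    value = float(options.get("spawn_half_extent", default))
    if value <= 0:
        raise ValueError("spawn_half_extent must be positive")
    return value


print(read_spawn_half_extent({}))                           # empty options: default is used
print(read_spawn_half_extent({"spawn_half_extent": 0.05}))  # caller-supplied override
```

Reading options through a small helper like this keeps _initialize_episode() working when env.reset() is called without any options.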

Context: When this method executes:

The scene has been fully built (all actors and articulations exist).
Simulation state (velocities) has been cleared.
The scene's internal reset mask is set so that set_pose() and similar calls only affect environments in env_idx.
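The torch RNG state is forked with torch.random.fork_rng() and seeded from the episode seed, so random sampling inside this method is reproducible and does not disturb the global RNG state.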
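
A practical consequence of the reset mask is that data passed to set_pose() has batch size len(env_idx), not num_envs. The toy below mimics that behavior with plain Python lists. It is only an illustration of the indexing semantics: masked_write is a hypothetical helper, not a ManiSkill function.

```python
def masked_write(state: list, env_idx: list, values: list) -> None:
    """Write values[i] into state[env_idx[i]], leaving all other rows untouched."""
    if len(values) != len(env_idx):
        raise ValueError("expected exactly one value per environment being reset")
    for row, value in zip(env_idx, values):
        state[row] = value


# Four parallel environments, each holding one (x, y, z) position.
positions = [(0.0, 0.0, 0.0)] * 4

# Partial reset: only environments 1 and 3 are reset, so the batch size B is 2.
env_idx = [1, 3]
new_positions = [(0.1, 0.0, 0.02), (-0.1, 0.05, 0.02)]
masked_write(positions, env_idx, new_positions)

print(positions)  # rows 0 and 2 keep their old values
```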

UniformPlacementSampler.sample()

Parameter	Type	Description
`radius`	`float`	Collision radius for the object being placed
`max_trials`	`int`	Maximum number of rejection sampling attempts
`append`	`bool`	If `True`, the sampled position is stored for future collision checks
`verbose`	`bool`	If `True`, prints diagnostic messages

Returns: torch.Tensor of shape (batch_size, dim) containing sampled positions.

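Note: If max_trials is exhausted before every environment has a valid placement, sample() still returns a position for every environment. Environments that never passed the distance check receive a position that may overlap previously sampled objects, so choose bounds, radii, and max_trials such that a valid placement is easy to find.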
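
The rejection-sampling idea behind sample() can be sketched for a single environment using only the standard library. This is not the library's implementation: the real sampler is batched over environments with torch, and the acceptance rule used here (center distance greater than the sum of the two radii) is an assumption about how the minimum-distance constraint is interpreted. The helper name sample_with_clearance is hypothetical.

```python
import math
import random


def sample_with_clearance(bounds, placed, radius, max_trials, rng):
    """Rejection-sample one 2D point inside `bounds`.

    `placed` is a list of ((x, y), radius) pairs for earlier objects. A
    candidate is accepted when its distance to every earlier object exceeds
    the sum of the two radii. Returns (point, ok); if every trial fails, the
    last candidate is returned with ok=False.
    """
    if max_trials < 1:
        raise ValueError("max_trials must be at least 1")
    (low_x, low_y), (high_x, high_y) = bounds
    candidate = None
    for _ in range(max_trials):
        candidate = (rng.uniform(low_x, high_x), rng.uniform(low_y, high_y))
        if all(math.dist(candidate, p) > r + radius for p, r in placed):
            return candidate, True
    return candidate, False


rng = random.Random(0)
bounds = ((-0.2, -0.2), (0.2, 0.2))
placed = []  # plays the role of append=True: accepted samples are remembered
for radius in (0.03, 0.04):
    point, ok = sample_with_clearance(bounds, placed, radius, 100, rng)
    if ok:
        placed.append((point, radius))
    print(point, ok)
```

Returning an explicit ok flag is a choice made for this sketch so that the exhausted-trials case is visible to the caller.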

Usage Examples

Basic Episode Initialization (PushCube Pattern)

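import numpy as np
import torch
from transforms3d.euler import euler2quat

from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):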
    with torch.device(self.device):
        b = len(env_idx)

        # Initialize the pre-built table scene (sets robot poses)
        self.table_scene.initialize(env_idx)

        # Randomize cube position on the table surface
        xyz = torch.zeros((b, 3))
        xyz[..., :2] = torch.rand((b, 2)) * 0.2 - 0.1  # [-0.1, 0.1] range
        xyz[..., 2] = self.cube_half_size               # sitting on table

        obj_pose = Pose.create_from_pq(p=xyz, q=[1, 0, 0, 0])
        self.obj.set_pose(obj_pose)

        # Set goal position relative to the cube
        target_xyz = xyz + torch.tensor([0.2, 0, 0])
        target_xyz[..., 2] = 1e-3
        self.goal_region.set_pose(
            Pose.create_from_pq(p=target_xyz, q=euler2quat(0, np.pi / 2, 0))
        )

Collision-Free Multi-Object Placement

import torch

from mani_skill.envs.utils.randomization.samplers import UniformPlacementSampler
from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    with torch.device(self.device):
        b = len(env_idx)
        self.table_scene.initialize(env_idx)

        # Create a sampler for the table surface region
        sampler = UniformPlacementSampler(
            bounds=[[-0.2, -0.2], [0.2, 0.2]],
            batch_size=b,
            device=self.device,
        )

        # Sample positions for object A (radius=0.03)
        pos_a = sampler.sample(radius=0.03, max_trials=100)
        xyz_a = torch.zeros((b, 3))
        xyz_a[..., :2] = pos_a
        xyz_a[..., 2] = 0.03
        self.obj_a.set_pose(Pose.create_from_pq(p=xyz_a, q=[1, 0, 0, 0]))

        # Sample positions for object B, avoiding object A
        pos_b = sampler.sample(radius=0.04, max_trials=100)
        xyz_b = torch.zeros((b, 3))
        xyz_b[..., :2] = pos_b
        xyz_b[..., 2] = 0.04
        self.obj_b.set_pose(Pose.create_from_pq(p=xyz_b, q=[1, 0, 0, 0]))
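
Since sample() can return placements that violate the clearance when max_trials runs out, it can be useful to verify sampled positions, for example in a unit test for the task. The check below is a minimal standard-library sketch for one environment. The helper find_overlaps is hypothetical, and treating two objects as overlapping when their center distance is below the sum of their radii is an assumption.

```python
import math


def find_overlaps(positions, radii):
    """Return index pairs (i, j) whose centers are closer than radii[i] + radii[j]."""
    overlaps = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) < radii[i] + radii[j]:
                overlaps.append((i, j))
    return overlaps


# XY centers of three objects in one environment, with their collision radii.
positions = [(0.00, 0.00), (0.05, 0.00), (0.15, 0.10)]
radii = [0.03, 0.04, 0.03]
print(find_overlaps(positions, radii))  # prints [(0, 1)]
```

Objects 0 and 1 are 0.05 apart but need 0.07 of clearance, so they are reported; the third object is far enough from both.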

Randomizing Orientations

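import torch

from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    with torch.device(self.device):
        b = len(env_idx)
        self.table_scene.initialize(env_idx)

        # Random XY position in [-0.15, 0.15), fixed height
        xyz = torch.zeros((b, 3))
        xyz[..., :2] = torch.rand((b, 2)) * 0.3 - 0.15
        xyz[..., 2] = 0.05

        # Random Z-axis rotation, built for the whole batch at once.
        # A rotation by theta about Z is the wxyz quaternion
        # (cos(theta/2), 0, 0, sin(theta/2)), which is what
        # euler2quat(0, 0, theta) returns, so no per-environment loop
        # or .item() call is needed.
        random_angles = torch.rand((b,)) * 2 * torch.pi
        quats = torch.zeros((b, 4))
        quats[..., 0] = torch.cos(random_angles / 2)
        quats[..., 3] = torch.sin(random_angles / 2)

        self.obj.set_pose(Pose.create_from_pq(p=xyz, q=quats))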
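
For a rotation purely about the Z axis, the quaternion has a simple closed form, which makes it easy to sanity-check orientation code. The sketch below uses the (w, x, y, z) ordering implied by the identity quaternion [1, 0, 0, 0] in the examples on this page. The helper name yaw_to_quat_wxyz is hypothetical.

```python
import math


def yaw_to_quat_wxyz(yaw: float) -> tuple:
    """Unit quaternion (w, x, y, z) for a rotation of `yaw` radians about Z."""
    half = 0.5 * yaw
    return (math.cos(half), 0.0, 0.0, math.sin(half))


print(yaw_to_quat_wxyz(0.0))      # identity rotation
print(yaw_to_quat_wxyz(math.pi))  # half turn about Z
```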

Related Pages

Principle:Haosulab_ManiSkill_Episode_Initialization -- The principle this implements
Implementation:Haosulab_ManiSkill_ActorBuilder_TableSceneBuilder -- Building the scene before initialization
Implementation:Haosulab_ManiSkill_Evaluate_Dense_Reward -- Reward computation after initialization
Implementation:Haosulab_ManiSkill_Get_Obs_Extra_CameraConfig -- Observations generated after initialization
Heuristic:Haosulab_ManiSkill_Initial_Pose_Performance
