Implementation:Haosulab ManiSkill Initialize Episode Pattern
| Field | Value |
|---|---|
| Page Type | Implementation (Pattern Doc) |
| Title | ManiSkill Initialize Episode Pattern |
| Domain | Simulation, Robotics, Environment_Design, Reinforcement_Learning |
| Related Principle | Principle:Haosulab_ManiSkill_Episode_Initialization |
| Source Files | mani_skill/envs/sapien_env.py (L1018-1021), mani_skill/envs/utils/randomization/samplers.py (L13-96) |
| Date | 2026-02-15 |
| Repository | Haosulab/ManiSkill |
Overview
Description
This pattern describes how to implement _initialize_episode() for a custom ManiSkill task and how to use the UniformPlacementSampler for collision-free random object placement. The _initialize_episode() method is a hook called by BaseEnv.reset() to set up per-episode randomized initial conditions. The UniformPlacementSampler is a batched rejection sampler that generates positions within bounds while enforcing minimum distance constraints between objects.
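The rejection-sampling idea described above can be sketched without any dependencies. The class below is a hypothetical, non-batched stand-in written for illustration only; `SimplePlacementSampler` and its attributes are not ManiSkill names, and the acceptance rule (distance at least the sum of the two radii) is an assumption about how "not too close" is measured. It assumes `max_trials >= 1`.

```python
import math
import random


class SimplePlacementSampler:
    """Illustrative single-environment placement sampler (not ManiSkill API)."""

    def __init__(self, bounds, seed=None):
        self.low, self.high = bounds
        self.placed = []  # (position, radius) pairs accepted so far
        self.rng = random.Random(seed)

    def sample(self, radius, max_trials, append=True):
        candidate = None
        for _ in range(max_trials):
            # Draw a uniform candidate inside the bounding box.
            candidate = tuple(
                self.rng.uniform(lo, hi) for lo, hi in zip(self.low, self.high)
            )
            # Accept only if the candidate clears every stored placement.
            if all(
                math.dist(candidate, pos) >= radius + other_radius
                for pos, other_radius in self.placed
            ):
                break
        # If every trial was rejected, the last candidate is returned as-is.
        if append:
            self.placed.append((candidate, radius))
        return candidate


sampler = SimplePlacementSampler(bounds=([-0.2, -0.2], [0.2, 0.2]), seed=0)
pos_a = sampler.sample(radius=0.03, max_trials=100)
pos_b = sampler.sample(radius=0.04, max_trials=100)  # kept clear of pos_a
```

The batched sampler applies the same accept/reject loop to every environment in parallel, which is why it takes a `batch_size` and returns one position per environment.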
Usage
Override _initialize_episode() in your BaseEnv subclass. Use UniformPlacementSampler when placing multiple objects that must not overlap.
from mani_skill.envs.utils.randomization.samplers import UniformPlacementSampler
Code Reference
_initialize_episode Interface (sapien_env.py L1018-1021)
def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    """Initialize the episode, e.g., poses of actors and articulations,
    as well as task relevant data like randomizing goal positions.

    Args:
        env_idx (torch.Tensor): Indices of environments being reset.
            For partial reset, this may be a subset of all environments.
            Use len(env_idx) as the batch size.
        options (dict): Options dict from env.reset(). May contain
            task-specific configuration keys.
    """
UniformPlacementSampler (samplers.py L13-96)
class UniformPlacementSampler:
    """Uniform placement sampler that lets you sequentially sample data
    such that the data is within given bounds and not too close to
    previously sampled data. Batched for GPU-simulated tasks.

    Args:
        bounds: ((low1, low2, ...), (high1, high2, ...))
            The bounding box for sampling.
        batch_size (int): Number of positions to sample per call
            (typically len(env_idx)).
        device: torch device for tensor operations.
    """

    def __init__(
        self,
        bounds: Tuple[list[float], list[float]],
        batch_size: int,
        device=None,
    ) -> None:
        ...

    def sample(
        self,
        radius: float,
        max_trials: int,
        append: bool = True,
        verbose: bool = False,
    ) -> torch.Tensor:
        """Sample a batch of positions with collision avoidance.

        Args:
            radius (float): Collision radius for the new object.
            max_trials (int): Maximum rejection sampling iterations.
            append (bool): Whether to track this sample for future
                collision checks. Default True.
            verbose (bool): Print sampling diagnostics. Default False.

        Returns:
            torch.Tensor: Sampled positions of shape (batch_size, dim).
        """
        ...
I/O Contract
_initialize_episode
| Parameter | Type | Description |
|---|---|---|
| env_idx | torch.Tensor | Integer tensor of environment indices being reset. Shape (B,) where B <= num_envs. |
| options | dict | Options dictionary from env.reset(). May be empty. |
Returns: None. State is modified in-place by calling actor.set_pose(), articulation.set_qpos(), etc.
Context: When this method executes:
- The scene has been fully built (all actors and articulations exist).
- Simulation state (velocities) has been cleared.
- The scene's internal reset mask is set so that set_pose() and similar calls only affect environments in env_idx.
- The torch RNG is forked via torch.random.fork_rng() and seeded for reproducibility.
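The effect of the reset mask on a partial reset can be modelled in plain Python. This is a sketch of the semantics only: `apply_partial_reset` is a hypothetical helper, not a ManiSkill function, and nested lists stand in for the batched pose tensors.

```python
def apply_partial_reset(all_positions, env_idx, new_positions):
    """Write new_positions into the rows of all_positions named by env_idx."""
    if len(env_idx) != len(new_positions):
        raise ValueError("expected one new position per environment being reset")
    for row, pos in zip(env_idx, new_positions):
        all_positions[row] = list(pos)
    return all_positions


num_envs = 4
positions = [[0.0, 0.0, 0.0] for _ in range(num_envs)]

env_idx = [1, 3]   # only these environments are being reset
b = len(env_idx)   # batch size for everything sampled during this reset
new_positions = [[0.1 * (i + 1), 0.0, 0.02] for i in range(b)]

apply_partial_reset(positions, env_idx, new_positions)
# Environments 0 and 2 keep their previous state.
```

This is why data created inside _initialize_episode() should have a leading dimension of len(env_idx) rather than num_envs: each row corresponds to one environment being reset, in the order given by env_idx.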
UniformPlacementSampler.sample()
| Parameter | Type | Description |
|---|---|---|
| radius | float | Collision radius for the object being placed |
| max_trials | int | Maximum number of rejection sampling attempts |
| append | bool | If True, the sampled position is stored for future collision checks |
| verbose | bool | If True, prints diagnostic messages |
Returns: torch.Tensor of shape (batch_size, dim) containing sampled positions.
Note: If max_trials is exhausted without finding valid placements for all environments, the sampler returns the best available positions (some environments may have invalid placements).
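Because exhausted trials can leave overlapping objects in some environments, a task may want to check the result after sampling. The helper below is a hypothetical, dependency-free sketch of such a check (`find_overlapping_envs` is not part of ManiSkill); it assumes two objects overlap when their centre distance is less than the sum of their radii. With real tensors the same test would be done with batched distance computations.

```python
import math


def find_overlapping_envs(placements, radii):
    """Return indices of environments in which any two objects overlap.

    placements: one list per object, each holding an (x, y) per environment.
    radii: one collision radius per object, in the same order.
    """
    num_objects = len(placements)
    num_envs = len(placements[0])
    bad = []
    for env in range(num_envs):
        overlap = any(
            math.dist(placements[i][env], placements[j][env]) < radii[i] + radii[j]
            for i in range(num_objects)
            for j in range(i + 1, num_objects)
        )
        if overlap:
            bad.append(env)
    return bad


# Two objects across two environments; env 0 is deliberately too tight.
pos_a = [(0.00, 0.00), (0.10, 0.10)]
pos_b = [(0.05, 0.00), (-0.10, -0.10)]
bad_envs = find_overlapping_envs([pos_a, pos_b], radii=[0.03, 0.04])
```

If the list is non-empty, raising max_trials, widening the bounds, or shrinking the radii are the usual knobs to adjust.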
Usage Examples
Basic Episode Initialization (PushCube Pattern)
import numpy as np
import torch
from transforms3d.euler import euler2quat

from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    with torch.device(self.device):
        b = len(env_idx)
        # Initialize the pre-built table scene (sets robot poses)
        self.table_scene.initialize(env_idx)
        # Randomize cube position on the table surface
        xyz = torch.zeros((b, 3))
        xyz[..., :2] = torch.rand((b, 2)) * 0.2 - 0.1  # [-0.1, 0.1) range
        xyz[..., 2] = self.cube_half_size  # sitting on table
        obj_pose = Pose.create_from_pq(p=xyz, q=[1, 0, 0, 0])
        self.obj.set_pose(obj_pose)
        # Set goal position relative to the cube
        target_xyz = xyz + torch.tensor([0.2, 0, 0])
        target_xyz[..., 2] = 1e-3
        self.goal_region.set_pose(
            Pose.create_from_pq(p=target_xyz, q=euler2quat(0, np.pi / 2, 0))
        )
Collision-Free Multi-Object Placement
import torch

from mani_skill.envs.utils.randomization.samplers import UniformPlacementSampler
from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    with torch.device(self.device):
        b = len(env_idx)
        self.table_scene.initialize(env_idx)
        # Create a sampler for the table surface region
        sampler = UniformPlacementSampler(
            bounds=[[-0.2, -0.2], [0.2, 0.2]],
            batch_size=b,
            device=self.device,
        )
        # Sample positions for object A (radius=0.03)
        pos_a = sampler.sample(radius=0.03, max_trials=100)
        xyz_a = torch.zeros((b, 3))
        xyz_a[..., :2] = pos_a
        xyz_a[..., 2] = 0.03
        self.obj_a.set_pose(Pose.create_from_pq(p=xyz_a, q=[1, 0, 0, 0]))
        # Sample positions for object B, avoiding object A
        pos_b = sampler.sample(radius=0.04, max_trials=100)
        xyz_b = torch.zeros((b, 3))
        xyz_b[..., :2] = pos_b
        xyz_b[..., 2] = 0.04
        self.obj_b.set_pose(Pose.create_from_pq(p=xyz_b, q=[1, 0, 0, 0]))
Randomizing Orientations
import numpy as np
import torch

from mani_skill.utils.structs.pose import Pose

def _initialize_episode(self, env_idx: torch.Tensor, options: dict):
    with torch.device(self.device):
        b = len(env_idx)
        self.table_scene.initialize(env_idx)
        # Random XY position
        xyz = torch.zeros((b, 3))
        xyz[..., :2] = torch.rand((b, 2)) * 0.3 - 0.15
        xyz[..., 2] = 0.05
        # Random Z-axis rotation as a (w, x, y, z) quaternion:
        # (cos(a/2), 0, 0, sin(a/2)), computed for the whole batch at once
        # instead of looping over environments on the host.
        random_angles = torch.rand((b,)) * 2 * np.pi
        quats = torch.zeros((b, 4))
        quats[..., 0] = torch.cos(random_angles / 2)
        quats[..., 3] = torch.sin(random_angles / 2)
        self.obj.set_pose(Pose.create_from_pq(p=xyz, q=quats))
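The Z-axis rotation in the example above has a simple closed form, which can be sanity-checked with the standard library. This sketch assumes the (w, x, y, z) quaternion ordering implied by the identity q=[1, 0, 0, 0] used elsewhere on this page; `z_rotation_quat` and `rotate_vector` are hypothetical helper names introduced only for the check.

```python
import math


def z_rotation_quat(angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about Z."""
    return (math.cos(angle / 2), 0.0, 0.0, math.sin(angle / 2))


def rotate_vector(q, v):
    """Rotate the 3-vector v by the unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    vx, vy, vz = v
    # t = 2 * (q_vec x v), then v' = v + w * t + q_vec x t
    tx = 2 * (y * vz - z * vy)
    ty = 2 * (z * vx - x * vz)
    tz = 2 * (x * vy - y * vx)
    return (
        vx + w * tx + (y * tz - z * ty),
        vy + w * ty + (z * tx - x * tz),
        vz + w * tz + (x * ty - y * tx),
    )


# A quarter turn about Z should carry the X axis onto the Y axis.
q = z_rotation_quat(math.pi / 2)
rotated = rotate_vector(q, (1.0, 0.0, 0.0))
```

Since only the w and z components are non-zero for a pure yaw, a batch of such quaternions can be filled from a tensor of angles with one cosine and one sine call.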
Related Pages
- Principle:Haosulab_ManiSkill_Episode_Initialization -- The principle this implements
- Implementation:Haosulab_ManiSkill_ActorBuilder_TableSceneBuilder -- Building the scene before initialization
- Implementation:Haosulab_ManiSkill_Evaluate_Dense_Reward -- Reward computation after initialization
- Implementation:Haosulab_ManiSkill_Get_Obs_Extra_CameraConfig -- Observations generated after initialization
- Heuristic:Haosulab_ManiSkill_Initial_Pose_Performance