Principle:Haosulab ManiSkill Scene Object Loading
| Field | Value |
|---|---|
| Page Type | Principle |
| Title | ManiSkill Scene and Object Loading |
| Domain | Simulation, Robotics, Environment_Design, Physics_Simulation |
| Related Implementation | Implementation:Haosulab_ManiSkill_ActorBuilder_TableSceneBuilder |
| Date | 2026-02-15 |
| Repository | Haosulab/ManiSkill |
Overview
Description
Building a simulation scene in ManiSkill involves constructing a collection of physics objects -- actors (rigid bodies), articulations (jointed multi-body objects like robots or drawers), and scene templates (pre-built compositions of common furniture and fixtures). The task developer populates the scene by overriding the _load_scene() method of BaseEnv, which is called during environment reconfiguration.
ManiSkill provides two complementary approaches for loading objects:
- Builder pattern (ActorBuilder): A fluent API inherited from SAPIEN that lets you programmatically construct actors by chaining calls to add collision shapes, visual shapes, and physics properties. This is used for custom procedural objects (cubes, spheres, targets) and for loading mesh-based assets from files. The builder produces an Actor object that is tracked across all parallel sub-scenes.
- Scene builder pattern (SceneBuilder / TableSceneBuilder): Pre-built scene templates that encapsulate common workspace setups. TableSceneBuilder, for example, loads a table mesh, positions it so that z=0 is at the table surface, builds a ground plane, and provides an initialize() method that sets reasonable initial configurations for many supported robots. Scene builders handle the boilerplate of workspace construction so that task developers can focus on task-specific objects.
Each actor or articulation built into the scene is automatically replicated across all parallel sub-scenes (GPU environments). The ActorBuilder supports selective instantiation via set_scene_idxs(), which restricts the actor to specific parallel environments -- useful for tasks that load different assets in different environments.
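The chaining and selective-instantiation behavior described above can be illustrated with a toy model in plain Python. This is a sketch only: ToyActorBuilder, ToyActor, and the add_box_* methods are hypothetical stand-ins, not the SAPIEN or ManiSkill classes. Only the name set_scene_idxs comes from the text, and its real signature may differ.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyActor:
    """Hypothetical stand-in for a built actor: one handle spanning several sub-scenes."""
    name: str
    shapes: List[str]
    scene_idxs: List[int]


@dataclass
class ToyActorBuilder:
    """Hypothetical builder that accumulates configuration until build() is called."""
    num_envs: int
    shapes: List[str] = field(default_factory=list)
    scene_idxs: Optional[List[int]] = None  # None means "every sub-scene"

    def add_box_collision(self, half_size: float) -> "ToyActorBuilder":
        self.shapes.append(f"box_collision(half_size={half_size})")
        return self  # returning self is what makes chaining possible

    def add_box_visual(self, half_size: float) -> "ToyActorBuilder":
        self.shapes.append(f"box_visual(half_size={half_size})")
        return self

    def set_scene_idxs(self, idxs: List[int]) -> "ToyActorBuilder":
        self.scene_idxs = list(idxs)
        return self

    def build(self, name: str) -> ToyActor:
        # Without an explicit selection, the actor exists in every sub-scene.
        if self.scene_idxs is None:
            idxs = list(range(self.num_envs))
        else:
            idxs = list(self.scene_idxs)
        return ToyActor(name=name, shapes=list(self.shapes), scene_idxs=idxs)


# Default: replicated across all 4 parallel sub-scenes.
cube = ToyActorBuilder(num_envs=4).add_box_collision(0.02).add_box_visual(0.02).build("cube")
# Selective: present only in sub-scenes 0 and 2.
mug = ToyActorBuilder(num_envs=4).add_box_collision(0.03).set_scene_idxs([0, 2]).build("mug")
print(cube.scene_idxs, mug.scene_idxs)
```

In this toy model the configuration lives on the builder until build() is called, so the same chain of calls can describe an actor for every sub-scene or for a chosen subset.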
Objects in ManiSkill have three physics body types:
- dynamic: Objects that respond to forces and can be moved by the robot or other objects. Used for manipulation targets (cubes, bottles, etc.).
- kinematic: Objects that can be programmatically moved but are not affected by physics forces. Used for goal markers, movable platforms, and animated obstacles.
- static: Objects that are completely fixed in place. Used for tables, walls, and other immovable fixtures.
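The practical difference between the three body types can be shown with a deliberately simplified one-dimensional update rule. The sketch below is not how PhysX or ManiSkill integrates motion; BodyType and step_height are hypothetical names, and the only point is which inputs each body type responds to.

```python
from enum import Enum
from typing import Optional, Tuple


class BodyType(Enum):
    DYNAMIC = "dynamic"
    KINEMATIC = "kinematic"
    STATIC = "static"


GRAVITY = -9.81  # m/s^2, acting along z


def step_height(
    body_type: BodyType,
    z: float,
    vz: float,
    dt: float,
    commanded_z: Optional[float] = None,
) -> Tuple[float, float]:
    """Advance a body's height by one time step under toy rules; returns (z, vz)."""
    if body_type is BodyType.DYNAMIC:
        # Dynamic bodies respond to forces (here, only gravity).
        vz = vz + GRAVITY * dt
        return z + vz * dt, vz
    if body_type is BodyType.KINEMATIC:
        # Kinematic bodies ignore forces and go wherever the program puts them.
        return (commanded_z if commanded_z is not None else z), 0.0
    # Static bodies ignore both forces and commands.
    return z, 0.0


dynamic_z, _ = step_height(BodyType.DYNAMIC, z=1.0, vz=0.0, dt=0.01)
kinematic_z, _ = step_height(BodyType.KINEMATIC, z=1.0, vz=0.0, dt=0.01, commanded_z=2.0)
static_z, _ = step_height(BodyType.STATIC, z=1.0, vz=0.0, dt=0.01, commanded_z=2.0)
print(dynamic_z, kinematic_z, static_z)
```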
Usage
Scene and object loading is performed inside the _load_scene() method of a custom task. This method is called during reconfiguration (typically at the first reset() call). The developer:
- Optionally instantiates a SceneBuilder (e.g., TableSceneBuilder) and calls its .build() method to create the workspace.
- Creates additional actors via self.scene.create_actor_builder() or convenience functions like actors.build_cube().
- Loads articulated objects (robots are loaded separately in _load_agent(); task articulations like faucets or drawers are loaded here).
- Stores references to key objects as instance attributes (e.g., self.obj, self.goal_region) for later use in initialization, reward, and observation methods.
All objects must have unique names within the scene. Setting reasonable initial poses is recommended to prevent physics instabilities during GPU simulation setup.
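The loading steps and the unique-name requirement can be sketched with a small stand-in scene. ToyScene, ToyPushTask, and add_actor are hypothetical names chosen for illustration and do not reproduce the ManiSkill API. The sketch assumes that a duplicate name should be rejected at load time, and the poses are arbitrary example values.

```python
from typing import Dict, Tuple

Position = Tuple[float, float, float]


class ToyScene:
    """Hypothetical scene that registers actors by name and rejects duplicates."""

    def __init__(self) -> None:
        self.actors: Dict[str, dict] = {}

    def add_actor(self, name: str, body_type: str, position: Position) -> dict:
        if name in self.actors:
            raise ValueError(f"actor name {name!r} is already used in this scene")
        actor = {"name": name, "body_type": body_type, "position": position}
        self.actors[name] = actor
        return actor


class ToyPushTask:
    """Hypothetical task whose _load_scene() mirrors the steps listed above."""

    def __init__(self) -> None:
        self.scene = ToyScene()
        self._load_scene()

    def _load_scene(self) -> None:
        # Step 1: workspace (stands in for a scene builder's build() call).
        self.scene.add_actor("table", "static", (0.0, 0.0, -0.5))
        # Steps 2 and 4: task objects, kept as attributes for later methods.
        # Each gets an explicit initial pose rather than a default at the origin.
        self.obj = self.scene.add_actor("cube", "dynamic", (0.0, 0.0, 0.02))
        self.goal_region = self.scene.add_actor("goal_region", "kinematic", (0.2, 0.0, 0.001))


task = ToyPushTask()
print(sorted(task.scene.actors))
```

Rejecting duplicates when an actor is added, as this sketch does, surfaces a naming mistake at load time instead of later, when objects are looked up by name.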
Theoretical Basis
The scene construction approach in ManiSkill is grounded in several design patterns and simulation concepts:
- Builder design pattern: The ActorBuilder follows the classic builder pattern from object-oriented design. Rather than constructing a complex object in a single constructor call, the builder accumulates configuration (collision shapes, visual shapes, physics type) through a sequence of method calls, then produces the final object via a terminal .build() call. This enables flexible, readable object construction.
- Scene graph architecture: The simulation scene is organized as a hierarchical structure where the global ManiSkillScene manages multiple SAPIEN sub-scenes (one per parallel environment in GPU simulation, or a single one in CPU simulation). Actors and articulations are tracked at the scene level, enabling batched operations across all environments.
- URDF/MJCF asset loading: Articulated objects (robots, mechanisms) are loaded from standard robotics description formats -- URDF (Unified Robot Description Format) and MJCF (MuJoCo XML). These formats describe link geometries, joint types, and physical properties in a declarative way, decoupling asset authoring from simulation code.
- Physics body type taxonomy: The distinction between dynamic, kinematic, and static objects is fundamental to rigid-body physics engines such as PhysX. Correct classification affects simulation performance (static objects are optimized away from the solver) and correctness (kinematic objects move only when repositioned programmatically).
- Template method pattern: The SceneBuilder classes use the template method pattern -- build() creates the scene structure, and initialize() sets per-episode configurations. Subclasses can override these to customize workspace geometry while inheriting robot-specific initialization logic.
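The build()/initialize() split from the last item can be sketched as a base class that owns the robot-specific initialization while a subclass supplies only the workspace geometry. All class names, robot names, and joint values below are made up for illustration; the sketch assumes nothing about the real SceneBuilder signatures beyond the two method names mentioned in the text.

```python
from typing import Dict, List


class ToySceneBuilder:
    """Hypothetical base class: owns the robot-specific initialization logic."""

    # Made-up rest configurations keyed by made-up robot names.
    ROBOT_INIT_QPOS: Dict[str, List[float]] = {
        "toy_arm": [0.0, 0.4, -0.8],
        "toy_gripper_arm": [0.0, 0.4, -0.8, 0.04],
    }

    def __init__(self, robot_uid: str) -> None:
        self.robot_uid = robot_uid
        self.scene_objects: List[str] = []

    def build(self) -> None:
        """Create the workspace once; subclasses supply the geometry."""
        raise NotImplementedError

    def initialize(self) -> List[float]:
        """Per-episode setup shared by every subclass."""
        if self.robot_uid not in self.ROBOT_INIT_QPOS:
            raise KeyError(f"no initial configuration for robot {self.robot_uid!r}")
        return list(self.ROBOT_INIT_QPOS[self.robot_uid])


class ToyTableSceneBuilder(ToySceneBuilder):
    """Overrides only build(); initialize() is inherited unchanged."""

    def build(self) -> None:
        self.scene_objects = ["ground", "table"]


builder = ToyTableSceneBuilder(robot_uid="toy_arm")
builder.build()              # run once, when the scene is constructed
qpos = builder.initialize()  # run at the start of each episode
print(builder.scene_objects, qpos)
```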
Related Pages
- Implementation:Haosulab_ManiSkill_ActorBuilder_TableSceneBuilder -- Concrete builder implementations
- Principle:Haosulab_ManiSkill_Environment_Registration -- Registering the environment before loading scenes
- Principle:Haosulab_ManiSkill_Episode_Initialization -- Randomizing object poses after scene loading
- Principle:Haosulab_ManiSkill_Observation_Definition -- Observing the loaded scene
- Heuristic:Haosulab_ManiSkill_Initial_Pose_Performance