Principle:Haosulab ManiSkill Physics Struct Abstraction
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Simulation, GPU_Computing |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
A physics struct abstraction wraps individual simulation engine objects into batched data structures that support vectorized property access across thousands of parallel environment instances, bridging the gap between object-oriented physics APIs and tensor-oriented GPU computation.
Description
Modern robotics simulation achieves high throughput by running thousands of environment instances in parallel on a GPU. However, physics engines like SAPIEN/PhysX expose an object-oriented API: each actor, articulation, joint, and link is an individual object with getter/setter methods. The Physics Struct Abstraction principle bridges this mismatch by wrapping collections of corresponding objects (e.g., the "cube" actor across all 4096 parallel environments) into a single struct that provides batched tensor access to their properties.
When user code reads cube.pose, the struct returns batched pose data of shape (N, 7), one row per instance of that actor: three position components followed by four quaternion components. When user code writes cube.pose = new_poses, the struct distributes the rows back to the corresponding engine objects. This happens transparently whether the simulation is running on CPU (where the struct wraps a list of SAPIEN objects) or on GPU (where the struct indexes into a shared GPU tensor managed by the physics engine).
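The read and write paths described above can be sketched in a few lines. This is a minimal illustration and not the ManiSkill implementation: `EngineActor`, `BatchedActor`, and their arguments are hypothetical names, and NumPy arrays stand in for both the per-object engine state and the shared GPU buffer.

```python
import numpy as np


class EngineActor:
    """Hypothetical stand-in for one per-environment engine object."""

    def __init__(self, pose):
        self._pose = np.array(pose, dtype=np.float32)

    def get_pose(self):
        return self._pose.copy()

    def set_pose(self, pose):
        self._pose = np.array(pose, dtype=np.float32)  # copies the row


class BatchedActor:
    """Hypothetical batched view of the same actor across N environments.

    Pass `objs` for the list-of-objects (CPU-style) path, or `shared` and
    `rows` for the shared-buffer (GPU-style) path.
    """

    def __init__(self, objs=None, shared=None, rows=None):
        self._objs = objs
        self._shared = shared
        self._rows = rows

    @property
    def pose(self):
        if self._shared is not None:
            # Gather this actor's rows from the shared buffer (returns a copy).
            return self._shared[self._rows]
        # Iterate over the objects and stack the results into (N, 7).
        return np.stack([o.get_pose() for o in self._objs])

    @pose.setter
    def pose(self, value):
        value = np.asarray(value, dtype=np.float32)
        if self._shared is not None:
            # Scatter the rows back into the shared buffer.
            self._shared[self._rows] = value
        else:
            for obj, row in zip(self._objs, value):
                obj.set_pose(row)


identity = [0, 0, 0, 1, 0, 0, 0]  # position xyz, quaternion wxyz

cpu_cube = BatchedActor(objs=[EngineActor(identity) for _ in range(4)])

# Shared buffer holding 8 bodies; the cube occupies rows 1, 3, 5, 7.
shared = np.tile(np.array(identity, dtype=np.float32), (8, 1))
gpu_cube = BatchedActor(shared=shared, rows=np.array([1, 3, 5, 7]))

# The same user code runs against either backend.
for cube in (cpu_cube, gpu_cube):
    new_poses = cube.pose
    new_poses[:, 2] += 0.5  # lift every instance by 0.5 along z
    cube.pose = new_poses
```

The `rows` array plays the role of the index mapping recorded when per-environment objects are merged: it is the only thing the caller's loop does not need to know about.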
The struct hierarchy mirrors the physics engine's object hierarchy: BaseStruct is the root, Actor wraps rigid bodies, Articulation wraps multi-joint kinematic chains, ArticulationJoint wraps individual joints within an articulation, Link wraps individual links, Drive wraps physical constraints between bodies, Pose wraps SE(3) transformations, and RenderCamera wraps camera sensors. Each struct provides properties and methods appropriate to its physics type while maintaining the batched access pattern.
Usage
This principle applies whenever:
- Task logic, reward computation, or observation extraction needs to read or write physics state (poses, velocities, joint positions) across all parallel environments simultaneously.
- Code must work identically on both CPU and GPU simulation backends without conditional branching.
- The physics engine's per-object API must be abstracted into a tensor-friendly interface for use with PyTorch or NumPy.
- Multiple instances of the same object type must be managed as a coherent collection rather than individually.
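As an illustration of the first case, the sketch below computes a reward and a success flag for every environment at once from batched pose data. The array contents, the goal position, the reward shaping, and the 0.05 threshold are all assumptions made for the example; NumPy is used, but the same expressions apply to PyTorch tensors.

```python
import numpy as np

num_envs = 4

# Assumed batched pose data of shape (N, 7): xyz position, wxyz quaternion.
cube_pose = np.zeros((num_envs, 7))
cube_pose[:, 3] = 1.0                      # identity rotation
cube_pose[:, 0] = [0.0, 0.1, 0.2, 0.3]     # a different x position per env

goal_pos = np.array([0.3, 0.0, 0.0])       # broadcast across all environments

# One vectorized expression replaces a Python loop over environments.
dist = np.linalg.norm(cube_pose[:, :3] - goal_pos, axis=1)  # shape (N,)
reward = 1.0 - np.tanh(5.0 * dist)         # hypothetical shaping term
success = dist < 0.05                      # hypothetical success threshold
```

Every intermediate value keeps the leading batch dimension, so the result can be returned per environment without any reshaping.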
Theoretical Basis
1. Struct-of-Arrays Pattern: Traditional object-oriented physics APIs use an Array-of-Structs layout: each object is a struct containing its own properties. The batched struct abstraction inverts this to a Struct-of-Arrays layout: a single struct holds one array (tensor) per property, covering all instances. This layout suits GPU computation, where an operation is applied to an entire property tensor at once rather than to one object at a time.
2. Backend Dispatch: Each struct detects whether it is running on CPU or GPU and dispatches property access accordingly:
- CPU mode: The struct wraps a Python list of SAPIEN objects. Reading a property iterates over the list, collects values, and stacks them into a tensor. Writing distributes tensor values back to individual objects.
- GPU mode: The struct indexes into shared GPU tensors managed by the PhysX GPU pipeline. Reading is a tensor slice; writing is a tensor scatter. No Python iteration is needed.
3. Property Merging: When multiple parallel environments contain the same named object (e.g., "cube" in all 4096 environments), the struct merges them into a single entity. The merge operation records the mapping from struct index to (scene_index, object_index) pairs, enabling correct dispatch in both read and write directions.
4. Hierarchical Composition: An Articulation struct contains references to its constituent Link structs and ArticulationJoint structs. Setting a joint position on the Articulation struct correctly propagates to the underlying joint objects. This preserves the semantic richness of the physics hierarchy while maintaining batched access.
5. Lazy Evaluation: Some expensive properties (like computing the full Jacobian matrix) are computed only when requested and may be cached within a simulation step. This avoids redundant computation when the same property is accessed multiple times within a single step.
6. Pose Representation: The Pose struct provides a unified SE(3) representation using position (3D vector) and quaternion (4D, wxyz convention). It supports batched operations: composition, inversion, transformation of points, and conversion to/from 4x4 homogeneous matrices. The Pose struct enables spatial math throughout the framework without exposing raw tensor indexing.
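The pose operations in item 6 can be sketched with plain arrays. This is a minimal NumPy illustration under the wxyz convention stated above, not the Pose struct's actual code; the function names `quat_mul`, `quat_conj`, `rotate`, `compose`, and `inverse` are hypothetical, and quaternions are assumed to be unit length.

```python
import numpy as np


def quat_mul(q1, q2):
    """Hamilton product of batched wxyz quaternions, each of shape (N, 4)."""
    w1, x1, y1, z1 = q1[:, 0], q1[:, 1], q1[:, 2], q1[:, 3]
    w2, x2, y2, z2 = q2[:, 0], q2[:, 1], q2[:, 2], q2[:, 3]
    return np.stack(
        [
            w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
            w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
            w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
            w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
        ],
        axis=1,
    )


def quat_conj(q):
    """Conjugate, which equals the inverse for unit quaternions."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])


def rotate(q, v):
    """Rotate batched vectors v (N, 3) by unit quaternions q (N, 4)."""
    v = np.asarray(v, dtype=float)
    qv = np.concatenate([np.zeros((len(v), 1)), v], axis=1)
    return quat_mul(quat_mul(q, qv), quat_conj(q))[:, 1:]


def compose(p_a, q_a, p_b, q_b):
    """Pose A * Pose B: transform by B first, then by A."""
    return p_a + rotate(q_a, p_b), quat_mul(q_a, q_b)


def inverse(p, q):
    """Inverse pose, so that compose(p, q, *inverse(p, q)) is the identity."""
    q_inv = quat_conj(q)
    return -rotate(q_inv, p), q_inv


half = np.sqrt(0.5)
p = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 1.0]])
q = np.array([[half, 0.0, 0.0, half],      # 90 degrees about z
              [1.0, 0.0, 0.0, 0.0]])       # identity rotation

p_inv, q_inv = inverse(p, q)
p_id, q_id = compose(p, q, p_inv, q_inv)   # identity pose for every row
x_rotated = rotate(q, [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
```

Every function takes and returns arrays with a leading batch dimension, which is what lets a pose type hide the raw (N, 7) indexing from task code.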
Related Pages
- Implementation:Haosulab_ManiSkill_BaseStruct -- Abstract base class for all batched physics structs.
- Implementation:Haosulab_ManiSkill_Actor -- Batched struct wrapping rigid body actors.
- Implementation:Haosulab_ManiSkill_Articulation -- Batched struct wrapping multi-joint articulated bodies.
- Implementation:Haosulab_ManiSkill_ArticulationJoint -- Batched struct wrapping individual articulation joints.
- Implementation:Haosulab_ManiSkill_Link -- Batched struct wrapping articulation links.
- Implementation:Haosulab_ManiSkill_Pose -- Batched SE(3) pose representation and operations.
- Implementation:Haosulab_ManiSkill_RenderCamera -- Batched struct wrapping render camera sensors.
- Implementation:Haosulab_ManiSkill_Drive -- Batched struct wrapping physical drive constraints.