Principle:Haosulab ManiSkill Trajectory Dataset Loading
| Field | Value |
|---|---|
| Source Repository | haosulab/ManiSkill |
| Domains | Imitation_Learning, Robotics, Data_Processing, Machine_Learning |
| Last Updated | 2026-02-15 |
Overview
Description
Trajectory Dataset Loading is the process of parsing HDF5 trajectory files (produced by data collection or trajectory replay/conversion) into structured PyTorch Dataset objects suitable for supervised learning. This step transforms episode-organized trajectory data -- where each episode is a variable-length sequence of observations, actions, termination signals, and optional reward/success labels -- into a flat, indexable collection of observation-action pairs that can be consumed by standard PyTorch DataLoaders for mini-batch gradient descent training.
In the ManiSkill ecosystem, trajectory data is stored in HDF5 files with a consistent structure: each episode is stored under a key like traj_0, traj_1, etc., and contains arrays for obs (observations), actions, terminated, truncated, and optionally rewards, success, and fail. A companion JSON file provides episode metadata including episode IDs, seeds, control modes, and environment configuration. The Dataset class reads this data, flattens it across episodes, handles optional filtering (e.g., loading only successful episodes), and provides per-timestep indexing.
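The layout described above can be sketched with h5py. This is an illustrative example, not ManiSkill code: the file name, episode length (T = 5), and observation/action dimensions are made up, and an in-memory file is used so nothing touches disk.

```python
import h5py
import numpy as np

# Build a tiny in-memory HDF5 file mimicking the ManiSkill trajectory layout
# (traj_0 with obs/actions/terminated/truncated); shapes are illustrative.
with h5py.File("demo_traj.h5", "w", driver="core", backing_store=False) as f:
    ep = f.create_group("traj_0")
    ep.create_dataset("obs", data=np.random.rand(6, 4).astype(np.float32))      # T+1 observations
    ep.create_dataset("actions", data=np.random.rand(5, 2).astype(np.float32))  # T actions
    ep.create_dataset("terminated", data=np.array([False] * 4 + [True]))
    ep.create_dataset("truncated", data=np.zeros(5, dtype=bool))

    # Iterate over episodes the way a loader would: keys are traj_0, traj_1, ...
    for key in f.keys():
        obs = f[key]["obs"][:]          # [:] reads the full array into memory
        actions = f[key]["actions"][:]
        print(key, obs.shape, actions.shape)
```

Note the off-by-one between obs and actions, which the loading step must reconcile (see below).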
A critical detail in the loading process is that observations have one more timestep than actions per episode (the terminal observation has no corresponding action), so the final observation of each episode is excluded when building observation-action pairs. This aligns with the standard supervised learning formulation where each training sample is a (state, action) pair.
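The alignment rule amounts to a single slice per episode. A minimal sketch with made-up shapes:

```python
import numpy as np

# Toy episode: T = 5 steps gives T+1 = 6 observations (including the terminal
# one) and T = 5 actions. Dimensions are illustrative.
obs = np.arange(6 * 3).reshape(6, 3).astype(np.float32)      # (T+1, obs_dim)
actions = np.arange(5 * 2).reshape(5, 2).astype(np.float32)  # (T, act_dim)

# Drop the terminal observation so each observation pairs with the action
# taken from it: training sample i is (obs[i], actions[i]).
obs_aligned = obs[:-1]
assert obs_aligned.shape[0] == actions.shape[0]  # both now have length T
```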
Usage
Trajectory dataset loading is used after trajectory conversion and before policy training. It bridges the data preprocessing pipeline and the learning algorithm by providing a standard PyTorch Dataset interface. Typical use cases include:
- Loading state-based trajectory data for behavioral cloning with an MLP policy.
- Loading RGBD trajectory data for vision-based policy training.
- Filtering datasets to only include successful episodes for higher-quality training signal.
- Moving data to GPU memory for faster training throughput.
- Serving as a base class or reference implementation for custom dataset classes tailored to specific learning algorithms (e.g., diffusion policy datasets that require observation/action horizons).
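The flattening and success-filtering behavior described above can be sketched as a minimal dataset class. This is a simplified stand-in, not the actual ManiSkillTrajectoryDataset; in practice it would subclass torch.utils.data.Dataset, but plain Python with the same __len__/__getitem__ protocol keeps the sketch dependency-free.

```python
import numpy as np

class FlatTrajectoryDataset:
    """Sketch of episode flattening. `episodes` is a list of dicts holding an
    "obs" array of shape (T+1, obs_dim), an "actions" array of shape (T, act_dim),
    and an optional "success" flag used for filtering."""

    def __init__(self, episodes, success_only=False):
        if success_only:
            episodes = [ep for ep in episodes if ep.get("success", False)]
        # Drop each episode's terminal observation, then flatten across episodes
        # into one indexable collection of (observation, action) pairs.
        self.obs = np.concatenate([ep["obs"][:-1] for ep in episodes])
        self.actions = np.concatenate([ep["actions"] for ep in episodes])

    def __len__(self):
        return len(self.actions)

    def __getitem__(self, idx):
        return self.obs[idx], self.actions[idx]

# Two toy episodes of lengths 3 and 4 (obs arrays carry one extra timestep).
eps = [
    {"obs": np.zeros((4, 2)), "actions": np.zeros((3, 2)), "success": True},
    {"obs": np.ones((5, 2)), "actions": np.ones((4, 2)), "success": False},
]
ds = FlatTrajectoryDataset(eps)
print(len(ds))  # 7 samples total (3 + 4)
ds_ok = FlatTrajectoryDataset(eps, success_only=True)
print(len(ds_ok))  # 3 samples from the one successful episode
```

Because the class exposes __len__ and integer __getitem__, a real subclassed version plugs directly into a PyTorch DataLoader for mini-batch sampling.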
Theoretical Basis
Supervised Learning Data Pipelines form the backbone of imitation learning. In behavioral cloning, the learning problem is framed as supervised regression: given a state (observation), predict the expert's action. The dataset must therefore provide (observation, action) pairs in a format compatible with mini-batch stochastic gradient descent.
Key theoretical considerations include:
- Episode-Level Data Organization: Demonstrations are naturally organized as episodes (variable-length sequences). The dataset class must flatten these into a uniform indexable structure while preserving the temporal relationship between observations and actions within each episode.
- Observation-Action Alignment: In MDP formulations, an episode of length T produces T+1 observations (including the terminal state) and T actions. The dataset must correctly align observations with their corresponding actions, typically by discarding the terminal observation.
- Data Type Handling: ManiSkill uses uint16 for certain observation data (such as depth images) to conserve disk space and memory. The dataset must handle type conversion (e.g., to int32 or float32) for compatibility with PyTorch tensor operations.
- Dictionary Observations: When observations include multiple modalities (e.g., agent proprioception, sensor data, extra information), they are stored as nested dictionaries of arrays. The dataset must support indexing into these nested structures.
- Memory Management: For large datasets, the choice of whether to load all data into memory (for speed) or stream from disk (for memory efficiency) is a critical design decision. The ManiSkill trajectory dataset loads all data into memory by default for faster access during training.
- Device Placement: Optionally placing data directly on GPU at load time eliminates per-batch CPU-to-GPU transfer overhead, which can be a significant bottleneck for small models with fast forward passes.
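Two of the considerations above (dictionary observations and uint16 conversion) can be sketched with a pair of recursive helpers. These are illustrative functions, not ManiSkill's implementation; the nested observation layout below is a made-up miniature of the real one.

```python
import numpy as np

def index_nested(obs, idx):
    """Recursively index into a nested dict of arrays, returning the same
    nesting with one timestep selected from each leaf array."""
    if isinstance(obs, dict):
        return {k: index_nested(v, idx) for k, v in obs.items()}
    return obs[idx]

def to_float32(obs):
    """Recursively convert uint16 leaves (e.g., depth images) to float32 so
    they are compatible with PyTorch tensor operations."""
    if isinstance(obs, dict):
        return {k: to_float32(v) for k, v in obs.items()}
    return obs.astype(np.float32) if obs.dtype == np.uint16 else obs

# Toy nested observation: proprioception plus a uint16 depth image per step.
obs = {
    "agent": {"qpos": np.zeros((10, 7), dtype=np.float32)},
    "sensor_data": {"cam": {"depth": np.zeros((10, 8, 8), dtype=np.uint16)}},
}
sample = index_nested(to_float32(obs), 3)
print(sample["sensor_data"]["cam"]["depth"].dtype)  # float32
```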
Related Pages
- Implementation:Haosulab_ManiSkill_ManiSkillTrajectoryDataset -- The concrete PyTorch Dataset class that implements trajectory loading.
- Principle:Haosulab_ManiSkill_Trajectory_Replay_Conversion -- The preceding step: converting trajectories to the desired observation and control modes.
- Principle:Haosulab_ManiSkill_Imitation_Policy_Training -- The next step: using the loaded dataset to train imitation learning policies.
- Principle:Haosulab_ManiSkill_Demonstration_Data_Acquisition -- The first step: acquiring raw demonstration data.