Implementation:Haosulab ManiSkill Convert To LeRobot CLI
| Field | Value |
|---|---|
| Source Repository | haosulab/ManiSkill |
| Type | External Tool Doc |
| Domains | Imitation_Learning, Robotics, Data_Processing, Interoperability |
| Last Updated | 2026-02-15 |
Overview
Description
The Convert To LeRobot CLI is a command-line tool for converting ManiSkill HDF5 trajectory files into the LeRobot v3.0 dataset format. It reads a ManiSkill .h5 trajectory file (and its companion .json metadata file), automatically detects available RGB cameras, robot state dimensions, and action dimensions, and produces a complete LeRobot-format directory with Parquet data files, MP4 video chunks, per-episode metadata, global statistics, and a self-describing info.json.
The conversion pipeline performs the following steps:
- Load trajectories and metadata from the input HDF5 file
- Auto-detect RGB cameras (from obs/sensor_data/*/rgb), robot state (from obs/agent/qpos), and action dimensions
- Create the output directory structure with data chunks, video directories, and metadata directories
- For each episode: extract actions, robot states, and camera frames; write tabular data to Parquet; encode camera frames as MP4 videos
- Compute per-episode and global statistics for all numerical fields
- Generate metadata files: info.json (dataset schema), stats.json (global statistics), episodes/ (per-episode metadata), and tasks.parquet (task descriptions)
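The statistics step above can be illustrated with a small, dependency-free sketch. It is not the tool's actual code: `episode_stats` and `merge_stats` are hypothetical helper names, and the sketch assumes scalar values and population (not sample) standard deviation.

```python
import math
from typing import Dict, List


def episode_stats(values: List[float]) -> Dict[str, float]:
    """Summarise one episode's values for a single scalar field."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # population variance
    return {
        "count": n,
        "mean": mean,
        "std": math.sqrt(var),
        "min": min(values),
        "max": max(values),
    }


def merge_stats(per_episode: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine per-episode summaries into global statistics.

    Uses Var[x] = E[x^2] - E[x]^2 with count weighting, so the raw
    frames do not have to be kept in memory.
    """
    total = sum(s["count"] for s in per_episode)
    mean = sum(s["mean"] * s["count"] for s in per_episode) / total
    second_moment = sum(
        (s["std"] ** 2 + s["mean"] ** 2) * s["count"] for s in per_episode
    ) / total
    var = max(second_moment - mean ** 2, 0.0)  # guard against rounding below zero
    return {
        "count": total,
        "mean": mean,
        "std": math.sqrt(var),
        "min": min(s["min"] for s in per_episode),
        "max": max(s["max"] for s in per_episode),
    }
```

Merging summaries rather than re-reading frames keeps memory use independent of dataset size, which is one plausible reason to compute per-episode statistics first.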
Usage
Run this tool after trajectory replay/conversion, which ensures the trajectory contains RGB image observations, when the goal is to export ManiSkill data for use in the LeRobot ecosystem.
Code Reference
Source Location
| Field | Value |
|---|---|
| Repository | haosulab/ManiSkill |
| File | mani_skill/trajectory/convert_to_lerobot.py |
| Lines (Args) | L34-56 |
| Lines (main) | L470-584 |
Signature
CLI invocation:
python -m mani_skill.trajectory.convert_to_lerobot \
--traj-path <path_to_h5> \
--output-dir <output_directory> \
[--fps FPS] \
[--task-name TASK_NAME] \
[--chunks-size SIZE] \
[--image-size WIDTHxHEIGHT] \
[--robot-type TYPE]
Args dataclass (L34-56):
@dataclass
class Args:
    traj_path: str
    """Path to ManiSkill .h5 trajectory file"""
    output_dir: str
    """Output directory for LeRobot dataset"""
    fps: int = 30
    """Video FPS (default: 30)"""
    task_name: Optional[str] = None
    """Task description (default: auto-detected from metadata)"""
    chunks_size: int = 1000
    """Episodes per chunk (default: 1000)"""
    image_size: str = "640x480"
    """Output image size as WIDTHxHEIGHT or single value for square (default: 640x480)"""
    robot_type: Optional[str] = None
    """Robot type (default: auto-detected, e.g., 'panda', 'ur5')"""
Key parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| traj_path | str | required | Path to the input ManiSkill .h5 trajectory file. A companion .json file must exist at the same location. |
| output_dir | str | required | Output directory for the LeRobot dataset. Will be created if it does not exist. |
| fps | int | 30 | Frames per second for video encoding. Should match the control frequency of the simulation. |
| task_name | Optional[str] | None (auto-detected) | Human-readable task description. If not provided, auto-detected from the environment ID in the trajectory metadata. |
| chunks_size | int | 1000 | Number of episodes per data chunk. Controls the granularity of data file partitioning. |
| image_size | str | "640x480" | Output image/video resolution as WIDTHxHEIGHT or a single integer for square images. Images are resized with aspect-preserving padding. |
| robot_type | Optional[str] | None (auto-detected) | Robot type string (e.g., "panda", "ur5"). If not provided, inferred from the environment ID. |
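The `image_size` behaviour described above can be sketched as follows. This is an illustration under assumptions, not the tool's own code: `parse_image_size` and `letterbox_geometry` are hypothetical names, and the source only states that padding preserves the aspect ratio, so centring the resized image is an assumption.

```python
from typing import Tuple


def parse_image_size(spec: str) -> Tuple[int, int]:
    """Return (width, height) from 'WIDTHxHEIGHT' or a single integer."""
    parts = spec.lower().split("x")
    if len(parts) == 1:
        width = height = int(parts[0])  # single value means a square image
    elif len(parts) == 2:
        width, height = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"Invalid image size: {spec!r}")
    if width <= 0 or height <= 0:
        raise ValueError(f"Image size must be positive: {spec!r}")
    return width, height


def letterbox_geometry(
    src_w: int, src_h: int, dst_w: int, dst_h: int
) -> Tuple[int, int, int, int]:
    """Return (new_w, new_h, pad_left, pad_top) for an aspect-preserving fit.

    The source image is scaled to fit inside the target, then centred
    (assumption); the remaining border would be filled with padding.
    """
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, max(1, round(src_w * scale)))
    new_h = min(dst_h, max(1, round(src_h * scale)))
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return new_w, new_h, pad_left, pad_top
```

For example, a 128x128 camera frame converted with the default `--image-size 640x480` would be scaled to 480x480 and padded by 80 pixels on the left and right under these assumptions.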
Import
This tool is primarily used from the command line:
python -m mani_skill.trajectory.convert_to_lerobot --traj-path trajectory.h5 --output-dir ./lerobot_dataset
For programmatic use:
from mani_skill.trajectory.convert_to_lerobot import main, Args
import tyro
args = tyro.cli(Args, args=[
    "--traj-path", "trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5",
    "--output-dir", "./lerobot_output",
    "--task-name", "Pick Cube",
])
main(args)
I/O Contract
Inputs:
| Input | Type | Description |
|---|---|---|
| traj_path | str (file path) | Path to a ManiSkill .h5 trajectory file. Should contain RGB observations (in obs/sensor_data/{camera}/rgb) for video generation. A companion .json metadata file must exist. |
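Because the converter needs both files, it can be useful to check for them before starting a long conversion. The sketch below assumes the companion file shares the trajectory's file name with the final `.h5` suffix replaced by `.json`; the helper names are hypothetical and not part of ManiSkill.

```python
from pathlib import Path
from typing import Tuple


def companion_json_path(traj_path: str) -> Path:
    """Return the assumed metadata path next to a .h5 trajectory file."""
    # with_suffix replaces only the last suffix, so dotted names such as
    # 'trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5' keep their middle parts.
    return Path(traj_path).expanduser().with_suffix(".json")


def require_inputs(traj_path: str) -> Tuple[Path, Path]:
    """Raise FileNotFoundError early if either input file is missing."""
    h5_path = Path(traj_path).expanduser()
    json_path = companion_json_path(traj_path)
    for path in (h5_path, json_path):
        if not path.is_file():
            raise FileNotFoundError(f"Missing input file: {path}")
    return h5_path, json_path
```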
Outputs (LeRobot v3.0 directory structure):
output_dir/
  data/
    chunk-000/
      file-000.parquet # Tabular data: actions, states, timestamps, indices
    chunk-001/
      ...
  videos/
    observation.images.{camera_name}/
      chunk-000/
        file-000.mp4 # Episode 0 video for this camera
        file-001.mp4 # Episode 1 video
        ...
  meta/
    info.json # Dataset schema, features, robot type, FPS
    stats.json # Global statistics (mean, std, min, max)
    tasks.parquet # Task index -> task name mapping
    episodes/
      chunk-000/
        file-000.parquet # Per-episode metadata and statistics
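The chunk directories in the tree above follow from `chunks_size`, the number of episodes per chunk. A minimal sketch of that mapping, assuming plain integer division and the three-digit zero padding shown in the tree (`chunk_dir_name` is a hypothetical helper, not a function from the tool):

```python
def chunk_dir_name(episode_index: int, chunks_size: int = 1000) -> str:
    """Return the chunk directory an episode would fall into."""
    if chunks_size <= 0:
        raise ValueError("chunks_size must be positive")
    if episode_index < 0:
        raise ValueError("episode_index must be non-negative")
    chunk_index = episode_index // chunks_size  # episodes 0..chunks_size-1 share chunk 0
    return f"chunk-{chunk_index:03d}"
```

With the default of 1000, episodes 0 through 999 would land in chunk-000 and episode 1000 would start chunk-001.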
Parquet data columns:
| Column | Type | Description |
|---|---|---|
| action | list[float32] | Action vector for this timestep |
| observation.state | list[float32] | Robot joint positions (qpos) if available |
| timestamp | float32 | Time in seconds (frame_index / fps) |
| frame_index | int64 | Zero-indexed frame within the episode |
| episode_index | int64 | Episode index across the dataset |
| index | int64 | Global frame index across all episodes |
| task_index | int64 | Task index (maps to tasks.parquet) |
| task | string | Task description string |
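The relationship between the index columns can be shown with a short sketch. It builds only the bookkeeping columns from a list of episode lengths; `build_index_columns` is a hypothetical name, and the real tool writes these values to Parquet alongside the action and state vectors.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[int, float]]


def build_index_columns(episode_lengths: List[int], fps: int) -> List[Row]:
    """Produce timestamp/frame/episode/global index values for every frame."""
    rows: List[Row] = []
    global_index = 0  # runs across all episodes without resetting
    for episode_index, length in enumerate(episode_lengths):
        for frame_index in range(length):  # restarts at 0 for each episode
            rows.append(
                {
                    "timestamp": frame_index / fps,
                    "frame_index": frame_index,
                    "episode_index": episode_index,
                    "index": global_index,
                }
            )
            global_index += 1
    return rows
```

Note that `timestamp` and `frame_index` reset at the start of each episode, while `index` keeps counting.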
info.json schema (key fields):
{
  "codebase_version": "v3.0",
  "robot_type": "panda", # auto-detected or user-specified
  "total_episodes": 100,
  "total_frames": 15000,
  "total_tasks": 1,
  "total_videos": 100, # episodes * cameras
  "chunks_size": 1000,
  "fps": 30,
  "features": {
    "action": {"dtype": "float32", "shape": [7]},
    "observation.state": {"dtype": "float32", "shape": [9]},
    "observation.images.base_camera": {"dtype": "video", "shape": [480, 640, 3]},
    ...
  }
}
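The derived fields in this schema can be sketched in a few lines. The function below is an illustration only: `build_info` is a hypothetical name, it covers just the key fields listed above, and it assumes a single task and a (width, height) image size mapped to a [height, width, 3] video shape as in the example.

```python
import json
from typing import Any, Dict, List, Tuple


def build_info(
    total_episodes: int,
    total_frames: int,
    camera_names: List[str],
    image_size: Tuple[int, int],
    action_dim: int,
    state_dim: int,
    fps: int = 30,
    chunks_size: int = 1000,
    robot_type: str = "panda",
) -> Dict[str, Any]:
    """Assemble the key info.json fields from dataset-level counts."""
    width, height = image_size
    features: Dict[str, Any] = {
        "action": {"dtype": "float32", "shape": [action_dim]},
        "observation.state": {"dtype": "float32", "shape": [state_dim]},
    }
    for name in camera_names:
        # Video features are stored as height x width x channels.
        features[f"observation.images.{name}"] = {
            "dtype": "video",
            "shape": [height, width, 3],
        }
    return {
        "codebase_version": "v3.0",
        "robot_type": robot_type,
        "total_episodes": total_episodes,
        "total_frames": total_frames,
        "total_tasks": 1,  # assumption: one task per converted file
        "total_videos": total_episodes * len(camera_names),
        "chunks_size": chunks_size,
        "fps": fps,
        "features": features,
    }


info = build_info(100, 15000, ["base_camera"], (640, 480), action_dim=7, state_dim=9)
info_text = json.dumps(info, indent=2)  # what would be written to meta/info.json
```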
Usage Examples
Example 1: Basic conversion with default settings
python -m mani_skill.trajectory.convert_to_lerobot \
--traj-path ~/.maniskill/demos/PickCube-v1/trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
--output-dir ./lerobot_pickcube
Example 2: Custom task name and FPS
python -m mani_skill.trajectory.convert_to_lerobot \
--traj-path trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
--output-dir ./lerobot_dataset \
--task-name "Pick up the red cube and place it on the target" \
--fps 20
Example 3: Custom image size and robot type
python -m mani_skill.trajectory.convert_to_lerobot \
--traj-path trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
--output-dir ./lerobot_dataset \
--image-size 256x256 \
--robot-type panda \
--chunks-size 500
Example 4: End-to-end pipeline from download to LeRobot export
# Step 1: Download demonstrations
python -m mani_skill.utils.download_demo PickCube-v1
# Step 2: Replay with RGBD observations
python -m mani_skill.trajectory.replay_trajectory \
--traj-path ~/.maniskill/demos/PickCube-v1/trajectory.h5 \
-o rgbd \
-c pd_joint_delta_pos \
--save-traj
# Step 3: Convert to LeRobot format
python -m mani_skill.trajectory.convert_to_lerobot \
--traj-path ~/.maniskill/demos/PickCube-v1/trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
--output-dir ./lerobot_pickcube \
--task-name "Pick Cube"
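Step 3 relies on the file name that Step 2 produces. Judging only from the file names used in these examples, the replayed trajectory appears to be named after the observation mode, control mode, and simulation backend. The helper below is a hypothetical sketch of that inferred pattern, not a ManiSkill API; check the replay tool's output for the actual name.

```python
def replayed_traj_name(
    obs_mode: str,
    control_mode: str,
    sim_backend: str = "physx_cpu",
    stem: str = "trajectory",
) -> str:
    """Compose the file name pattern seen in the examples (inferred, not official)."""
    return f"{stem}.{obs_mode}.{control_mode}.{sim_backend}.h5"


# Matches the --traj-path used in Step 3 of the pipeline above.
expected = replayed_traj_name("rgbd", "pd_joint_delta_pos")
```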
Example 5: Programmatic conversion
from mani_skill.trajectory.convert_to_lerobot import main, Args
args = Args(
    traj_path="trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5",
    output_dir="./lerobot_output",
    fps=30,
    task_name="Stack Cube",
    chunks_size=1000,
    image_size="640x480",
    robot_type="panda",
)
exit_code = main(args)
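After a programmatic run, a quick sanity check of the output directory can catch a failed or partial conversion. The sketch below only tests for the top-level paths listed in the I/O Contract; `EXPECTED_PATHS` and `missing_outputs` are hypothetical names, and the check does not validate file contents.

```python
from pathlib import Path
from typing import List

# Relative paths taken from the output directory structure documented above.
EXPECTED_PATHS = (
    "data",
    "videos",
    "meta/info.json",
    "meta/stats.json",
    "meta/tasks.parquet",
    "meta/episodes",
)


def missing_outputs(output_dir: str) -> List[str]:
    """Return the expected paths that do not exist under output_dir."""
    root = Path(output_dir).expanduser()
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

An empty list means every expected path is present; for example, `missing_outputs("./lerobot_output")` could be called right after `main(args)`.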
Related Pages
- Principle:Haosulab_ManiSkill_LeRobot_Format_Export -- The principle describing LeRobot format export and cross-framework interoperability.
- Implementation:Haosulab_ManiSkill_Replay_Trajectory_CLI -- The preceding step: replaying trajectories with RGBD observations for video export.
- Implementation:Haosulab_ManiSkill_Download_Demo_CLI -- The first step: downloading raw demonstration data.
- Implementation:Haosulab_ManiSkill_ManiSkillTrajectoryDataset -- Alternative path: loading data directly for ManiSkill-native training.