
Implementation:Haosulab ManiSkill Convert To LeRobot CLI

From Leeroopedia
| Field | Value |
|---|---|
| Source Repository | haosulab/ManiSkill |
| Type | External Tool Doc |
| Domains | Imitation_Learning, Robotics, Data_Processing, Interoperability |
| Last Updated | 2026-02-15 |

Overview

Description

The Convert To LeRobot CLI is a command-line tool for converting ManiSkill HDF5 trajectory files into the LeRobot v3.0 dataset format. It reads a ManiSkill .h5 trajectory file (and its companion .json metadata file), automatically detects available RGB cameras, robot state dimensions, and action dimensions, and produces a complete LeRobot-format directory with Parquet data files, MP4 video chunks, per-episode metadata, global statistics, and a self-describing info.json.

The conversion pipeline performs the following steps:

  1. Load trajectories and metadata from the input HDF5 file
  2. Auto-detect RGB cameras (from obs/sensor_data/*/rgb), robot state (from obs/agent/qpos), and action dimensions
  3. Create the output directory structure with data chunks, video directories, and metadata directories
  4. For each episode: extract actions, robot states, and camera frames; write tabular data to Parquet; encode camera frames as MP4 videos
  5. Compute per-episode and global statistics for all numerical fields
  6. Generate metadata files: info.json (dataset schema), stats.json (global statistics), episodes/ (per-episode metadata), and tasks.parquet (task descriptions)

Usage

This tool is typically run after trajectory replay/conversion (which ensures the trajectory file contains RGB image observations) and whenever the goal is to export ManiSkill data for use in the LeRobot ecosystem.

Code Reference

Source Location

| Field | Value |
|---|---|
| Repository | haosulab/ManiSkill |
| File | mani_skill/trajectory/convert_to_lerobot.py |
| Lines (Args) | L34-56 |
| Lines (main) | L470-584 |

Signature

CLI invocation:

python -m mani_skill.trajectory.convert_to_lerobot \
    --traj-path <path_to_h5> \
    --output-dir <output_directory> \
    [--fps FPS] \
    [--task-name TASK_NAME] \
    [--chunks-size SIZE] \
    [--image-size WIDTHxHEIGHT] \
    [--robot-type TYPE]

Args dataclass (L34-56):

from dataclasses import dataclass
from typing import Optional

@dataclass
class Args:
    traj_path: str
    """Path to ManiSkill .h5 trajectory file"""

    output_dir: str
    """Output directory for LeRobot dataset"""

    fps: int = 30
    """Video FPS (default: 30)"""

    task_name: Optional[str] = None
    """Task description (default: auto-detected from metadata)"""

    chunks_size: int = 1000
    """Episodes per chunk (default: 1000)"""

    image_size: str = "640x480"
    """Output image size as WIDTHxHEIGHT or single value for square (default: 640x480)"""

    robot_type: Optional[str] = None
    """Robot type (default: auto-detected, e.g., 'panda', 'ur5')"""

Key parameters:

| Parameter | Type | Default | Description |
|---|---|---|---|
| traj_path | str | required | Path to the input ManiSkill .h5 trajectory file. A companion .json file must exist at the same location. |
| output_dir | str | required | Output directory for the LeRobot dataset. Created if it does not exist. |
| fps | int | 30 | Frames per second for video encoding. Should match the control frequency of the simulation. |
| task_name | Optional[str] | None (auto-detected) | Human-readable task description. If not provided, auto-detected from the environment ID in the trajectory metadata. |
| chunks_size | int | 1000 | Number of episodes per data chunk. Controls the granularity of data file partitioning. |
| image_size | str | "640x480" | Output image/video resolution as WIDTHxHEIGHT, or a single integer for square images. Images are resized with aspect-preserving padding. |
| robot_type | Optional[str] | None (auto-detected) | Robot type string (e.g., "panda", "ur5"). If not provided, inferred from the environment ID. |
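The image_size parameter accepts either "WIDTHxHEIGHT" or a single integer for square output, with aspect-preserving padding. A minimal sketch of that behavior, assuming letterbox-style padding; `parse_image_size` and `letterbox` are illustrative helpers, not the tool's actual internals:

```python
# Illustrative helpers (not the tool's code): parse an image-size spec,
# then resize with aspect-preserving padding (letterboxing).
import numpy as np

def parse_image_size(spec: str) -> tuple:
    """Return (width, height) from '640x480', or a square size from '256'."""
    if "x" in spec:
        w, h = spec.split("x")
        return int(w), int(h)
    side = int(spec)
    return side, side

def letterbox(frame: np.ndarray, size: tuple) -> np.ndarray:
    """Scale to fit inside `size`, pad the rest with zeros (black bars)."""
    tw, th = size
    h, w = frame.shape[:2]
    scale = min(tw / w, th / h)
    nw, nh = int(round(w * scale)), int(round(h * scale))
    # Nearest-neighbour resize via index maps (avoids an OpenCV dependency)
    ys = (np.arange(nh) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(nw) / scale).astype(int).clip(0, w - 1)
    resized = frame[ys][:, xs]
    out = np.zeros((th, tw, frame.shape[2]), dtype=frame.dtype)
    y0, x0 = (th - nh) // 2, (tw - nw) // 2
    out[y0:y0 + nh, x0:x0 + nw] = resized
    return out
```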

Import

This tool is primarily used from the command line:

python -m mani_skill.trajectory.convert_to_lerobot --traj-path trajectory.h5 --output-dir ./lerobot_dataset

For programmatic use:

from mani_skill.trajectory.convert_to_lerobot import main, Args
import tyro

args = tyro.cli(Args, args=[
    "--traj-path", "trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5",
    "--output-dir", "./lerobot_output",
    "--task-name", "Pick Cube",
])
main(args)

I/O Contract

Inputs:

| Input | Type | Description |
|---|---|---|
| traj_path | str (file path) | Path to a ManiSkill .h5 trajectory file. Should contain RGB observations (in obs/sensor_data/{camera}/rgb) for video generation. A companion .json metadata file must exist. |

Outputs (LeRobot v3.0 directory structure):

output_dir/
  data/
    chunk-000/
      file-000.parquet     # Tabular data: actions, states, timestamps, indices
    chunk-001/
      ...
  videos/
    observation.images.{camera_name}/
      chunk-000/
        file-000.mp4       # Episode 0 video for this camera
        file-001.mp4       # Episode 1 video
        ...
  meta/
    info.json              # Dataset schema, features, robot type, FPS
    stats.json             # Global statistics (mean, std, min, max)
    tasks.parquet          # Task index -> task name mapping
    episodes/
      chunk-000/
        file-000.parquet   # Per-episode metadata and statistics
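The chunked layout above implies a simple mapping from episode index to video file path. A hedged sketch, with the pattern inferred from the directory tree rather than the tool's code (`video_path` is an invented helper):

```python
# Illustrative mapping: with chunks_size episodes per chunk, episode i's
# camera video lands in chunk i // chunks_size, file i % chunks_size.
def video_path(episode_index: int, camera: str, chunks_size: int = 1000) -> str:
    chunk = episode_index // chunks_size
    file_idx = episode_index % chunks_size
    return (f"videos/observation.images.{camera}/"
            f"chunk-{chunk:03d}/file-{file_idx:03d}.mp4")
```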

Parquet data columns:

| Column | Type | Description |
|---|---|---|
| action | list[float32] | Action vector for this timestep |
| observation.state | list[float32] | Robot joint positions (qpos), if available |
| timestamp | float32 | Time in seconds (frame_index / fps) |
| frame_index | int64 | Zero-indexed frame within the episode |
| episode_index | int64 | Episode index across the dataset |
| index | int64 | Global frame index across all episodes |
| task_index | int64 | Task index (maps to tasks.parquet) |
| task | string | Task description string |
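How the per-frame columns relate can be shown with a small sketch: timestamp is frame_index / fps, and `index` offsets frame_index by the total frame count of all previous episodes. Built with pandas purely for illustration; `episode_frame` is an invented helper:

```python
# Illustrative construction of one episode's tabular rows, matching the
# column semantics in the table above.
import pandas as pd

def episode_frame(actions, states, episode_index, global_offset, fps, task, task_index=0):
    n = len(actions)
    frame_index = list(range(n))
    return pd.DataFrame({
        "action": list(actions),
        "observation.state": list(states),
        "timestamp": [i / fps for i in frame_index],   # seconds
        "frame_index": frame_index,                    # within episode
        "episode_index": [episode_index] * n,
        "index": [global_offset + i for i in frame_index],  # global
        "task_index": [task_index] * n,
        "task": [task] * n,
    })
```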

info.json schema (key fields):

{
    "codebase_version": "v3.0",
    "robot_type": "panda",         # auto-detected or user-specified
    "total_episodes": 100,
    "total_frames": 15000,
    "total_tasks": 1,
    "total_videos": 100,           # episodes * cameras
    "chunks_size": 1000,
    "fps": 30,
    "features": {
        "action": {"dtype": "float32", "shape": [7]},
        "observation.state": {"dtype": "float32", "shape": [9]},
        "observation.images.base_camera": {"dtype": "video", "shape": [480, 640, 3]},
        ...
    }
}
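The counts in info.json should be mutually consistent, e.g. total_videos = total_episodes × number of camera features. A sanity-check sketch over a loaded info dict (`check_info` is an invented helper, not part of the tool):

```python
# Illustrative consistency check over the info.json fields shown above.
def check_info(info: dict) -> None:
    video_feats = [k for k, v in info["features"].items() if v["dtype"] == "video"]
    assert info["codebase_version"] == "v3.0"
    assert info["total_videos"] == info["total_episodes"] * len(video_feats)
```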

Usage Examples

Example 1: Basic conversion with default settings

python -m mani_skill.trajectory.convert_to_lerobot \
    --traj-path ~/.maniskill/demos/PickCube-v1/trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
    --output-dir ./lerobot_pickcube

Example 2: Custom task name and FPS

python -m mani_skill.trajectory.convert_to_lerobot \
    --traj-path trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
    --output-dir ./lerobot_dataset \
    --task-name "Pick up the red cube and place it on the target" \
    --fps 20

Example 3: Custom image size and robot type

python -m mani_skill.trajectory.convert_to_lerobot \
    --traj-path trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
    --output-dir ./lerobot_dataset \
    --image-size 256x256 \
    --robot-type panda \
    --chunks-size 500

Example 4: End-to-end pipeline from download to LeRobot export

# Step 1: Download demonstrations
python -m mani_skill.utils.download_demo PickCube-v1

# Step 2: Replay with RGBD observations
python -m mani_skill.trajectory.replay_trajectory \
    --traj-path ~/.maniskill/demos/PickCube-v1/trajectory.h5 \
    -o rgbd \
    -c pd_joint_delta_pos \
    --save-traj

# Step 3: Convert to LeRobot format
python -m mani_skill.trajectory.convert_to_lerobot \
    --traj-path ~/.maniskill/demos/PickCube-v1/trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5 \
    --output-dir ./lerobot_pickcube \
    --task-name "Pick Cube"

Example 5: Programmatic conversion

from mani_skill.trajectory.convert_to_lerobot import main, Args

args = Args(
    traj_path="trajectory.rgbd.pd_joint_delta_pos.physx_cpu.h5",
    output_dir="./lerobot_output",
    fps=30,
    task_name="Stack Cube",
    chunks_size=1000,
    image_size="640x480",
    robot_type="panda",
)
exit_code = main(args)
