Implementation:Haosulab ManiSkill Merge Trajectories Func

From Leeroopedia
Field | Value
Implementation Name | Merge Trajectories Func
Type | API Doc
Domain | Motion_Planning
Source File | mani_skill/trajectory/merge_trajectory.py (L9-75)
Date | 2026-02-15
Repository | Haosulab/ManiSkill

Overview

The merge_trajectories function combines multiple HDF5 trajectory files and their companion JSON metadata files into a single consolidated dataset. This is the final step of the parallel trajectory generation pipeline and can also be used as a standalone tool for combining datasets from different recording sessions.

Description

The function iterates over the input trajectory files in the order given. For each file, it copies every episode group listed in the companion JSON metadata into the output HDF5 file and appends that episode's metadata to the merged JSON. By default, it renumbers episode IDs consecutively from 0, so the merged dataset has a contiguous ID space. Global metadata (environment info, commit info, source descriptions) is taken from the first input file that provides each key; a warning is logged when a later file holds a different value for the same key.
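The renumbering step can be illustrated without HDF5 at all. The sketch below models each input file as a plain list of episode dictionaries and mirrors the counter logic of the implementation. The function name `renumber_episodes` and the in-memory representation are assumptions made for illustration; they are not part of ManiSkill.

```python
from copy import deepcopy


def renumber_episodes(files, recompute_id=True):
    """Model of the episode-merging loop on in-memory data.

    `files` is a list of episode lists, one per input file (a stand-in for
    the "episodes" array of each companion JSON). Returns the merged list
    and the set of group names the merged HDF5 file would contain.
    """
    merged, group_names = [], set()
    cnt = 0
    for episodes in files:
        for ep in deepcopy(episodes):  # leave the caller's data untouched
            new_id = cnt if recompute_id else ep["episode_id"]
            name = f"traj_{new_id}"
            # Same guard as the real function: each name may be used once.
            assert name not in group_names, name
            group_names.add(name)
            ep["episode_id"] = new_id
            merged.append(ep)
            cnt += 1
    return merged, group_names


batch0 = [{"episode_id": 0}, {"episode_id": 1}]
batch1 = [{"episode_id": 0}, {"episode_id": 1}]

merged, names = renumber_episodes([batch0, batch1])
print([ep["episode_id"] for ep in merged])  # [0, 1, 2, 3]
```

With `recompute_id=False`, the same two inputs would trip the assertion, because both contain an episode with ID 0.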

Usage

from mani_skill.trajectory.merge_trajectory import merge_trajectories

merge_trajectories(
    output_path="demos/merged.h5",
    traj_paths=["demos/batch.0.h5", "demos/batch.1.h5"],
    recompute_id=True,
)

Code Reference

Function Signature

def merge_trajectories(
    output_path: str,
    traj_paths: list,
    recompute_id: bool = True,
) -> None:

Parameters

Parameter | Type | Default | Description
output_path | str | (required) | Path for the output HDF5 file. The merged JSON file is written to the same path with .h5 replaced by .json.
traj_paths | list | (required) | List of paths to input HDF5 trajectory files. Each must have a companion JSON file at the same path with .h5 replaced by .json.
recompute_id | bool | True | If True, renumber episode IDs consecutively starting from 0. If False, keep the original IDs; an assertion fails if two inputs share a trajectory ID.
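The companion JSON path is derived with a plain string replacement, which touches every ".h5" substring in the path and not only the final extension. The sketch below mirrors that rule and contrasts it with a suffix-only variant. Both helper names are hypothetical and exist only for this illustration.

```python
from pathlib import PurePosixPath


def companion_json_path(h5_path: str) -> str:
    """Mirror the implementation: replace every ".h5" substring with ".json"."""
    return str(h5_path).replace(".h5", ".json")


def companion_json_path_suffix_only(h5_path: str) -> str:
    """Hypothetical alternative that changes only the final extension."""
    return str(PurePosixPath(h5_path).with_suffix(".json"))


print(companion_json_path("demos/batch.0.h5"))                 # demos/batch.0.json
print(companion_json_path("runs.h5_old/traj.h5"))              # runs.json_old/traj.json
print(companion_json_path_suffix_only("runs.h5_old/traj.h5"))  # runs.h5_old/traj.json
```

In practice this means a directory name containing ".h5" changes where the function looks for, and writes, the JSON file, so it is safest to keep ".h5" out of directory names.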

Implementation (L9-75)

def merge_trajectories(output_path: str, traj_paths: list, recompute_id: bool = True):
    logger.info(f"Merging {output_path}")
    merged_h5_file = h5py.File(output_path, "w")
    merged_json_path = output_path.replace(".h5", ".json")
    merged_json_data = {"episodes": []}
    cnt = 0

    for traj_path in traj_paths:
        traj_path = str(traj_path)
        logger.info(f"Merging {traj_path}")

        with h5py.File(traj_path, "r") as h5_file:
            json_data = load_json(traj_path.replace(".h5", ".json"))

            # For keys other than episodes, keep the first data
            for key, value in json_data.items():
                if key == "episodes":
                    continue
                if key not in merged_json_data:
                    merged_json_data[key] = value
                else:
                    if merged_json_data[key] != value:
                        logger.warning(
                            f"Conflict detected for key {key} in {traj_path}"
                        )

            # Merge episodes
            for ep in json_data["episodes"]:
                episode_id = ep["episode_id"]
                traj_id = f"traj_{episode_id}"

                if recompute_id:
                    new_traj_id = f"traj_{cnt}"
                else:
                    new_traj_id = traj_id

                assert new_traj_id not in merged_h5_file, new_traj_id
                h5_file.copy(traj_id, merged_h5_file, new_traj_id)

                if recompute_id:
                    ep["episode_id"] = cnt
                merged_json_data["episodes"].append(ep)
                cnt += 1

    merged_h5_file.close()
    dump_json(merged_json_path, merged_json_data, indent=2)

CLI Interface (L78-97)

# Command-line usage:
# python -m mani_skill.trajectory.merge_trajectory \
#     -i dir1 dir2 -o output/merged.h5 -p "*.h5"

Argument | Description
-i / --input-dirs | Input directories to search for trajectory files.
-o / --output-path | Path for the merged output HDF5 file.
-p / --pattern | Glob pattern to match trajectory files (default: trajectory.h5).
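When calling merge_trajectories from Python rather than the CLI, the list of input files has to be assembled by hand. The following is a minimal sketch of one way to do that. The helper name `collect_traj_paths`, its `exclude` parameter, and the recursive search are all assumptions; this page does not show how the CLI itself resolves -i and -p.

```python
from pathlib import Path


def collect_traj_paths(input_dirs, pattern="trajectory.h5", exclude=None):
    """Hypothetical helper: find files matching `pattern` under each directory.

    The recursive search (rglob) is an assumption; adjust it if the CLI
    behaves differently in your ManiSkill version.
    """
    exclude = Path(exclude).resolve() if exclude else None
    paths = []
    for d in input_dirs:
        # Sort within each directory so the merge order is reproducible.
        for p in sorted(Path(d).rglob(pattern)):
            if exclude is not None and p.resolve() == exclude:
                continue  # never feed the output file back in as an input
            paths.append(str(p))
    return paths
```

The returned list can be passed directly as traj_paths. The `exclude` argument matters when the output file lives inside one of the input directories and matches the same pattern.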

I/O Contract

Direction | Data | Format
Input | List of HDF5 trajectory files | Each containing traj_0, traj_1, ... groups
Input | Companion JSON files | Same path as the HDF5 file with .h5 replaced by .json
Output | Merged HDF5 file | Single file with traj_<id> groups; numbered consecutively from traj_0 when recompute_id=True
Output | Merged JSON file | Combined episode metadata; episode IDs renumbered when recompute_id=True
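One quick way to confirm the output side of this contract is to check that the merged JSON carries a contiguous ID space. The sketch below uses only the standard library and relies on the "episodes" and "episode_id" keys shown in the implementation; the function name `check_contiguous_ids` is an assumption made for illustration.

```python
import json


def check_contiguous_ids(json_path):
    """Return True if episode_id values are exactly 0, 1, ..., N-1 in order."""
    with open(json_path) as f:
        episodes = json.load(f)["episodes"]
    ids = [ep["episode_id"] for ep in episodes]
    return ids == list(range(len(episodes)))
```

This check is expected to pass for a merge run with recompute_id=True. With recompute_id=False it passes only if the original IDs already happened to be contiguous and in order.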

Merge Behavior

JSON Key | Merge Strategy
episodes | Concatenated from all input files; IDs renumbered if recompute_id=True
env_info | Kept from the first file; warnings logged for conflicts
commit_info | Kept from the first file; warnings logged for conflicts
source_type | Kept from the first file; warnings logged for conflicts
source_desc | Kept from the first file; warnings logged for conflicts
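The first-value-wins strategy for the non-episode keys can be sketched on plain dictionaries. The example below follows the same branching as the implementation but uses the standard logging module and also returns the conflicts it found. The function name `merge_metadata`, the returned conflict list, and the sample metadata values are assumptions for illustration.

```python
import logging

logger = logging.getLogger("merge_sketch")


def merge_metadata(json_docs):
    """Keep the first value seen for every non-episode key; warn on conflicts."""
    merged, conflicts = {}, []
    for index, doc in enumerate(json_docs):
        for key, value in doc.items():
            if key == "episodes":
                continue  # episodes are concatenated separately
            if key not in merged:
                merged[key] = value
            elif merged[key] != value:
                conflicts.append((key, index))
                logger.warning("Conflict detected for key %s in input %d", key, index)
    return merged, conflicts


# Made-up metadata values, used only to trigger one conflict.
first = {"env_info": {"env_id": "PickCube-v1"}, "source_type": "motionplanning", "episodes": []}
second = {"env_info": {"env_id": "PickCube-v1"}, "source_type": "teleoperation", "episodes": []}

merged, conflicts = merge_metadata([first, second])
print(merged["source_type"])  # motionplanning
print(conflicts)              # [('source_type', 1)]
```

Because a conflict only produces a warning, the merged file silently keeps the first file's value. When combining datasets from different recording sessions, it is worth reading the log output before trusting the merged metadata.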

Usage Examples

# Merge trajectory files from a parallel generation run
from mani_skill.trajectory.merge_trajectory import merge_trajectories

traj_files = [
    "demos/PickCube-v1/motionplanning/20260215.0.h5",
    "demos/PickCube-v1/motionplanning/20260215.1.h5",
    "demos/PickCube-v1/motionplanning/20260215.2.h5",
    "demos/PickCube-v1/motionplanning/20260215.3.h5",
]
merge_trajectories(
    output_path="demos/PickCube-v1/motionplanning/20260215.h5",
    traj_paths=traj_files,
    recompute_id=True,
)
# Result: a single 20260215.h5 containing traj_0 through traj_{N-1},
# where N is the total number of episodes across the four input files

# CLI usage: merge all trajectory files found in several directories
python -m mani_skill.trajectory.merge_trajectory \
    -i demos/run1 demos/run2 demos/run3 \
    -o demos/all_merged/trajectory.h5 \
    -p "*.h5"
