| Field | Value |
|---|---|
| Implementation Name | Merge Trajectories Func |
| Type | API Doc |
| Domain | Motion_Planning |
| Source File | mani_skill/trajectory/merge_trajectory.py (L9-75) |
| Date | 2026-02-15 |
| Repository | Haosulab/ManiSkill |
Overview
The merge_trajectories function combines multiple HDF5 trajectory files and their companion JSON metadata files into a single consolidated dataset. This is the final step of the parallel trajectory generation pipeline and can also be used as a standalone tool for combining datasets from different recording sessions.
Description
The function iterates through each input trajectory file, copies all HDF5 episode groups to the output file, and merges the JSON episode metadata. By default, it renumbers episode IDs consecutively to produce a contiguous ID space. It also preserves global metadata (environment info, commit info, source descriptions) from the first input file, logging warnings if subsequent files have conflicting values.
Usage
from mani_skill.trajectory.merge_trajectory import merge_trajectories

merge_trajectories(
    output_path="demos/merged.h5",
    traj_paths=["demos/batch.0.h5", "demos/batch.1.h5"],
    recompute_id=True,
)
Code Reference
Function Signature
def merge_trajectories(
    output_path: str,
    traj_paths: list,
    recompute_id: bool = True,
) -> None:
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| output_path | str | (required) | Path for the output HDF5 file. The merged JSON file is written alongside it, with .h5 replaced by .json. |
| traj_paths | list | (required) | List of paths to input HDF5 trajectory files. Each must have a companion .json file. |
| recompute_id | bool | True | If True, renumber episode IDs consecutively starting from 0. If False, keep the original IDs; an assertion fails if two episodes map to the same traj_N group name. |
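The effect of recompute_id on group names can be illustrated without touching any HDF5 files. The sketch below is a simplified model, not the library code: plan_group_names is a hypothetical helper that takes the episode IDs found in each input file and returns the group names a merge would write, raising an error at the point where the real function uses an assertion.

```python
from typing import List


def plan_group_names(
    episode_ids_per_file: List[List[int]], recompute_id: bool = True
) -> List[str]:
    """Hypothetical helper: model which traj_<id> names a merge would write."""
    names: List[str] = []
    seen = set()
    cnt = 0  # running counter shared across all input files
    for episode_ids in episode_ids_per_file:
        for episode_id in episode_ids:
            new_name = f"traj_{cnt}" if recompute_id else f"traj_{episode_id}"
            if new_name in seen:
                # merge_trajectories guards this case with an assert
                raise ValueError(f"duplicate group name: {new_name}")
            seen.add(new_name)
            names.append(new_name)
            cnt += 1
    return names


# Two batches that each number their episodes from 0 (illustrative input).
batches = [[0, 1], [0, 1, 2]]
print(plan_group_names(batches, recompute_id=True))
```

With recompute_id=False the same input raises, because traj_0 appears in both batches. Keeping the original IDs is only safe when the inputs already use disjoint ID ranges.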
Implementation (L9-75)
def merge_trajectories(output_path: str, traj_paths: list, recompute_id: bool = True):
    logger.info(f"Merging {output_path}")
    merged_h5_file = h5py.File(output_path, "w")
    merged_json_path = output_path.replace(".h5", ".json")
    merged_json_data = {"episodes": []}
    cnt = 0
    for traj_path in traj_paths:
        traj_path = str(traj_path)
        logger.info(f"Merging{traj_path}")
        with h5py.File(traj_path, "r") as h5_file:
            json_data = load_json(traj_path.replace(".h5", ".json"))
            # For keys other than episodes, keep the first data
            for key, value in json_data.items():
                if key == "episodes":
                    continue
                if key not in merged_json_data:
                    merged_json_data[key] = value
                else:
                    if merged_json_data[key] != value:
                        logger.warning(
                            f"Conflict detected for key {key} in {traj_path}"
                        )
            # Merge episodes
            for ep in json_data["episodes"]:
                episode_id = ep["episode_id"]
                traj_id = f"traj_{episode_id}"
                if recompute_id:
                    new_traj_id = f"traj_{cnt}"
                else:
                    new_traj_id = traj_id
                assert new_traj_id not in merged_h5_file, new_traj_id
                h5_file.copy(traj_id, merged_h5_file, new_traj_id)
                if recompute_id:
                    ep["episode_id"] = cnt
                merged_json_data["episodes"].append(ep)
                cnt += 1
    merged_h5_file.close()
    dump_json(merged_json_path, merged_json_data, indent=2)
CLI Interface (L78-97)
# Command-line usage:
# python -m mani_skill.trajectory.merge_trajectory \
#     -i dir1 dir2 -o output/merged.h5 -p "*.h5"
| Argument | Description |
|---|---|
| -i / --input-dirs | Input directories to search for trajectory files. |
| -o / --output-path | Path for the merged output HDF5 file. |
| -p / --pattern | Glob pattern to match trajectory files (default: trajectory.h5). |
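How the CLI turns -i and -p into a list of files is not shown in this section. One plausible approach is sketched below with pathlib, under two assumptions: each input directory is searched recursively, and matches are sorted so the merge order is reproducible. collect_traj_paths is a hypothetical name and not part of ManiSkill.

```python
from pathlib import Path
from typing import Iterable, List


def collect_traj_paths(
    input_dirs: Iterable[str], pattern: str = "trajectory.h5"
) -> List[str]:
    """Hypothetical helper: gather files matching `pattern` under each directory."""
    traj_paths: List[str] = []
    for input_dir in input_dirs:
        # rglob searches the directory and all of its subdirectories
        # (an assumption about the desired behavior); sorting keeps the
        # order stable between runs.
        matches = sorted(Path(input_dir).rglob(pattern))
        traj_paths.extend(str(p) for p in matches if p.is_file())
    return traj_paths
```

The returned list can be passed directly as traj_paths. With a broad pattern such as "*.h5", writing the output file outside the input directories avoids picking up a previous merge result as an input.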
I/O Contract
| Direction | Data | Format |
|---|---|---|
| Input | List of HDF5 trajectory files | Each containing traj_0, traj_1, ... groups |
| Input | Companion JSON files | Same path as the HDF5 file, with .h5 replaced by .json |
| Output | Merged HDF5 file | Single file with traj_0, traj_1, ... groups, numbered consecutively when recompute_id=True |
| Output | Merged JSON file | Combined episode metadata; episode IDs renumbered when recompute_id=True |
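The implementation derives each companion path with traj_path.replace(".h5", ".json"), which replaces every occurrence of ".h5" in the string, including one inside a directory name. A pre-flight check along the following lines can catch missing companions before any output is written. This is a sketch: companion_json_path and find_missing_companions are hypothetical helpers, and they swap only the final suffix, which gives the same result as the library for ordinary paths where ".h5" appears once, at the end.

```python
from pathlib import Path
from typing import Iterable, List


def companion_json_path(traj_path: str) -> Path:
    """Hypothetical helper: swap only the final suffix for .json."""
    return Path(traj_path).with_suffix(".json")


def find_missing_companions(traj_paths: Iterable[str]) -> List[str]:
    """Return inputs whose HDF5 file or companion JSON file does not exist."""
    missing: List[str] = []
    for traj_path in traj_paths:
        h5_ok = Path(traj_path).is_file()
        json_ok = companion_json_path(traj_path).is_file()
        if not (h5_ok and json_ok):
            missing.append(str(traj_path))
    return missing
```

An empty return value means every input has both files in place. Any path it reports should be fixed or dropped before calling merge_trajectories.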
Merge Behavior
| JSON Key | Merge Strategy |
|---|---|
| episodes | Concatenated from all input files; IDs renumbered if recompute_id=True |
| env_info | Kept from the first file; warnings logged for conflicts |
| commit_info | Kept from the first file; warnings logged for conflicts |
| source_type | Kept from the first file; warnings logged for conflicts |
| source_desc | Kept from the first file; warnings logged for conflicts |
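In the implementation, the first-file-wins rule applies to every top-level key other than episodes, not only the four keys listed. A minimal standalone sketch of that rule follows. merge_metadata is a hypothetical helper that returns the conflicting keys instead of logging them, so the outcome can be inspected directly; the field values are made up for illustration.

```python
from typing import Any, Dict, List, Tuple


def merge_metadata(
    json_docs: List[Dict[str, Any]],
) -> Tuple[Dict[str, Any], List[Tuple[int, str]]]:
    """Hypothetical helper: keep the first value seen for each non-episode key."""
    merged: Dict[str, Any] = {}
    conflicts: List[Tuple[int, str]] = []  # (index of input document, key)
    for index, doc in enumerate(json_docs):
        for key, value in doc.items():
            if key == "episodes":
                continue  # episodes are concatenated separately
            if key not in merged:
                merged[key] = value
            elif merged[key] != value:
                conflicts.append((index, key))
    return merged, conflicts


# Illustrative documents; these values are invented for the example.
first = {"env_info": {"env_id": "PickCube-v1"}, "source_type": "motionplanning", "episodes": []}
second = {"env_info": {"env_id": "PickCube-v1"}, "source_type": "teleoperation", "episodes": []}
merged, conflicts = merge_metadata([first, second])
print(merged["source_type"], conflicts)
```

Here the second document's source_type is reported as a conflict and the first value is kept. A conflict in the real function only produces a warning, so the merge still completes; checking the log afterwards is the way to spot datasets that were recorded under different settings.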
Usage Examples
# Merge trajectory files from a parallel generation run
from mani_skill.trajectory.merge_trajectory import merge_trajectories

traj_files = [
    "demos/PickCube-v1/motionplanning/20260215.0.h5",
    "demos/PickCube-v1/motionplanning/20260215.1.h5",
    "demos/PickCube-v1/motionplanning/20260215.2.h5",
    "demos/PickCube-v1/motionplanning/20260215.3.h5",
]
merge_trajectories(
    output_path="demos/PickCube-v1/motionplanning/20260215.h5",
    traj_paths=traj_files,
    recompute_id=True,
)
# Result: single 20260215.h5 with traj_0 through traj_N

# CLI usage to merge all trajectory files from multiple directories
python -m mani_skill.trajectory.merge_trajectory \
    -i demos/run1 demos/run2 demos/run3 \
    -o demos/all_merged/trajectory.h5 \
    -p "*.h5"
Related Pages