Implementation:ARISE Initiative Robomimic Split train val from hdf5
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Data_Pipeline, Data_Splitting |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
Concrete tool, provided by the robomimic split_train_val script, for splitting HDF5 demonstration datasets into training and validation subsets that are stored as filter keys.
Description
The split_train_val_from_hdf5 function creates randomized train/valid splits at the demonstration level. It reads available demonstrations (from the full dataset or a specified filter key subset), randomly assigns them to train and valid sets based on the specified ratio, and writes the resulting demo lists as filter keys in the HDF5 mask/ group via create_hdf5_filter_key.
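The demonstration-level split described above can be sketched in plain Python. This is a minimal illustration, not the robomimic source: the helper name split_demo_keys, the seed parameter, and the choice to round the validation count down are assumptions made for the example.

```python
import random


def split_demo_keys(demo_keys, val_ratio=0.1, seed=None):
    """Randomly partition demo keys into (train_keys, valid_keys).

    Hypothetical helper for illustration; it only mimics the behavior
    described in the text and does not touch any HDF5 file.
    """
    keys = list(demo_keys)
    rng = random.Random(seed)

    # Assumption: the number of validation demos is rounded down.
    num_val = int(val_ratio * len(keys))

    # Pick the validation demos at random, without replacement.
    valid_set = set(rng.sample(keys, num_val))

    # Keep the original ordering inside each subset.
    train_keys = [k for k in keys if k not in valid_set]
    valid_keys = [k for k in keys if k in valid_set]
    return train_keys, valid_keys


demos = ["demo_{}".format(i) for i in range(20)]
train_keys, valid_keys = split_demo_keys(demos, val_ratio=0.25, seed=0)
# 20 demos at a 0.25 ratio: 5 validation demos and 15 training demos
```

Because the split operates on whole demonstrations rather than individual transitions, no trajectory is shared between the two subsets.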
Usage
Run as a CLI script after observation extraction. The resulting "train" and "valid" (or "{filter_key}_train" / "{filter_key}_valid") keys are used by the training pipeline's dataset loading step.
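The naming rule for the resulting keys can be written down as a small helper. The function name split_filter_key_names is hypothetical and exists only for this sketch; the names it returns follow the convention stated above.

```python
def split_filter_key_names(filter_key=None):
    """Return the (train, valid) filter key names for a given source key.

    Hypothetical helper: with no source filter key the split is stored
    as "train" / "valid"; otherwise the source key is used as a prefix.
    """
    if filter_key is None:
        return "train", "valid"
    return "{}_train".format(filter_key), "{}_valid".format(filter_key)


train_name, valid_name = split_filter_key_names("50_demos")
# train_name == "50_demos_train", valid_name == "50_demos_valid"
```

These are the names that would then be passed to the dataset loading step to select each subset.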
Code Reference
Source Location
- Repository: robomimic
- File: robomimic/scripts/split_train_val.py
- Lines: L25-76
Signature
def split_train_val_from_hdf5(hdf5_path, val_ratio=0.1, filter_key=None):
    """
    Splits data into training set and validation set from HDF5 file.

    Args:
        hdf5_path (str): path to the hdf5 file to load the transitions from
        val_ratio (float): ratio of validation demonstrations to all demonstrations
        filter_key (str): if provided, split the subset of demonstration keys stored
            under mask/@filter_key instead of the full set of demonstrations
    """
Import
from robomimic.scripts.split_train_val import split_train_val_from_hdf5
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| hdf5_path | str | Yes | Path to HDF5 dataset file |
| val_ratio | float | No | Fraction of demos for validation. Default: 0.1 |
| filter_key | str | No | If provided, split only the demos listed in mask/{filter_key} instead of the full dataset. Default: None |
Outputs
| Name | Type | Description |
|---|---|---|
| (side effect) | None | Writes "train" and "valid" filter keys (or "{filter_key}_train" / "{filter_key}_valid") into HDF5 mask/ group |
Usage Examples
CLI Usage
# Split with default 10% validation
python robomimic/scripts/split_train_val.py --dataset /path/to/low_dim_v141.hdf5
# Split with 20% validation
python robomimic/scripts/split_train_val.py --dataset /path/to/low_dim_v141.hdf5 --ratio 0.2
# Split a subset
python robomimic/scripts/split_train_val.py --dataset /path/to/low_dim_v141.hdf5 --filter_key 50_demos
Programmatic Usage
from robomimic.scripts.split_train_val import split_train_val_from_hdf5
split_train_val_from_hdf5(
    hdf5_path="/path/to/low_dim_v141.hdf5",
    val_ratio=0.1,
)
# Creates mask/train and mask/valid in the HDF5 file
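After the split has been written, it can be useful to read the two keys back and confirm that they do not overlap. The sketch below assumes that each filter key is stored as an array of byte strings under mask/, and that h5py is installed; the helper names decode_demo_keys, read_split, and split_is_disjoint are hypothetical.

```python
def decode_demo_keys(raw_keys):
    """Convert stored keys (bytes or str) into plain Python strings."""
    return [k.decode("utf-8") if isinstance(k, bytes) else str(k) for k in raw_keys]


def read_split(hdf5_path, train_key="train", valid_key="valid"):
    """Read both filter keys from the mask/ group of an HDF5 dataset.

    Assumption: each key is a dataset of demo names under mask/<key>.
    """
    import h5py  # imported lazily so the other helpers work without it

    with h5py.File(hdf5_path, "r") as f:
        train = decode_demo_keys(f["mask/{}".format(train_key)][:])
        valid = decode_demo_keys(f["mask/{}".format(valid_key)][:])
    return train, valid


def split_is_disjoint(train, valid):
    """Return True when no demonstration appears in both subsets."""
    return not (set(train) & set(valid))
```

A typical check would call read_split on the dataset path used above and assert that split_is_disjoint returns True for the two lists.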