Implementation:ARISE Initiative Robomimic FileUtils download file from hf
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Data_Pipeline, Data_Acquisition |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
A concrete tool, provided by the robomimic file utilities module, for downloading demonstration dataset files from the HuggingFace Hub.
Description
The download_file_from_hf function downloads an HDF5 dataset file from a HuggingFace repository. It first downloads into a temporary directory, because hf_hub_download returns a path inside its cache directory (typically a symlink to the stored file) rather than writing to the target location. It then moves the resolved file into the target directory. When check_overwrite is enabled, it prompts the user before replacing an existing file.
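The flow described above can be sketched roughly as follows. This is a minimal illustration, not robomimic's actual source: the name download_from_hub_sketch, the fetch parameter (a hook that lets the sketch run offline with a stub), the prompt wording, the repo_type="dataset" argument, and the returned path are all assumptions made for the example.

```python
import os
import shutil
import tempfile


def download_from_hub_sketch(repo_id, filename, download_dir,
                             check_overwrite=True, fetch=None):
    """Hypothetical re-creation of the described flow; not robomimic's code."""
    if fetch is None:
        # Imported lazily so the sketch can be exercised offline with a stub.
        from huggingface_hub import hf_hub_download

        def fetch(cache_dir):
            return hf_hub_download(
                repo_id=repo_id,
                filename=filename,
                repo_type="dataset",  # assumption: the repo is a dataset repo
                cache_dir=cache_dir,
            )

    target = os.path.join(download_dir, os.path.basename(filename))

    # Overwrite check: ask before replacing a file that is already present.
    if check_overwrite and os.path.exists(target):
        answer = input(f"{target} already exists. Overwrite? (y/n) ")
        if answer.strip().lower() != "y":
            return target  # keep the existing file untouched

    os.makedirs(download_dir, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp_dir:
        cached_path = fetch(tmp_dir)
        # The returned path may be a symlink into the cache; resolve it
        # so the real file is moved, not the link.
        resolved = os.path.realpath(cached_path)
        shutil.move(resolved, target)

    # Returning the path is a convenience of this sketch; the documented
    # function is described as returning None.
    return target
```

Using a temporary cache directory means nothing is left behind in a shared cache once the file has been moved, at the cost of re-downloading on every call.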
For real-world datasets not on HuggingFace, the companion function download_url (at file_utils.py:L520-553) provides direct URL download with a progress bar.
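For comparison, a direct URL download with progress reporting might look like the sketch below. It is a standard-library stand-in and does not reproduce download_url itself: the function name download_url_sketch, the plain-text percentage output, and the way the file name is derived from the URL are assumptions for illustration.

```python
import os
import sys
import urllib.request


def download_url_sketch(url, download_dir):
    """Stdlib-only illustration of a URL download with text progress."""
    os.makedirs(download_dir, exist_ok=True)
    # Assumption: the last path component of the URL is a usable file name.
    target = os.path.join(download_dir, os.path.basename(url))

    def report(block_num, block_size, total_size):
        # total_size is not positive when the server reports no length.
        if total_size > 0:
            done = min(block_num * block_size, total_size)
            sys.stderr.write(f"\r{100 * done // total_size}% of {total_size} bytes")

    urllib.request.urlretrieve(url, target, reporthook=report)
    sys.stderr.write("\n")
    return target
```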
Usage
Call this function when downloading benchmark datasets from HuggingFace. It is used by the download_datasets.py CLI script, which looks up the matching DATASET_REGISTRY entry to get the correct repo_id and filename.
Code Reference
Source Location
- Repository: robomimic
- File: robomimic/utils/file_utils.py
- Lines: L555-580
Signature
def download_file_from_hf(repo_id, filename, download_dir, check_overwrite=True):
    """
    Downloads a file from Hugging Face.

    Args:
        repo_id (str): Hugging Face repo ID (e.g., "robomimic/robomimic_datasets")
        filename (str): path to file in repo (e.g., "lift/ph/low_dim_v141.hdf5")
        download_dir (str): path to directory where file should be downloaded
        check_overwrite (bool): if True, prompt before overwriting existing files
    """
Import
import robomimic.utils.file_utils as FileUtils

# Call as:
FileUtils.download_file_from_hf(
    repo_id="robomimic/robomimic_datasets",
    filename="lift/ph/low_dim_v141.hdf5",
    download_dir="/path/to/datasets",
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| repo_id | str | Yes | HuggingFace repository ID (e.g., "robomimic/robomimic_datasets"); the signature defines no default |
| filename | str | Yes | Path to file within the HuggingFace repo |
| download_dir | str | Yes | Local directory for the downloaded file |
| check_overwrite | bool | No | Prompt before overwriting. Default: True |
Outputs
| Name | Type | Description |
|---|---|---|
| (side effect) | None | Downloads HDF5 file to download_dir/basename(filename) |
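Because the function is documented as returning None, a caller that needs the downloaded file's location has to derive it from the inputs. A small sketch of that derivation, following the download_dir/basename(filename) rule from the table above; the helper name expected_destination is hypothetical:

```python
import os


def expected_destination(filename, download_dir):
    """Path implied by the Outputs table: download_dir/basename(filename)."""
    return os.path.join(download_dir, os.path.basename(filename))


# The repo-relative sub-directories ("lift/ph/") are dropped; only the
# final file name is kept under download_dir.
dest = expected_destination("lift/ph/low_dim_v141.hdf5", "/path/to/datasets")
print(dest)
```

One consequence of this rule is that two repo files sharing a base name would map to the same local path unless different download_dir values are used.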
Usage Examples
Download via CLI Script
# Download low-dim Lift dataset (proficient-human)
python robomimic/scripts/download_datasets.py --tasks lift --dataset_types ph --hdf5_types low_dim
# Download image Square dataset (multi-human)
python robomimic/scripts/download_datasets.py --tasks square --dataset_types mh --hdf5_types image
Programmatic Download
import robomimic
import robomimic.utils.file_utils as FileUtils

# Get download info from registry
task = "lift"
dataset_type = "ph"
hdf5_type = "low_dim"
ds_info = robomimic.DATASET_REGISTRY[task][dataset_type][hdf5_type]

# Download from HuggingFace
FileUtils.download_file_from_hf(
    repo_id=robomimic.HF_REPO_ID,
    filename=ds_info["url"],
    download_dir="./datasets/lift/ph/",
    check_overwrite=True,
)