Implementation:ARISE Initiative Robomimic FileUtils download file from hf

Knowledge Sources
Domains Robotics, Data_Pipeline, Data_Acquisition
Last Updated 2026-02-15 08:00 GMT

Overview

A concrete tool, provided by the robomimic file utilities module, for downloading demonstration dataset files from the HuggingFace Hub.

Description

The download_file_from_hf function downloads an HDF5 dataset file from a HuggingFace repository. Because hf_hub_download returns a path into its cache (a pointer to the stored file rather than a file at the requested location), the function first downloads into a temporary directory, then resolves that path and moves the actual file into the target directory. When check_overwrite is enabled and the target file already exists, it prompts the user before replacing it.
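The flow described above can be sketched roughly as follows. This is a minimal illustration, not the robomimic source: download_via_tempdir, make_hf_fetch, and the fetch and confirm parameters are hypothetical names introduced here so the pattern can be run without network access. The repo_type="dataset" argument is an assumption about how the repository is hosted, and returning None when the user declines is a choice made for this sketch.

```python
import os
import shutil
import tempfile


def download_via_tempdir(fetch, filename, download_dir,
                         check_overwrite=True, confirm=input):
    """Hypothetical sketch: fetch into a temp dir, then move the real file.

    fetch(filename, cache_dir) must return a path to the fetched file,
    which may be a symlink into cache_dir (as with a hub cache).
    """
    target = os.path.join(download_dir, os.path.basename(filename))

    # Ask before replacing an existing file.
    if check_overwrite and os.path.exists(target):
        answer = confirm(f"File {target} already exists. Overwrite? y/n\n")
        if answer.strip().lower() not in {"y", "yes"}:
            return None  # user declined; existing file is left untouched

    os.makedirs(download_dir, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp_dir:
        fetched = fetch(filename, tmp_dir)
        # The returned path may be a pointer (symlink); resolve it to the
        # actual file before moving, otherwise only the link would move.
        resolved = os.path.realpath(fetched)
        if os.path.exists(target):
            os.remove(target)
        shutil.move(resolved, target)
    return target


def make_hf_fetch(repo_id):
    """Hypothetical adapter that plugs hf_hub_download into the sketch."""
    from huggingface_hub import hf_hub_download  # imported lazily

    def fetch(filename, cache_dir):
        return hf_hub_download(
            repo_id=repo_id,
            filename=filename,
            repo_type="dataset",  # assumption: files live in a dataset repo
            cache_dir=cache_dir,
        )

    return fetch
```

With network access and huggingface_hub installed, download_via_tempdir(make_hf_fetch("robomimic/robomimic_datasets"), "lift/ph/low_dim_v141.hdf5", "/path/to/datasets") would exercise the same pattern against the Hub.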

For real-world datasets not on HuggingFace, the companion function download_url (at file_utils.py:L520-553) provides direct URL download with a progress bar.

Usage

Call this function when downloading benchmark datasets from HuggingFace. It is used by the download_datasets.py CLI script, which looks up the requested dataset in DATASET_REGISTRY to obtain the correct repo_id and filename.

Code Reference

Source Location

  • Repository: robomimic
  • File: robomimic/utils/file_utils.py
  • Lines: L555-580

Signature

def download_file_from_hf(repo_id, filename, download_dir, check_overwrite=True):
    """
    Downloads a file from Hugging Face.

    Args:
        repo_id (str): Hugging Face repo ID (e.g., "robomimic/robomimic_datasets")
        filename (str): path to file in repo (e.g., "lift/ph/low_dim_v141.hdf5")
        download_dir (str): path to directory where file should be downloaded
        check_overwrite (bool): if True, prompt before overwriting existing files
    """

Import

import robomimic.utils.file_utils as FileUtils

# Call as:
FileUtils.download_file_from_hf(
    repo_id="robomimic/robomimic_datasets",
    filename="lift/ph/low_dim_v141.hdf5",
    download_dir="/path/to/datasets",
)

I/O Contract

Inputs

Name Type Required Description
repo_id str Yes HuggingFace repository ID (e.g., "robomimic/robomimic_datasets")
filename str Yes Path to file within the HuggingFace repo
download_dir str Yes Local directory for the downloaded file
check_overwrite bool No Prompt before overwriting. Default: True

Outputs

Name Type Description
(side effect) None Downloads HDF5 file to download_dir/basename(filename)
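To make the output location concrete, the snippet below computes the path implied by download_dir/basename(filename). The helper name expected_output_path is hypothetical and is not part of robomimic. One consequence worth noting is that the directory portion of the in-repo path (for example lift/ph/) is not recreated under download_dir.

```python
import os


def expected_output_path(download_dir, filename):
    """Hypothetical helper: where the file should land after the download."""
    # Only the final path component of the in-repo filename is kept.
    return os.path.join(download_dir, os.path.basename(filename))


path = expected_output_path("/path/to/datasets", "lift/ph/low_dim_v141.hdf5")
print(os.path.basename(path))  # low_dim_v141.hdf5
```

If a per-task layout on disk is wanted, it has to be encoded in download_dir itself, as in the programmatic example that passes "./datasets/lift/ph/".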

Usage Examples

Download via CLI Script

# Download low-dim Lift dataset (proficient-human)
python robomimic/scripts/download_datasets.py --task lift --dataset_type ph --hdf5_type low_dim

# Download image Square dataset (multi-human)
python robomimic/scripts/download_datasets.py --task square --dataset_type mh --hdf5_type image

Programmatic Download

import robomimic.utils.file_utils as FileUtils
import robomimic

# Get download info from registry
task = "lift"
dataset_type = "ph"
hdf5_type = "low_dim"
ds_info = robomimic.DATASET_REGISTRY[task][dataset_type][hdf5_type]

# Download from HuggingFace
FileUtils.download_file_from_hf(
    repo_id=robomimic.HF_REPO_ID,
    filename=ds_info["url"],  # the registry's "url" entry is passed as the in-repo file path
    download_dir="./datasets/lift/ph/",
    check_overwrite=True,
)

Related Pages

Implements Principle

Requires Environment
