
Heuristic: ARISE Initiative Robomimic HDF5 Cache Mode Selection

From Leeroopedia
Knowledge Sources
Domains: Optimization, Data_Management
Last Updated: 2026-02-15 07:30 GMT

Overview

Data loading optimization via HDF5 caching mode selection: use `"all"` for low-dim datasets, `"low_dim"` for image datasets that exceed available RAM, and never use `None`.

Description

Robomimic's `SequenceDataset` supports three HDF5 caching modes that control how much data is preloaded into RAM. The `"all"` mode loads the entire HDF5 file into memory at startup, providing the fastest data loading during training. The `"low_dim"` mode caches only non-image observation data (state vectors, actions, rewards) while reading images from disk per batch. The `None` mode uses pure file I/O for every sample. The codebase explicitly warns that `None` should almost never be used, even for very large image datasets, because the I/O overhead severely bottlenecks training.

Usage

Apply this heuristic when configuring `config.train.hdf5_cache_mode` for any training run. The right choice depends on available system RAM relative to dataset size. The default is `"all"`, which is correct for low-dimensional observation datasets; switch to `"low_dim"` for image datasets that exceed available RAM.
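The decision rule above can be sketched as a small helper. This is a hypothetical function, not part of robomimic, and the 2x RAM headroom factor is an illustrative assumption:

```python
def choose_cache_mode(dataset_bytes, has_images, available_ram_bytes, headroom=2.0):
    """Pick an hdf5_cache_mode per the rule of thumb.

    Hypothetical helper for illustration; robomimic does not ship this
    function, and the headroom factor is an assumed safety margin.
    """
    if dataset_bytes * headroom <= available_ram_bytes:
        return "all"       # whole file fits comfortably in RAM: fastest
    if has_images:
        return "low_dim"   # cache states/actions/rewards, stream images
    return "all"           # low-dim datasets are small; prefer full caching
    # Note: None is deliberately never returned -- per-sample file I/O
    # bottlenecks training even on very large datasets.
```

For example, a 500 MiB low-dim dataset on a 32 GiB machine yields `"all"`, while a 40 GiB image dataset on the same machine yields `"low_dim"`.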

The Insight (Rule of Thumb)

  • Action: Set `config.train.hdf5_cache_mode` appropriately based on dataset type.
  • Value:
    • `"all"` — For low-dim datasets or when RAM >> dataset size (fastest).
    • `"low_dim"` — For image datasets that don't fit in RAM (caches state data, reads images from disk).
    • `None` — Never use this. Even for the largest datasets, caching low-dim data provides significant speedup.
  • Trade-off: `"all"` uses more RAM but provides maximum training throughput. `"low_dim"` saves RAM at the cost of I/O for image batches.
  • Companion setting: Enable `hdf5_use_swmr=True` (default) for safe multi-worker parallel access to the same HDF5 file.
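In code, both settings live under `config.train`. A minimal configuration fragment, assuming robomimic is installed and using its `config_factory` entry point (the `"bc"` algorithm name is just an example):

```python
from robomimic.config import config_factory

# Build a default config for an example algorithm ("bc" here),
# then apply the heuristic for a large image dataset.
config = config_factory(algo_name="bc")
config.train.hdf5_cache_mode = "low_dim"  # image dataset larger than RAM
config.train.hdf5_use_swmr = True         # default; safe multi-worker reads
```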

Reasoning

HDF5 file I/O introduces significant latency per batch when data workers must seek and read from disk. For low-dimensional datasets (typically < 1GB), the entire file fits in RAM with minimal overhead. For image datasets (potentially 10-50GB), caching state vectors (a few MB) while streaming images from SSD provides the best RAM-to-throughput ratio. The framework enforces SWMR (Single-Writer-Multiple-Reader) mode by default to prevent file locking issues when multiple `DataLoader` workers access the same HDF5 file.
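A back-of-envelope sizing makes the RAM gap concrete. The numbers below are illustrative assumptions (a hypothetical dataset of 200 demos x 500 steps, one uint8 84x84x3 camera frame and a 50-dim float32 state per step), not measurements from a real dataset:

```python
# Illustrative sizing for a hypothetical dataset: 200 demos x 500 steps.
n_steps = 200 * 500               # 100,000 transitions
img_bytes = 84 * 84 * 3           # one uint8 camera frame, ~21 KB
low_dim_bytes = 50 * 4            # 50 float32 state dims, 200 B

ram_all = n_steps * (img_bytes + low_dim_bytes)   # "all": cache everything
ram_low_dim = n_steps * low_dim_bytes             # "low_dim": states only

print(f'"all":     {ram_all / 2**20:.0f} MiB')    # ~2 GiB
print(f'"low_dim": {ram_low_dim / 2**20:.0f} MiB')  # ~19 MiB
```

Even in this small example, `"low_dim"` caching needs roughly two orders of magnitude less RAM than `"all"`, while still avoiding per-sample disk reads for everything except images.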

From `robomimic/config/base_config.py:166-170`:

# One of ["all", "low_dim", or None]. Set to "all" to cache entire hdf5 in memory - this is
# by far the fastest for data loading. Set to "low_dim" to cache all non-image data. Set
# to None to use no caching - in this case, every batch sample is retrieved via file i/o.
# You should almost never set this to None, even for large image datasets.
self.train.hdf5_cache_mode = "all"
