Implementation:ARISE Initiative Robomimic Dataset states to obs
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Data_Pipeline, Simulation |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
Concrete tool, provided by the robomimic scripts module, for extracting observations from raw simulation-state datasets by replaying each trajectory through the simulator environment.
Description
The dataset_states_to_obs function reads a raw HDF5 dataset containing simulation states and actions, creates an environment from the dataset's metadata, and replays each demonstration trajectory through the simulator to extract observations. It produces a new HDF5 file with obs, next_obs, rewards, dones, and actions for each demonstration.
Internally, it delegates per-trajectory extraction to extract_trajectory (L64-167 of the source file), which handles the state-by-state replay loop. dataset_states_to_obs itself also copies the filter keys from the original file into the output and stores the environment metadata there.
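The state-by-state replay described above can be sketched as follows. This is a minimal illustration, not the code in extract_trajectory: ToyEnv and replay_trajectory are hypothetical names, the simulator state is reduced to a single float, and the sketch assumes the final next_obs is obtained by stepping the last action, which the description above does not spell out.

```python
class ToyEnv:
    """Hypothetical stand-in for a simulator; the state is one float."""

    def __init__(self):
        self.state = 0.0

    def reset_to(self, state):
        # Restore the simulator to a recorded state and return the observation.
        self.state = float(state)
        return self.get_observation()

    def step(self, action):
        # Advance the toy dynamics by one action.
        self.state += float(action)
        return self.get_observation()

    def get_observation(self):
        return {"pos": self.state}

    def get_reward(self):
        # Toy sparse reward: 1.0 once the position reaches 3.0.
        return 1.0 if self.state >= 3.0 else 0.0


def replay_trajectory(env, states, actions):
    """Replay recorded states to collect obs, next_obs, and rewards."""
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    traj = {"obs": [], "next_obs": [], "rewards": [], "actions": list(actions)}
    obs = env.reset_to(states[0])
    num_steps = len(states)
    for t in range(1, num_steps + 1):
        if t == num_steps:
            # Assumption: no recorded state follows the last one, so step
            # the final action to obtain the last next_obs.
            next_obs = env.step(actions[t - 1])
        else:
            next_obs = env.reset_to(states[t])
        traj["obs"].append(obs)
        traj["next_obs"].append(next_obs)
        traj["rewards"].append(env.get_reward())
        obs = next_obs
    return traj
```

Resetting to each recorded state, rather than re-stepping every action, keeps the extracted observations tied to the stored states even if the simulator is not perfectly deterministic.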
Usage
Run as a CLI script after downloading raw datasets. Specify camera names for image observations or omit them for low-dim-only extraction.
Code Reference
Source Location
- Repository: robomimic
- File: robomimic/scripts/dataset_states_to_obs.py
- Lines: L220-353 (dataset_states_to_obs), L64-167 (extract_trajectory)
Signature
def dataset_states_to_obs(args):
    """
    Extracts observations from a raw simulation state dataset.

    Args:
        args (argparse.Namespace): CLI arguments with attributes:
            dataset (str): path to input HDF5 file with simulation states
            output_name (str): output filename (auto-generated if None)
            n (int): number of demos to process (None for all)
            camera_names (list): camera names for image obs (empty for low_dim only)
            camera_height (int): image height (default: 84)
            camera_width (int): image width (default: 84)
            done_mode (int): done signal mode (0=success, 1=end, 2=both)
            copy_rewards (bool): copy rewards from source file
            copy_dones (bool): copy dones from source file
            shaped (bool): use shaped rewards
            compress (bool): use gzip compression for image data
            exclude_next_obs (bool): exclude next_obs to save space
            depth (bool): include depth observations
    """
Import
from robomimic.scripts.dataset_states_to_obs import dataset_states_to_obs, extract_trajectory
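Because the function takes an argparse.Namespace, it can in principle be called from Python rather than from the command line. The sketch below builds a namespace from the attributes listed in the signature. make_args and run are hypothetical helpers, the False defaults for the boolean flags are assumptions, and the real function may read attributes beyond those documented here.

```python
import argparse


def make_args(dataset, camera_names=(), done_mode=0, n=None):
    """Build a namespace holding the attributes listed in the docstring."""
    return argparse.Namespace(
        dataset=dataset,
        output_name=None,          # None: let the script pick the name
        n=n,                       # None: process all demos
        camera_names=list(camera_names),
        camera_height=84,
        camera_width=84,
        done_mode=done_mode,
        copy_rewards=False,        # assumed default
        copy_dones=False,          # assumed default
        shaped=False,              # assumed default
        compress=False,            # assumed default
        exclude_next_obs=False,    # assumed default
        depth=False,               # assumed default
    )


def run(dataset):
    """Run a low-dim extraction; requires robomimic and its simulator."""
    # Imported lazily so make_args stays usable without robomimic installed.
    from robomimic.scripts.dataset_states_to_obs import dataset_states_to_obs

    dataset_states_to_obs(make_args(dataset, done_mode=2))
```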
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| args.dataset | str | Yes | Path to input HDF5 file with data/demo_*/states and data/demo_*/actions |
| args.camera_names | list | No | Camera names for image obs. An empty list produces low_dim observations only |
| args.camera_height | int | No | Image height. Default: 84 |
| args.camera_width | int | No | Image width. Default: 84 |
| args.done_mode | int | No | Done signal mode: 0 = done on task success, 1 = done at the end of each trajectory, 2 = both. Default: 0 |
| args.n | int | No | Number of demos to process. Default: all |
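The three done_mode values can be read as a small decision rule. The sketch below is an illustration of that rule rather than the script's own code; compute_done is a hypothetical helper, and it interprets "2=both" as "done if either condition holds".

```python
def compute_done(done_mode, is_last_step, is_success):
    """Return 1 if the step counts as done under the given mode, else 0."""
    if done_mode not in (0, 1, 2):
        raise ValueError("done_mode must be 0, 1, or 2")
    done = False
    if done_mode in (1, 2):
        # Modes 1 and 2: the last step of a trajectory is done.
        done = done or is_last_step
    if done_mode in (0, 2):
        # Modes 0 and 2: a success state is done.
        done = done or is_success
    return int(done)
```

Under this reading, mode 0 can leave a trajectory with no done signal at all if the task never succeeds, whereas modes 1 and 2 always mark the final step.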
Outputs
| Name | Type | Description |
|---|---|---|
| output HDF5 | File | New HDF5 file with data/demo_*/obs/*, data/demo_*/next_obs/*, data/demo_*/actions, data/demo_*/rewards, data/demo_*/dones, data/demo_*/states, plus mask/ filter keys copied from input |
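A quick way to check an output file against this layout is to list the demos, their observation keys, and the filter keys. The sketch below is one possible approach, assuming the group names shown in the table. summarize_demos and summarize_file are hypothetical helpers; summarize_demos accepts an open h5py.File or any nested mapping with the same layout, and summarize_file requires h5py.

```python
def summarize_demos(root):
    """List obs keys per demo and the filter keys under mask/."""
    # Sort demo_2 before demo_10 by comparing the numeric suffix.
    demos = sorted(root["data"].keys(), key=lambda k: int(k.split("_")[-1]))
    summary = {}
    for demo in demos:
        group = root["data"][demo]
        summary[demo] = {
            "obs_keys": sorted(group["obs"].keys()),
            # next_obs is absent when it was excluded to save space.
            "has_next_obs": "next_obs" in group,
        }
    filter_keys = sorted(root["mask"].keys()) if "mask" in root else []
    return summary, filter_keys


def summarize_file(path):
    """Open an HDF5 file read-only and summarize its demos."""
    import h5py  # imported lazily; only needed for real files

    with h5py.File(path, "r") as f:
        return summarize_demos(f)
```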
Usage Examples
Low-Dimensional Observation Extraction
python robomimic/scripts/dataset_states_to_obs.py \
--dataset /path/to/lift/ph/demo_v141.hdf5 \
--done_mode 2
# Output: /path/to/lift/ph/demo_v141_ld.hdf5
Image Observation Extraction
python robomimic/scripts/dataset_states_to_obs.py \
--dataset /path/to/lift/ph/demo_v141.hdf5 \
--camera_names agentview robot0_eye_in_hand \
--camera_height 84 --camera_width 84 \
--done_mode 2 --compress
# Output: /path/to/lift/ph/demo_v141_im84.hdf5