Implementation:ARISE Initiative Robomimic Dataset states to obs

Knowledge Sources	robomimic, robomimic Dataset Contents
Domains	Robotics, Data_Pipeline, Simulation
Last Updated	2026-02-15 08:00 GMT

Overview

Concrete tool, provided by the robomimic scripts module, for extracting observations from raw simulation-state datasets by replaying each trajectory through the simulator environment.

Description

The dataset_states_to_obs function reads a raw HDF5 dataset containing simulation states and actions, creates an environment from the dataset's metadata, and replays each demonstration trajectory through the simulator to extract observations. It produces a new HDF5 file with obs, next_obs, rewards, dones, and actions for each demonstration.

Internally, it delegates per-trajectory extraction to extract_trajectory (at L64-167), which runs the state-by-state replay loop. The function also copies the filter keys (the mask/ group) from the original file into the output and stores the environment metadata in the output file.
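
The replay idea can be illustrated with a minimal sketch. It is not the robomimic implementation: ToyEnv, its reset_to and get_observation methods, and replay_states are hypothetical names chosen for illustration, and the toy simulator has no dynamics, so the last step simply reuses the final recorded state.

```python
class ToyEnv:
    """Hypothetical stand-in for a simulator; not part of robomimic."""

    def __init__(self):
        self.state = (0.0, 0.0)

    def reset_to(self, state):
        # Restore the simulator to a recorded state and return its observation.
        self.state = tuple(float(x) for x in state)
        return self.get_observation()

    def get_observation(self):
        # Observations are derived from the state: position and distance to origin.
        x, y = self.state
        return {"pos": [x, y], "dist": (x * x + y * y) ** 0.5}


def replay_states(env, states, actions):
    """Replay recorded states and collect obs / next_obs for every step."""
    assert len(states) == len(actions) and len(states) > 0
    traj = {"obs": [], "next_obs": [], "actions": list(actions)}
    obs = env.reset_to(states[0])
    for t in range(len(states)):
        # The next observation comes from the next recorded state. The final
        # step has no recorded successor, so this toy version repeats it.
        next_state = states[t + 1] if t + 1 < len(states) else states[t]
        next_obs = env.reset_to(next_state)
        traj["obs"].append(obs)
        traj["next_obs"].append(next_obs)
        obs = next_obs
    return traj


traj = replay_states(ToyEnv(), [(0, 0), (3, 4), (6, 8)], [[1], [1], [0]])
```

Because each observation is regenerated from a stored state, the same raw file can be replayed several times with different observation settings, for example with or without cameras.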

Usage

Run as a CLI script after downloading raw datasets. Specify camera names for image observations or omit them for low-dim-only extraction.

Code Reference

Source Location

Repository: robomimic
File: robomimic/scripts/dataset_states_to_obs.py
Lines: L220-353 (dataset_states_to_obs), L64-167 (extract_trajectory)

Signature

def dataset_states_to_obs(args):
    """
    Extracts observations from a raw simulation state dataset.

    Args:
        args (argparse.Namespace): CLI arguments with attributes:
            dataset (str): path to input HDF5 file with simulation states
            output_name (str): output filename (auto-generated if None)
            n (int): number of demos to process (None for all)
            camera_names (list): camera names for image obs (empty for low_dim only)
            camera_height (int): image height (default: 84)
            camera_width (int): image width (default: 84)
            done_mode (int): done signal mode (0=success, 1=end, 2=both)
            copy_rewards (bool): copy rewards from source file
            copy_dones (bool): copy dones from source file
            shaped (bool): use shaped rewards
            compress (bool): use gzip compression for image data
            exclude_next_obs (bool): exclude next_obs to save space
            depth (bool): include depth observations
    """

Import

from robomimic.scripts.dataset_states_to_obs import dataset_states_to_obs, extract_trajectory
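
Since the function takes an argparse.Namespace, it can in principle be driven from Python rather than from the command line. The sketch below builds such a namespace from the attributes listed in the docstring. make_args is a hypothetical helper, and the False defaults for the boolean flags are an assumption; the call into robomimic is left as a comment so the snippet runs without robomimic installed.

```python
import argparse


def make_args(dataset, **overrides):
    """Build a Namespace with the attributes named in the docstring (hypothetical helper)."""
    defaults = dict(
        dataset=dataset,
        output_name=None,       # auto-generated if None, per the docstring
        n=None,                 # None processes all demos
        camera_names=[],        # empty list gives low_dim only
        camera_height=84,
        camera_width=84,
        done_mode=0,
        copy_rewards=False,     # flag defaults below are assumed to be False
        copy_dones=False,
        shaped=False,
        compress=False,
        exclude_next_obs=False,
        depth=False,
    )
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise ValueError(f"unknown argument(s): {sorted(unknown)}")
    defaults.update(overrides)
    return argparse.Namespace(**defaults)


args = make_args(
    "/path/to/lift/ph/demo_v141.hdf5",
    camera_names=["agentview", "robot0_eye_in_hand"],
    done_mode=2,
    compress=True,
)

# With robomimic installed, the namespace could then be passed on:
# from robomimic.scripts.dataset_states_to_obs import dataset_states_to_obs
# dataset_states_to_obs(args)
```

Rejecting unknown keys catches misspelled attribute names early, which a bare Namespace would silently accept.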

I/O Contract

Inputs

Name	Type	Required	Description
args.dataset	str	Yes	Path to input HDF5 file with data/demo_*/states and data/demo_*/actions
args.camera_names	list	No	Camera names for image obs. Empty list produces low_dim only
args.camera_height	int	No	Image height. Default: 84
args.camera_width	int	No	Image width. Default: 84
args.done_mode	int	No	Done signal mode: 0=success, 1=end, 2=both. Default: 0
args.n	int	No	Number of demos to process. Default: all
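
The three done_mode values can be read as follows. This is a minimal sketch under an assumed interpretation: "success" sets done whenever the state reached after a step satisfies the task, "end" sets done on the final step of the trajectory, and "both" combines the two. compute_dones is a hypothetical helper, not a robomimic function.

```python
def compute_dones(successes, done_mode):
    """Return a 0/1 done flag per step.

    successes[t] is True if the state reached after step t is a task success.
    """
    if done_mode not in (0, 1, 2):
        raise ValueError("done_mode must be 0, 1, or 2")
    last = len(successes) - 1
    dones = []
    for t, success in enumerate(successes):
        done = False
        if done_mode in (1, 2):
            # "end": mark the final step of the trajectory
            done = done or (t == last)
        if done_mode in (0, 2):
            # "success": mark every step that lands in a success state
            done = done or bool(success)
        dones.append(int(done))
    return dones


successes = [False, True, False, False]
by_mode = {mode: compute_dones(successes, mode) for mode in (0, 1, 2)}
```

Under this reading, mode 0 can leave a trajectory with no done flag at all if the task is never solved, while modes 1 and 2 always mark the last step.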

Outputs

Name	Type	Description
output HDF5	File	New HDF5 file with data/demo_*/obs/, data/demo_*/next_obs/, data/demo_*/actions, data/demo_*/rewards, data/demo_*/dones, data/demo_*/states, plus mask/ filter keys copied from input

Usage Examples

Low-Dimensional Observation Extraction

python robomimic/scripts/dataset_states_to_obs.py \
    --dataset /path/to/lift/ph/demo_v141.hdf5 \
    --done_mode 2
# Output: /path/to/lift/ph/demo_v141_ld.hdf5

Image Observation Extraction

python robomimic/scripts/dataset_states_to_obs.py \
    --dataset /path/to/lift/ph/demo_v141.hdf5 \
    --camera_names agentview robot0_eye_in_hand \
    --camera_height 84 --camera_width 84 \
    --done_mode 2 --compress
# Output: /path/to/lift/ph/demo_v141_im84.hdf5

Related Pages

Implements Principle

Principle:ARISE_Initiative_Robomimic_Observation_Extraction
