
Principle:ARISE Initiative Robomimic Observation Extraction

From Leeroopedia
Domains Robotics, Data_Pipeline, Simulation
Last Updated 2026-02-15 08:00 GMT

Overview

A simulation replay technique that reconstructs observation signals (images, proprioception, rewards, done flags) from recorded simulation states by replaying trajectories through the simulator environment.

Description

Observation Extraction transforms raw demonstration datasets, which contain only simulator states and actions, into training-ready datasets with full observation modalities. The raw datasets store compact simulation states (e.g., joint positions and object poses), which are sufficient to reconstruct any observation via simulation replay.

This design offers two key advantages:

  • Storage efficiency: Raw state files are much smaller than files with rendered images
  • Flexibility: Different observation modalities (cameras, resolutions, depth) can be extracted without re-collecting demonstrations

The extraction process replays each demonstration trajectory through the simulator: for each timestep, it sets the simulator state, renders observations (low-dim proprioception, RGB images from specified cameras, depth maps), computes rewards, and infers done signals. The resulting observations are written to a new HDF5 file.
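The layout of the resulting HDF5 file can be sketched with `h5py`. The group and key names below follow robomimic-style conventions (`data/demo_N` groups with `obs`, `actions`, `rewards`, `dones`), but the exact keys and shapes shown are illustrative, not authoritative:

```python
import h5py
import numpy as np

# Sketch of an extracted-dataset layout; key names and shapes are toy values.
T = 5  # number of transitions in this toy demo
with h5py.File("demo_obs.hdf5", "w") as f:
    demo = f.create_group("data/demo_0")
    demo.create_dataset("actions", data=np.zeros((T, 7)))
    demo.create_dataset("rewards", data=np.zeros(T))
    demo.create_dataset("dones", data=np.zeros(T, dtype=np.int64))
    obs = demo.create_group("obs")
    obs.create_dataset("robot0_joint_pos", data=np.zeros((T, 7)))
    obs.create_dataset("agentview_image",
                       data=np.zeros((T, 84, 84, 3), dtype=np.uint8))

with h5py.File("demo_obs.hdf5", "r") as f:
    print(sorted(f["data/demo_0/obs"].keys()))
    # ['agentview_image', 'robot0_joint_pos']
```

Storing each observation modality as its own dataset under `obs` is what lets a loader pull only the keys a given model needs.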

Usage

Use this principle after downloading raw datasets and before training. It converts state-based datasets to observation-based datasets suitable for the SequenceDataset loader. The extraction can produce low-dimensional-only or image-based datasets depending on the specified camera names.

Theoretical Basis

The extraction follows a state replay pattern:

# Abstract state replay pattern (not a real implementation)
for demo in raw_dataset:
    states = demo["states"]      # T simulator states
    actions = demo["actions"]    # T-1 actions between consecutive states
    env.reset_to(states[0])      # Load initial state

    for t in range(len(actions)):
        obs = env.get_observation()             # Render state t
        next_obs = env.reset_to(states[t + 1])  # or env.step(actions[t])
        reward = env.get_reward()
        done = infer_done(t, states, env)

        save(obs, next_obs, actions[t], reward, done)
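The pattern above can be exercised end to end with a toy stand-in environment. Everything here (`MockEnv`, `extract_demo`, the one-dimensional states) is hypothetical scaffolding, not the real simulator API; the point is the replay loop itself:

```python
import numpy as np

class MockEnv:
    """Toy stand-in for a simulator that can be reset to a saved state."""
    def __init__(self):
        self.state = None

    def reset_to(self, state):
        self.state = state
        return self.get_observation()

    def get_observation(self):
        # A real extraction would render images here; we return proprioception only.
        return {"proprio": np.asarray(self.state, dtype=float)}

    def get_reward(self):
        # Toy sparse reward: success once the first state element reaches 1.0.
        return float(self.state[0] >= 1.0)

def extract_demo(env, states, actions):
    """Replay saved states to reconstruct (obs, next_obs, action, reward, done)."""
    transitions = []
    env.reset_to(states[0])
    for t in range(len(actions)):
        obs = env.get_observation()
        next_obs = env.reset_to(states[t + 1])
        reward = env.get_reward()
        done = int(t == len(actions) - 1)  # done at end of trajectory
        transitions.append((obs, next_obs, actions[t], reward, done))
    return transitions

states = [np.array([0.0]), np.array([0.5]), np.array([1.0])]
actions = [np.array([0.5]), np.array([0.5])]
trans = extract_demo(MockEnv(), states, actions)
print(len(trans))                  # 2 transitions
print(trans[-1][3], trans[-1][4])  # final reward 1.0, done 1
```

Note that the reward is computed after advancing to `states[t + 1]`, so it labels the transition rather than the starting state.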

The done signal can be configured in three modes:

  • Mode 0: done=1 when state is a success state
  • Mode 1: done=1 at the end of each trajectory
  • Mode 2: done=1 when either condition holds (success state or end of trajectory)
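The three modes can be sketched as a small helper in the spirit of the `infer_done` call in the pseudocode above (the signature here is illustrative):

```python
def infer_done(t, traj_len, is_success, done_mode):
    """Return the done flag for timestep t under a given done mode.

    done_mode 0: done only when the state is a success state.
    done_mode 1: done only at the last timestep of the trajectory.
    done_mode 2: done when either condition holds.
    """
    end_of_traj = (t == traj_len - 1)
    if done_mode == 0:
        return int(is_success)
    if done_mode == 1:
        return int(end_of_traj)
    if done_mode == 2:
        return int(is_success or end_of_traj)
    raise ValueError(f"unknown done_mode: {done_mode}")

# A failed trajectory never sets done under mode 0, but still terminates
# under modes 1 and 2 at its last timestep.
print(infer_done(4, 5, False, 0))  # 0
print(infer_done(4, 5, False, 1))  # 1
print(infer_done(2, 5, True, 2))   # 1
```

Mode 0 suits training objectives that key on task success, while modes 1 and 2 guarantee every stored trajectory ends with a terminal flag.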

Related Pages

Implemented By
