Principle: ARISE Initiative Robomimic Observation Extraction
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Data_Pipeline, Simulation |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
A simulation replay technique that reconstructs observation signals (images, proprioception, rewards, done flags) from recorded simulation states by replaying trajectories through the simulator environment.
Description
Observation Extraction transforms raw demonstration datasets (containing only simulator states and actions) into training-ready datasets with full observation modalities. The raw datasets store compact simulation states (joint positions, object poses) which are sufficient to reconstruct any observation via simulation replay.
This design offers two key advantages:
- Storage efficiency: Raw state files are much smaller than files with rendered images
- Flexibility: Different observation modalities (cameras, resolutions, depth) can be extracted without re-collecting demonstrations
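The storage gap can be made concrete with back-of-the-envelope arithmetic; the shapes below (a 71-dimensional state vector, two 84x84 RGB cameras) are illustrative assumptions, not measurements from any specific dataset:

```python
import numpy as np

# Illustrative per-timestep sizes (assumed shapes, not from a real dataset):
# a compact simulator state vector vs. two rendered RGB camera views.
state = np.zeros(71, dtype=np.float64)              # joint positions/velocities + object poses
images = np.zeros((2, 84, 84, 3), dtype=np.uint8)   # two 84x84 RGB cameras

state_bytes = state.nbytes
image_bytes = images.nbytes
ratio = image_bytes / state_bytes

print(f"state: {state_bytes} B, images: {image_bytes} B, ratio: {ratio:.0f}x")
```

Even at this modest resolution, rendered images dominate storage by roughly two orders of magnitude per timestep, which is why the raw datasets keep only states and actions.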
The extraction process replays each demonstration trajectory through the simulator: for each timestep, it sets the simulator state, renders observations (low-dim proprioception, RGB images from specified cameras, depth maps), computes rewards, and infers done signals. The resulting observations are written to a new HDF5 file.
Usage
Use this principle after downloading raw datasets and before training. It converts state-based datasets to observation-based datasets suitable for the SequenceDataset loader. The extraction can produce low-dimensional-only or image-based datasets depending on the specified camera names.
Theoretical Basis
The extraction follows a state replay pattern:
# Abstract state replay pattern (not real implementation)
for demo in raw_dataset:
    states = demo["states"]      # array of sim states, length T + 1
    actions = demo["actions"]    # array of actions, length T
    env.reset_to(states[0])      # load initial state
    for t in range(len(actions)):
        obs = env.get_observation()              # render current state
        next_obs = env.reset_to(states[t + 1])   # or env.step(actions[t])
        reward = env.get_reward()
        done = infer_done(t, states, env)
        save(obs, next_obs, actions[t], reward, done)
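To make the pattern concrete, here is a minimal runnable sketch with a toy stand-in for the simulator. Everything here is hypothetical for illustration: `ToyEnv`, its one-float state, and the success threshold are invented, and done mode 1 (end of trajectory) is hard-coded for brevity; a real pipeline would replay through the actual simulator environment.

```python
class ToyEnv:
    """Hypothetical stand-in for a simulator: the whole 'state' is one float."""
    def __init__(self, goal=3.0):
        self.goal = goal
        self.state = 0.0

    def reset_to(self, state):
        """Load a recorded simulator state and return the rendered observation."""
        self.state = float(state)
        return self.get_observation()

    def get_observation(self):
        return {"pos": self.state}

    def get_reward(self):
        return 1.0 if abs(self.state - self.goal) < 1e-6 else 0.0


def extract_demo(env, states, actions):
    """Replay recorded states to rebuild (obs, next_obs, reward, done) tuples."""
    samples = []
    env.reset_to(states[0])
    for t in range(len(actions)):  # one transition per action
        obs = env.get_observation()
        next_obs = env.reset_to(states[t + 1])
        reward = env.get_reward()            # reward of the resulting state
        done = (t == len(actions) - 1)       # done mode 1: end of trajectory
        samples.append({"obs": obs, "next_obs": next_obs,
                        "action": actions[t], "reward": reward, "done": done})
    return samples


# states holds one more entry than actions (s_0 .. s_T vs a_0 .. a_{T-1})
states = [0.0, 1.0, 2.0, 3.0]
actions = [1.0, 1.0, 1.0]
samples = extract_demo(ToyEnv(), states, actions)
print([s["reward"] for s in samples])  # [0.0, 0.0, 1.0]
```

Note that `reset_to` is used instead of `env.step(actions[t])`: loading the recorded state guarantees the replay matches the original demonstration exactly, whereas stepping could drift if the simulator is not perfectly deterministic.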
The done signal can be configured in three modes:
- Mode 0: done=1 when state is a success state
- Mode 1: done=1 at the end of each trajectory
- Mode 2: both conditions
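The mode selection itself reduces to a small predicate. A minimal sketch, assuming `is_success` is a boolean computed from the current simulator state and `num_transitions` is the trajectory length; the function name and signature are illustrative, not the library's API:

```python
def compute_done(t, num_transitions, is_success, done_mode):
    """Map the configured done_mode to a boolean done flag for transition t."""
    end_of_traj = (t == num_transitions - 1)
    if done_mode == 0:      # mode 0: success states only
        return is_success
    if done_mode == 1:      # mode 1: end of trajectory only
        return end_of_traj
    return is_success or end_of_traj  # mode 2: either condition

# A 3-transition demo that first succeeds on its final state:
print([compute_done(t, 3, t == 2, done_mode=2) for t in range(3)])  # [False, False, True]
```

Mode 0 is useful when trajectories keep going after task success; mode 1 matches the conventional end-of-episode signal; mode 2 marks both.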