Workflow:ARISE Initiative Robosuite Human Demonstration Collection

From Leeroopedia
Knowledge Sources
Domains: Robotics, Imitation_Learning, Data_Collection
Last Updated: 2026-02-15 06:00 GMT

Overview

End-to-end pipeline for collecting human demonstrations via teleoperation, recording trajectories as state-action pairs, aggregating them into HDF5 datasets, and playing them back for verification.

Description

This workflow implements the complete human demonstration collection pipeline used to create training datasets for imitation learning. A human operator teleoperates the robot to complete manipulation tasks, while the DataCollectionWrapper records simulation states, actions, and model XML at each timestep. Successful demonstrations are filtered and aggregated into a structured HDF5 file. The resulting dataset can be replayed for verification using either state-loading or action-playback modes. This pipeline is designed for use with the robomimic framework for downstream imitation learning.

Usage

Execute this workflow when you need to create a dataset of expert demonstrations for imitation learning, behavioral cloning, or demonstration-augmented reinforcement learning. The collected HDF5 files are compatible with the robomimic framework and the DemoSamplerWrapper for curriculum RL from demonstration states.

Execution Steps

Step 1: Configure Environment And Controller

Select the manipulation task, robot, and controller configuration. Load the composite controller config and create the environment with rendering enabled and reward shaping active. Configure the environment identically to the setup that will be used to evaluate the learned policy, so the demonstrations remain compatible with downstream training.

Key considerations:

  • Controller configuration is stored in the HDF5 metadata for reproducibility
  • Reward shaping should be enabled so demonstrations capture informative reward signals
  • Camera views can be specified for multi-view recording
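As a concrete sketch of this step, the helper below separates the environment settings from environment creation, so the same kwargs can be reused at evaluation time. The `robosuite.make` and `load_composite_controller_config` calls reflect the robosuite >= 1.5 API, and the task and robot names ("Lift", "Panda") are illustrative choices, not prescribed by this workflow:

```python
def build_env_kwargs(env_name="Lift", robot="Panda", camera="agentview"):
    """Environment settings shared between demonstration collection and
    later policy evaluation (illustrative defaults)."""
    return {
        "env_name": env_name,
        "robots": robot,
        "has_renderer": True,            # on-screen rendering for the operator
        "has_offscreen_renderer": False,
        "render_camera": camera,         # extra cameras enable multi-view recording
        "reward_shaping": True,          # demonstrations capture shaped rewards
        "control_freq": 20,
    }

def make_env(robot="Panda", **overrides):
    """Create the environment (requires robosuite; not executed here).
    API names are assumptions based on robosuite >= 1.5 -- verify locally."""
    import robosuite as suite
    from robosuite.controllers import load_composite_controller_config

    kwargs = build_env_kwargs(robot=robot)
    kwargs.update(overrides)
    # The controller config is what gets stored in the HDF5 metadata later.
    kwargs["controller_configs"] = load_composite_controller_config(robot=robot)
    return suite.make(**kwargs)
```

Keeping `build_env_kwargs` as a pure function makes it easy to serialize the exact collection configuration alongside the dataset.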

Step 2: Wrap Environment For Data Collection

Apply the VisualizationWrapper for operator feedback, then wrap with the DataCollectionWrapper. The data collection wrapper intercepts each `step()` call to record the current simulation state (flattened MuJoCo state vector), the action taken, and the model XML. Data is saved to temporary NPZ files organized by episode.

Key considerations:

  • The DataCollectionWrapper saves the simulation state after each action, so each episode ends with one more recorded state than actions
  • Each episode creates a separate directory with `state_*.npz` files and `model.xml`
  • The wrapper records environment name, states, actions, and success flags
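To make the on-disk layout concrete, here is a minimal mock of what the wrapper produces per episode: a directory holding `state_*.npz` files and the scene's `model.xml`. The exact file naming and NPZ keys below are simplified assumptions modeled on the description above; the real DataCollectionWrapper manages these files internally.

```python
import os
import tempfile
import numpy as np

def record_episode(root, ep_id, states, actions, model_xml, successful):
    """Write one episode in a layout mimicking the DataCollectionWrapper's
    temporary output (simplified mock, not the wrapper's real code)."""
    ep_dir = os.path.join(root, f"ep_{ep_id}")
    os.makedirs(ep_dir, exist_ok=True)
    with open(os.path.join(ep_dir, "model.xml"), "w") as f:
        f.write(model_xml)                     # scene XML for exact replay later
    np.savez(
        os.path.join(ep_dir, "state_0.npz"),
        env="Lift",                            # environment name
        states=np.asarray(states),             # flattened MuJoCo state per step
        actions=np.asarray(actions),
        successful=successful,                 # flag used for filtering in Step 5
    )
    return ep_dir

# Tiny usage example with dummy data: 3 states, 2 actions (one extra state).
root = tempfile.mkdtemp()
ep_dir = record_episode(root, 0, np.zeros((3, 5)), np.zeros((2, 4)),
                        "<mujoco/>", successful=True)
data = np.load(os.path.join(ep_dir, "state_0.npz"))
```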

Step 3: Initialize Teleoperation Device

Create and configure the input device (Keyboard, SpaceMouse, DualSense, or MJGUI). Set position and rotation sensitivity. Register device callbacks with the viewer as needed.

Key considerations:

  • Device choice affects demonstration quality (SpaceMouse provides smoother trajectories)
  • For mobile base robots, use `goal_update_mode='achieved'` with `input_ref_frame='base'`
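A small lookup table makes the device choice explicit. The class locations under `robosuite.devices` and the `pos_sensitivity`/`rot_sensitivity` parameter names follow robosuite's collection script conventions, but both should be verified against the installed version before instantiating a device:

```python
def device_settings(name, pos_sensitivity=1.0, rot_sensitivity=1.0):
    """Map a device name to its (assumed) robosuite class path and
    constructor kwargs. Purely illustrative lookup; no device is created."""
    classes = {
        "keyboard": "robosuite.devices.Keyboard",
        "spacemouse": "robosuite.devices.SpaceMouse",  # smoother 6-DoF input
        "dualsense": "robosuite.devices.DualSense",
        "mjgui": "robosuite.devices.MJGUI",
    }
    if name not in classes:
        raise ValueError(f"unknown device: {name}")
    return classes[name], {"pos_sensitivity": pos_sensitivity,
                           "rot_sensitivity": rot_sensitivity}
```

Centralizing this mapping keeps the sensitivity settings next to the device choice, so they can be logged with the rest of the collection configuration.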

Step 4: Collect Demonstration Trajectories

Execute the teleoperation loop to collect demonstrations. The operator controls the robot to complete the task. A success detector monitors task completion and latches for 10 consecutive successful timesteps before ending the episode. The operator can also manually reset to start a new attempt.

Key considerations:

  • `env._check_success()` determines task completion based on task-specific criteria
  • 10 consecutive success timesteps are required to confirm completion
  • Failed attempts are also recorded but filtered out during aggregation
  • Multiple demonstrations can be collected in sequence
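The 10-timestep success latch is easy to reimplement and test in isolation. The sketch below is a minimal reimplementation of that logic, with `env._check_success()` supplying the per-timestep boolean in the real teleoperation loop:

```python
class SuccessLatch:
    """End the episode only after `required` consecutive successful checks,
    so a single transient success reading does not terminate collection early."""

    def __init__(self, required=10):
        self.required = required
        self.count = 0

    def update(self, success_now):
        """Feed one env._check_success() result; return True once latched."""
        self.count = self.count + 1 if success_now else 0
        return self.count >= self.required
```

In the loop this would be called once per `env.step(action)`, with the operator's manual-reset key clearing the latch alongside the environment reset.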

Step 5: Aggregate Into HDF5 Dataset

After collecting demonstrations, the `gather_demonstrations_as_hdf5` function scans the temporary directory, loads NPZ files for each episode, filters for successful demonstrations only, and writes them into a structured HDF5 file. The file contains metadata (date, time, version, environment info) and per-demonstration groups with states, actions, and model XML.

Key considerations:

  • Only successful demonstrations are included in the final dataset
  • The last state is removed to align state and action counts
  • Environment configuration JSON is stored as a dataset attribute for reproducibility
  • The HDF5 structure follows the robomimic convention
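The filtering and alignment logic of `gather_demonstrations_as_hdf5` can be sketched independently of the file format. The function below returns plain dicts rather than writing HDF5 (the real function writes per-demonstration groups plus metadata to disk); the `demo_*` group naming is modeled on the robomimic-style convention described above:

```python
def aggregate_demos(episodes):
    """Keep only successful episodes and trim the surplus final state so
    len(states) == len(actions). Structure-only sketch of the aggregation;
    the real gather_demonstrations_as_hdf5 writes an HDF5 file instead."""
    demos = {}
    successful = (ep for ep in episodes if ep["successful"])
    for i, ep in enumerate(successful):
        demos[f"demo_{i + 1}"] = {
            "states": ep["states"][:-1],   # drop last state to align counts
            "actions": ep["actions"],
            "model_file": ep["model_xml"], # scene XML for exact reconstruction
        }
    return demos
```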

Step 6: Playback And Verify Demonstrations

Replay demonstrations from the HDF5 file using either state-loading mode (directly set MuJoCo states) or action-playback mode (replay actions through the simulator). Action playback verifies determinism by comparing replayed states against recorded states and reports any divergence.

Key considerations:

  • State-loading mode is faster but does not verify action determinism
  • Action-playback mode confirms that recorded actions reproduce the trajectory
  • The model XML from the demonstration is used to reconstruct the exact scene
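The determinism check in action-playback mode reduces to comparing state arrays timestep by timestep. A minimal sketch, assuming states are stored as one flattened vector per timestep; the tolerance value is an illustrative choice:

```python
import numpy as np

def playback_divergence(recorded_states, replayed_states, atol=1e-6):
    """Compare replayed states against recorded ones elementwise.
    Returns (first_divergent_timestep, max_error_at_that_step), or
    (None, overall_max_error) when the replay stays within tolerance."""
    rec = np.asarray(recorded_states)
    rep = np.asarray(replayed_states)
    errs = np.max(np.abs(rec - rep), axis=1)   # per-timestep worst-case error
    bad = np.nonzero(errs > atol)[0]
    if bad.size == 0:
        return None, float(errs.max())
    return int(bad[0]), float(errs[bad[0]])
```

Reporting the first divergent timestep (rather than only a pass/fail flag) helps localize nondeterminism, e.g. a controller that depends on wall-clock timing.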

Execution Diagram

GitHub URL

Workflow Repository