Principle:ARISE Initiative Robosuite Data Collection Wrapping
Metadata:
- robosuite
- Imitation_Learning
- Data_Engineering
- Last Updated: 2026-02-15 12:00 GMT
Overview
Decorator pattern for transparently recording simulation states and actions during environment interaction for demonstration dataset collection.
Description
The Data Collection Wrapping principle extends the environment wrapper pattern to automatically record simulation data during interaction. On each recorded step(), the wrapper buffers the flattened MuJoCo state and the action taken. The buffer is periodically flushed to disk as .npz files, organized into one directory per episode. Because recording happens entirely inside the wrapper, the collection loop is identical to normal environment interaction.
Usage
Use when collecting human demonstrations or recording any policy rollouts for imitation learning datasets.
Theoretical Basis
The transparent recording pattern leverages the environment wrapper design to intercept and record data without modifying the core interaction loop. This approach ensures that data collection does not introduce artifacts or change the behavior of the underlying environment.
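A minimal sketch of the transparent recording idea, assuming a toy environment rather than robosuite itself. The names ToyEnv, RecordingWrapper, and rollout are hypothetical and exist only for illustration; the point is that the same interaction loop runs unchanged with or without the wrapper and produces the same results.

```python
class ToyEnv:
    """Hypothetical stand-in for a simulation environment (not robosuite)."""

    def __init__(self):
        self.state = 0.0

    def reset(self):
        self.state = 0.0
        return self.state

    def step(self, action):
        self.state += action
        reward = -abs(self.state)
        done = False
        return self.state, reward, done, {}


class RecordingWrapper:
    """Hypothetical wrapper: same reset()/step() interface, plus a log."""

    def __init__(self, env):
        self.env = env
        self.log = []  # (state_before_step, action) pairs

    def reset(self):
        self.log = []
        return self.env.reset()

    def step(self, action):
        # Record the pair, then delegate without altering the action or result.
        self.log.append((self.env.state, action))
        return self.env.step(action)


def rollout(env, actions):
    """Interaction loop that is unaware of any wrapping."""
    env.reset()
    return [env.step(a) for a in actions]


actions = [1.0, -0.5, 2.0]
plain = rollout(ToyEnv(), actions)
wrapped_env = RecordingWrapper(ToyEnv())
wrapped = rollout(wrapped_env, actions)
```

Here plain and wrapped hold the same step results, while wrapped_env.log additionally holds the recorded state-action pairs.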
MuJoCo State Representation: MuJoCo states are saved as flattened arrays that capture the dynamical state of the simulation, chiefly the simulation time, joint positions, and joint velocities. Contact information is not stored in the flattened array; it is a derived quantity that the simulator recomputes after a saved state is restored. Together with the environment's model XML, this flattened representation allows a recorded state to be loaded back into the simulation for replay.
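The flattening step can be illustrated with a small NumPy sketch. It assumes a layout of [time, qpos, qvel] and ignores any additional fields a real simulator state may carry; flatten_state and unflatten_state are hypothetical helpers, not part of MuJoCo or robosuite.

```python
import numpy as np


def flatten_state(time, qpos, qvel):
    """Pack time, positions, and velocities into one 1-D float array."""
    return np.concatenate([[time], qpos, qvel]).astype(np.float64)


def unflatten_state(flat, nq, nv):
    """Split a flat array back into (time, qpos, qvel).

    nq and nv are the numbers of position and velocity coordinates,
    which must be known from the model to undo the flattening.
    """
    time = float(flat[0])
    qpos = flat[1:1 + nq]
    qvel = flat[1 + nq:1 + nq + nv]
    return time, qpos, qvel


qpos = np.array([0.1, -0.2, 0.3])
qvel = np.array([0.0, 1.5])
flat = flatten_state(0.02, qpos, qvel)
t, qpos_back, qvel_back = unflatten_state(flat, nq=3, nv=2)
```

The round trip only works when the model dimensions are known, which is one reason to store the model description alongside the flattened states.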
Recording Flow Pseudocode:
class DataCollectionWrapper:
    def step(self, action):
        # Save current state before stepping
        if should_collect():
            state = env.sim.get_state().flatten()
            save_to_buffer(state, action, timestep)
        # Execute normal environment step
        obs, reward, done, info = env.step(action)
        # Advance the step counter used by the collect and flush checks
        timestep += 1
        # Flush to disk periodically
        if timestep % flush_freq == 0:
            write_buffer_to_disk()
        return obs, reward, done, info

    def reset(self):
        # Finalize current episode data
        write_buffer_to_disk()
        create_new_episode_directory()
        # Reset environment
        return env.reset()
The wrapper maintains a buffer of state-action pairs that is periodically flushed to disk, balancing I/O efficiency with data safety. Each episode is stored in a separate directory with sequentially numbered state files and the environment's model XML for reproducibility.
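The buffering and flushing behavior described above can be sketched as follows. This is an illustration under stated assumptions, not robosuite's implementation: EpisodeBuffer and load_episode are hypothetical names, the file naming scheme (state_000000.npz, ...) and the array keys states and actions are chosen for the example, and saving the model XML is omitted.

```python
import os
import tempfile

import numpy as np


class EpisodeBuffer:
    """Hypothetical buffer that writes state-action chunks to .npz files."""

    def __init__(self, episode_dir, flush_freq):
        self.episode_dir = episode_dir
        self.flush_freq = flush_freq
        self.states, self.actions = [], []
        self.num_recorded = 0
        self.num_chunks = 0
        os.makedirs(episode_dir, exist_ok=True)

    def record(self, state, action):
        self.states.append(np.asarray(state, dtype=np.float64))
        self.actions.append(np.asarray(action, dtype=np.float64))
        self.num_recorded += 1
        if self.num_recorded % self.flush_freq == 0:
            self.flush()

    def flush(self):
        if not self.states:
            return  # nothing buffered, so no empty file is written
        name = f"state_{self.num_chunks:06d}.npz"  # assumed naming scheme
        np.savez(
            os.path.join(self.episode_dir, name),
            states=np.array(self.states),
            actions=np.array(self.actions),
        )
        self.states, self.actions = [], []
        self.num_chunks += 1


def load_episode(episode_dir):
    """Concatenate all chunks of one episode in file-name order."""
    states, actions = [], []
    for name in sorted(f for f in os.listdir(episode_dir) if f.endswith(".npz")):
        with np.load(os.path.join(episode_dir, name)) as chunk:
            states.append(chunk["states"])
            actions.append(chunk["actions"])
    return np.concatenate(states), np.concatenate(actions)


root = tempfile.mkdtemp()
episode_dir = os.path.join(root, "ep_0")
buf = EpisodeBuffer(episode_dir, flush_freq=4)
for t in range(10):
    buf.record(state=[t, 2.0 * t], action=[0.1 * t])
buf.flush()  # write the final partial chunk, as a reset would
states, actions = load_episode(episode_dir)
```

A larger flush_freq means fewer, larger writes but more unsaved data at risk if the process crashes; the final flush call covers the partial chunk left at the end of an episode.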