Implementation: ARISE Initiative Robosuite DataCollectionWrapper
Metadata:
- robosuite
- Imitation_Learning
- Data_Engineering
- Last Updated: 2026-02-15 12:00 GMT
Overview
A concrete wrapper, provided by the robosuite wrappers module, for recording simulation states and actions to disk.
Description
DataCollectionWrapper wraps a MujocoEnv and automatically saves flattened MuJoCo states and actions to a specified directory. Each episode gets its own subdirectory containing state_<t>.npz files and the environment's model XML. Data is flushed to disk every flush_freq environment steps. On reset(), the previous episode's data is finalized and a new episode directory is created.
Usage
Wrap an environment after VisualizationWrapper (if used) and before running teleoperation collection loops.
Code Reference
Source: robosuite
File: robosuite/wrappers/data_collection_wrapper.py
Lines: L16-216
Signature:
class DataCollectionWrapper:
    def __init__(self, env, directory, collect_freq=1, flush_freq=100, use_env_xml_for_reset=False):
        """
        Args:
            env (MujocoEnv): The environment to monitor
            directory (str): Where to store collected data
            collect_freq (int): How often to save the simulation state, in environment steps
            flush_freq (int): How often to dump collected data to disk, in environment steps
            use_env_xml_for_reset (bool): Whether to use the environment's XML as the source when resetting
        """
Import:
from robosuite.wrappers import DataCollectionWrapper
I/O Contract
Inputs:
- env (MujocoEnv, Required): The environment to monitor
- directory (str, Required): Where to store collected data
- collect_freq (int, Optional, default 1): How often to save the simulation state, in environment steps
- flush_freq (int, Optional, default 100): How often to dump collected data to disk, in environment steps
- use_env_xml_for_reset (bool, Optional, default False): Whether to use the environment's XML as the source when resetting
Outputs:
- DataCollectionWrapper instance. step() records state/action. reset() finalizes episode. Data saved as .npz files per timestep.
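Collected episodes can be read back with NumPy. The key names below ("states", "action_infos") and the state_<t>.npz naming are assumptions about the wrapper's flush format, so verify them against your robosuite version; the sketch synthesizes a matching file so it runs standalone.

```python
# Sketch: inspecting one flushed episode file. Key names and file layout
# are assumptions about the flush format - verify against your version.
import os
import tempfile

import numpy as np

# Synthesize a file in the assumed format so the sketch is self-contained.
ep_dir = tempfile.mkdtemp()
np.savez(
    os.path.join(ep_dir, "state_100.npz"),
    states=np.zeros((100, 38)),  # one flattened MuJoCo state per row
    action_infos=np.array([{"actions": np.zeros(7)}] * 100, dtype=object),
)

# Load it back the way a consumer of collected data would.
data = np.load(os.path.join(ep_dir, "state_100.npz"), allow_pickle=True)
states = data["states"]                                 # (steps, state_dim)
actions = [info["actions"] for info in data["action_infos"]]
print(states.shape, len(actions))  # -> (100, 38) 100
```

allow_pickle=True is required because action_infos is an array of Python dicts rather than a plain numeric array.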
Usage Examples
Basic wrapping pattern:
import numpy as np
import robosuite
from robosuite.wrappers import DataCollectionWrapper, VisualizationWrapper

# Create base environment
env = robosuite.make(
    "Lift",
    robots="Panda",
    has_renderer=True,
    has_offscreen_renderer=False,
    use_camera_obs=False,
)

# Wrap with visualization (optional) then data collection
env = VisualizationWrapper(env)
env = DataCollectionWrapper(env, "/path/to/data")

# Normal interaction loop - data is recorded automatically
low, high = env.action_spec
for episode in range(10):
    obs = env.reset()
    for step in range(100):
        # Random actions stand in for a real policy or teleoperation device
        action = np.random.uniform(low, high)
        obs, reward, done, info = env.step(action)
        if done:
            break

# Closing the environment flushes any remaining buffered data
env.close()
Custom collection frequency:
# Collect every 5 steps, flush every 50 steps
env = DataCollectionWrapper(
    env,
    directory="/path/to/data",
    collect_freq=5,
    flush_freq=50,
)
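Because data is flushed in chunks, a long episode may span several files. The sketch below stitches one episode's chunks back into a single trajectory, assuming files are named state_<t>.npz (t = environment step at flush time) and carry a "states" array; both assumptions should be checked against your robosuite version. It synthesizes two chunks so it runs standalone.

```python
# Sketch: reassembling an episode from flushed chunks. File naming and the
# "states" key are assumptions - verify against your robosuite version.
import os
import re
import tempfile

import numpy as np

# Synthesize two flush chunks (steps 1-50 and 51-100) for a runnable demo.
ep_dir = tempfile.mkdtemp()
np.savez(os.path.join(ep_dir, "state_50.npz"), states=np.zeros((50, 38)))
np.savez(os.path.join(ep_dir, "state_100.npz"), states=np.ones((50, 38)))

def load_episode_states(ep_dir):
    """Concatenate flushed state chunks in timestep order."""
    def flush_step(fname):
        # Numeric sort so state_100.npz comes after state_50.npz
        return int(re.match(r"state_(\d+)\.npz", fname).group(1))
    fnames = sorted(
        (f for f in os.listdir(ep_dir) if f.startswith("state_")),
        key=flush_step,
    )
    chunks = [np.load(os.path.join(ep_dir, f))["states"] for f in fnames]
    return np.concatenate(chunks, axis=0)

states = load_episode_states(ep_dir)
print(states.shape)  # -> (100, 38)
```

The numeric sort matters: a plain lexicographic sort would order state_100.npz before state_50.npz and scramble the trajectory.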