Heuristic:ARISE Initiative Robomimic Video Recording Optimization
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Debugging |
| Last Updated | 2026-02-15 07:30 GMT |
Overview
Record evaluation rollout videos at 20 FPS with a `video_skip=5` frame skip to reduce file size while maintaining visual interpretability.
Description
During training, robomimic renders evaluation rollout videos for debugging and analysis. To prevent video recording from becoming an I/O bottleneck, the framework records only every 5th environment step (default `video_skip=5`) and writes at a fixed 20 FPS. By default, videos are only saved for checkpointed models (`keep_all_videos=False`), not every evaluation cycle. This approach reduces disk usage and rendering time while providing sufficient visual context to diagnose policy behavior.
Usage
Apply this heuristic when configuring video recording for training experiments. The defaults work well for most scenarios. Increase `video_skip` (e.g., to 10) for very long rollout horizons (700-1000 steps) to keep video files small. Set `keep_all_videos=True` only when debugging specific training dynamics across many epochs.
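The guidance above can be sketched as a small helper that starts from the defaults and applies the two suggested adjustments. This is only an illustration: the dictionary mirrors the config keys named in this article, while `video_settings`, `long_horizon`, and `debug_all_epochs` are hypothetical names that are not part of robomimic.

```python
import copy

# Defaults as listed in this article for the `experiment` config section.
DEFAULT_EXPERIMENT = {
    "render": False,
    "render_video": True,
    "keep_all_videos": False,
    "video_skip": 5,
}


def video_settings(horizon, debug_all_epochs=False, long_horizon=700):
    """Return a copy of the defaults adjusted per the usage guidance.

    `long_horizon` and `debug_all_epochs` are hypothetical knobs for this
    sketch; they are not robomimic options.
    """
    settings = copy.deepcopy(DEFAULT_EXPERIMENT)
    if horizon >= long_horizon:
        # Very long rollouts: record fewer frames to keep files small.
        settings["video_skip"] = 10
    if debug_all_epochs:
        # Keep videos from every evaluation cycle, not just checkpoints.
        settings["keep_all_videos"] = True
    return settings


short = video_settings(horizon=400)
long_run = video_settings(horizon=1000, debug_all_epochs=True)
print(short["video_skip"], short["keep_all_videos"])
print(long_run["video_skip"], long_run["keep_all_videos"])
```

The resulting values would then be written into the `experiment` section of the training config by whatever mechanism the experiment already uses.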
The Insight (Rule of Thumb)
- Action: Use the default video recording settings.
- Value:
  - `video_skip = 5`: record every 5th environment step
  - FPS = 20: playback rate, fixed in the video writer call rather than set through the config
  - `keep_all_videos = False`: only save videos for checkpointed models
  - `render_video = True`: enable video recording
- Trade-off: A higher `video_skip` saves disk space and rendering time but may miss fast transient behaviors. A lower `video_skip` gives smoother videos but increases storage and rendering time.
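To make the frame-skip idea concrete, here is a minimal sketch of a rollout loop that renders only every `video_skip`-th step. It is not robomimic's rollout code: `record_rollout` and `render_frame` are hypothetical stand-ins, and the "frames" are just step indices so the example runs without a simulator.

```python
def record_rollout(horizon, video_skip=5, render_frame=None):
    """Collect one frame every `video_skip` steps of a rollout.

    `render_frame` is a hypothetical callback standing in for an
    environment render call; by default it returns the step index.
    """
    if render_frame is None:
        render_frame = lambda step: step
    frames = []
    for step in range(horizon):
        # A real loop would step the environment here.
        if step % video_skip == 0:
            # Rendering is skipped on all other steps, which is where
            # the time and storage savings come from.
            frames.append(render_frame(step))
    return frames


frames = record_rollout(horizon=20, video_skip=5)
print(frames)  # [0, 5, 10, 15]
```

Any behavior that starts and ends between two recorded steps (for example, a brief contact lasting fewer than 5 steps) would not appear in the video, which is the trade-off noted above.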
Reasoning
From `robomimic/config/base_config.py:109-113`:
    self.experiment.render = False
    self.experiment.render_video = True
    self.experiment.keep_all_videos = False
    self.experiment.video_skip = 5
From `robomimic/utils/train_utils.py:455`:
    video_writer = imageio.get_writer(video_path, fps=20)
At the default 400-step rollout horizon with `video_skip=5`, each video contains 80 frames at 20 FPS, resulting in a 4-second video clip per episode. For the longer TD3+BC horizon of 1000 steps, this produces 200 frames (10-second videos). These durations are sufficient for visual inspection without excessive storage overhead.
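The arithmetic above can be reproduced with a short calculation. This sketch assumes a rollout runs for the full horizon and that a frame is recorded at step 0 and every `video_skip` steps after that; `video_stats` is a hypothetical helper, not a robomimic function.

```python
import math


def video_stats(horizon, video_skip=5, fps=20):
    """Return (frame_count, duration_seconds) for a full-length rollout."""
    # One frame per `video_skip` steps, rounding up for a partial last group.
    frames = math.ceil(horizon / video_skip)
    return frames, frames / fps


for horizon in (400, 1000):
    frames, seconds = video_stats(horizon)
    print(f"horizon={horizon}: {frames} frames, {seconds:.1f} s")
```

With `video_skip=10`, the same 1000-step rollout would yield 100 frames, or 5 seconds of video, which is the effect of the long-horizon suggestion in the Usage section.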