Implementation:ARISE Initiative Robomimic TrainUtils rollout with stats
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Evaluation, Simulation |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
Concrete tool, provided by the robomimic training utilities module, for evaluating trained policies through rollouts in multiple environments, with statistics aggregation and video recording.
Description
The rollout_with_stats function evaluates a RolloutPolicy across multiple environments, running a specified number of episodes in each. It handles video writer creation for per-environment or shared video output, delegates individual rollouts to run_rollout, and aggregates per-episode metrics into averaged statistics.
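Each individual rollout, implemented by run_rollout (lines 275-387 of train_utils.py), is a closed-loop policy-environment interaction: reset the environment, step the policy and environment, record video frames, and check for success. The outer function handles video management, iteration over environments and episodes, and metric aggregation.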
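The reset, step, and check-success cycle described above can be illustrated with a minimal sketch. This is not the robomimic implementation: ToyEnv, toy_policy, and run_toy_rollout are hypothetical names invented for illustration, the environment is a trivial counter task, and video recording is omitted.

```python
from collections import OrderedDict


class ToyEnv:
    """Hypothetical stand-in for an environment: a 1-D counter task."""

    def __init__(self, target=3):
        self.target = target
        self.state = 0

    def reset(self):
        self.state = 0
        return {"pos": self.state}

    def step(self, action):
        self.state += action
        reward = 1.0 if self.state >= self.target else 0.0
        done = False  # this toy task never terminates on its own
        return {"pos": self.state}, reward, done, {}

    def is_success(self):
        return {"task": self.state >= self.target}


def toy_policy(obs):
    """Hypothetical policy: always move one unit forward."""
    return 1


def run_toy_rollout(policy, env, horizon, terminate_on_success=False):
    """Reset, then step until the horizon, env termination, or (optionally) success."""
    obs = env.reset()
    total_reward = 0.0
    success = False
    steps = 0
    for _ in range(horizon):
        action = policy(obs)
        obs, reward, done, _ = env.step(action)
        total_reward += reward
        steps += 1
        # Success is sticky: once reached, the episode counts as successful.
        success = success or env.is_success()["task"]
        if done or (terminate_on_success and success):
            break
    return OrderedDict(Return=total_reward, Horizon=steps, Success_Rate=float(success))


log = run_toy_rollout(toy_policy, ToyEnv(target=3), horizon=10, terminate_on_success=True)
```

With terminate_on_success=True the toy episode stops as soon as the target is reached; with the default it runs for the full horizon and keeps accumulating reward.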
Usage
Call this function periodically during training (e.g., every N epochs) or for final model evaluation. It requires a trained policy wrapped as a RolloutPolicy and a dictionary mapping environment names to environment instances.
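One way to express the "every N epochs" schedule is a small predicate checked inside the training loop. The helper below is a hypothetical sketch, not part of robomimic; the name should_evaluate and the choice to always evaluate on the final epoch are assumptions made for illustration.

```python
def should_evaluate(epoch, rate, num_epochs):
    """Hypothetical schedule: evaluate every `rate` epochs and on the final epoch."""
    return epoch % rate == 0 or epoch == num_epochs


# Epochs (1-indexed) at which rollouts would run for a 10-epoch job with rate=4.
eval_epochs = [e for e in range(1, 11) if should_evaluate(e, rate=4, num_epochs=10)]
```

In a training loop, the call to rollout_with_stats would sit behind this check so that rollout cost is paid only on the selected epochs.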
Code Reference
Source Location
- Repository: robomimic
- File: robomimic/utils/train_utils.py
- Lines: L390-515 (rollout_with_stats), L275-387 (run_rollout)
Signature
def rollout_with_stats(
    policy,
    envs,
    horizon,
    use_goals=False,
    num_episodes=None,
    render=False,
    video_dir=None,
    video_path=None,
    epoch=None,
    video_skip=5,
    terminate_on_success=False,
    verbose=False,
):
    """
    Conduct evaluation rollouts per environment and summarize the results.

    Args:
        policy (RolloutPolicy instance): policy to use for rollouts
        envs (dict): maps env_name (str) to EnvBase instance
        horizon (int): maximum number of steps per rollout
        use_goals (bool): if True, provide goal observations from env
        num_episodes (int): number of rollout episodes per environment
        render (bool): if True, render to screen
        video_dir (str): dump rollout videos to this directory (one per env)
        video_path (str): dump a single rollout video for all environments
        epoch (int): epoch number (used for video naming)
        video_skip (int): how often to write video frame
        terminate_on_success (bool): if True, terminate episode on success
        verbose (bool): if True, print per-episode results

    Returns:
        all_rollout_logs (OrderedDict): averaged rollout statistics per env
        video_paths (OrderedDict): path to rollout videos per env
    """
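The docstring distinguishes video_dir (one video per environment) from video_path (a single video shared by all environments), with epoch used for naming. The sketch below shows one plausible way such arguments could map to per-environment file paths. It is a hypothetical helper, not the library's code: the function name, the "_epoch_N.mp4" naming scheme, and the rule that the two options are mutually exclusive are all assumptions.

```python
import os
from collections import OrderedDict


def resolve_video_paths(env_names, video_dir=None, video_path=None, epoch=None):
    """Hypothetical helper mapping each env name to the video file it writes to."""
    # Assumption: the two video options cannot be combined.
    if video_dir is not None and video_path is not None:
        raise ValueError("specify video_dir or video_path, not both")
    paths = OrderedDict()
    if video_path is not None:
        # One shared file: every environment points at the same path.
        for name in env_names:
            paths[name] = video_path
    elif video_dir is not None:
        # One file per environment; the naming scheme here is assumed.
        suffix = ".mp4" if epoch is None else "_epoch_{}.mp4".format(epoch)
        for name in env_names:
            paths[name] = os.path.join(video_dir, name + suffix)
    return paths


per_env = resolve_video_paths(["lift", "can"], video_dir="videos", epoch=50)
shared = resolve_video_paths(["lift", "can"], video_path="all_envs.mp4")
```

When neither option is given, the helper returns an empty mapping, which corresponds to running rollouts without recording video.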
Import
import robomimic.utils.train_utils as TrainUtils
# Call as:
all_rollout_logs, video_paths = TrainUtils.rollout_with_stats(
    policy=rollout_policy, envs=envs, horizon=horizon, num_episodes=50
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| policy | RolloutPolicy | Yes | Trained policy wrapper for inference |
| envs | dict | Yes | Maps environment names to EnvBase instances |
| horizon | int | Yes | Maximum timesteps per episode |
| use_goals | bool | No | Goal-conditioned evaluation. Default: False |
| num_episodes | int | No | Number of episodes per environment. Default: None; pass an explicit count when evaluating |
| render | bool | No | On-screen rendering. Default: False |
| video_dir | str | No | Directory for per-environment video output. Default: None |
| video_path | str | No | Single video path for all environments. Default: None |
| epoch | int | No | Epoch number for video naming. Default: None |
| video_skip | int | No | Frame skip for video recording. Default: 5 |
| terminate_on_success | bool | No | Early termination on success. Default: False |
| verbose | bool | No | Print per-episode results. Default: False |
Outputs
| Name | Type | Description |
|---|---|---|
| all_rollout_logs | OrderedDict | Per-environment averaged statistics: Return, Horizon, Success_Rate, Time_Episode (minutes) |
| video_paths | OrderedDict | Maps environment names to video file paths |
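The averaging step that produces each environment's entry in all_rollout_logs can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: average_rollout_logs is a hypothetical name, the episode values are made up for the example, and the Time_Episode metric is left out.

```python
from collections import OrderedDict


def average_rollout_logs(episode_logs):
    """Average each metric across a list of per-episode dicts with identical keys."""
    if not episode_logs:
        raise ValueError("need at least one episode")
    averaged = OrderedDict()
    for key in episode_logs[0]:
        values = [log[key] for log in episode_logs]
        averaged[key] = sum(values) / len(values)
    return averaged


# Made-up per-episode results for one environment.
episodes = [
    {"Return": 1.0, "Horizon": 120, "Success_Rate": 1.0},
    {"Return": 0.0, "Horizon": 400, "Success_Rate": 0.0},
    {"Return": 1.0, "Horizon": 80, "Success_Rate": 1.0},
    {"Return": 0.0, "Horizon": 400, "Success_Rate": 0.0},
]
summary = average_rollout_logs(episodes)
```

Because each episode reports success as 0.0 or 1.0, the mean of that field is the fraction of successful episodes, which is why the averaged key reads as a success rate.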
Usage Examples
Training Evaluation
import robomimic.utils.train_utils as TrainUtils
from robomimic.algo import RolloutPolicy
# Wrap trained model as rollout policy
rollout_policy = RolloutPolicy(model, obs_normalization_stats=obs_normalization_stats)
# Evaluate across environments
all_rollout_logs, video_paths = TrainUtils.rollout_with_stats(
    policy=rollout_policy,
    envs=envs,
    horizon=config.experiment.rollout.horizon,
    use_goals=config.use_goals,
    num_episodes=config.experiment.rollout.n,
    render=False,
    video_dir=video_dir,
    epoch=epoch,
    video_skip=5,
    terminate_on_success=config.experiment.rollout.terminate_on_success,
)
# Log results
for env_name, logs in all_rollout_logs.items():
    print(f"{env_name}: Success={logs['Success_Rate']:.2f}, Return={logs['Return']:.2f}")
Related Pages
Implements Principle
Requires Environment
- Environment:ARISE_Initiative_Robomimic_PyTorch_CUDA_Environment
- Environment:ARISE_Initiative_Robomimic_Robosuite_Simulation_Backend