
Implementation:ARISE Initiative Robomimic TrainUtils rollout with stats

From Leeroopedia
Knowledge Sources
Domains Robotics, Evaluation, Simulation
Last Updated 2026-02-15 08:00 GMT

Overview

A concrete utility from the robomimic training utilities module for conducting multi-environment rollout evaluation of trained policies, with statistics aggregation and video recording.

Description

The rollout_with_stats function evaluates a RolloutPolicy across multiple environments, running a specified number of episodes in each. It handles video writer creation for per-environment or shared video output, delegates individual rollouts to run_rollout, and aggregates per-episode metrics into averaged statistics.

Each single rollout (via run_rollout at L275-387) implements the closed-loop policy-environment interaction: reset, step, record, check success. The outer function handles video management, iteration, and metric aggregation.
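The closed-loop interaction described above can be sketched as follows. This is a simplified illustration, not robomimic's actual run_rollout: ToyEnv, ToyPolicy, and run_rollout_sketch are hypothetical stand-ins, and the real function handles observations, goals, and video frames in more detail.

```python
from collections import OrderedDict

def run_rollout_sketch(policy, env, horizon, terminate_on_success=False):
    # Reset policy internal state (e.g. RNN hidden state) and the environment.
    policy.start_episode()
    ob = env.reset()
    total_reward = 0.0
    success = False
    for step in range(horizon):
        ac = policy(ob)                # query action from current observation
        ob, r, done, _ = env.step(ac)  # advance the simulation one step
        total_reward += r
        success = success or env.is_success()
        if done or (terminate_on_success and success):
            break
    # Per-episode metrics, matching the fields that get aggregated later.
    return OrderedDict(Return=total_reward, Horizon=step + 1,
                       Success_Rate=float(success))

# Minimal stand-in environment and policy, for illustration only.
class ToyEnv:
    def __init__(self): self.t = 0
    def reset(self): self.t = 0; return 0
    def step(self, ac): self.t += 1; return self.t, 1.0, self.t >= 3, {}
    def is_success(self): return self.t >= 3

class ToyPolicy:
    def start_episode(self): pass
    def __call__(self, ob): return 0

stats = run_rollout_sketch(ToyPolicy(), ToyEnv(), horizon=10)
# stats -> OrderedDict([('Return', 3.0), ('Horizon', 3), ('Success_Rate', 1.0)])
```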

Usage

Call this function periodically during training (e.g., every N epochs) or for final model evaluation. Requires a trained policy wrapped as RolloutPolicy and a dictionary of environment instances.
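The "every N epochs" pattern can be sketched as below. The helper names (run_training, train_one_epoch, evaluate) are illustrative stand-ins, not robomimic config keys or APIs.

```python
# Illustrative training loop with periodic rollout evaluation:
# evaluate whenever the epoch number is a multiple of rollout_interval.
def run_training(num_epochs, rollout_interval, train_one_epoch, evaluate):
    history = []
    for epoch in range(1, num_epochs + 1):
        train_one_epoch(epoch)
        if epoch % rollout_interval == 0:
            # In real code this is where rollout_with_stats would be called.
            history.append((epoch, evaluate(epoch)))
    return history

log = run_training(
    num_epochs=5, rollout_interval=2,
    train_one_epoch=lambda e: None,
    evaluate=lambda e: {"Success_Rate": 0.5},
)
# evaluation runs at epochs 2 and 4
```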

Code Reference

Source Location

  • Repository: robomimic
  • File: robomimic/utils/train_utils.py
  • Lines: L390-515 (rollout_with_stats), L275-387 (run_rollout)

Signature

def rollout_with_stats(
    policy,
    envs,
    horizon,
    use_goals=False,
    num_episodes=None,
    render=False,
    video_dir=None,
    video_path=None,
    epoch=None,
    video_skip=5,
    terminate_on_success=False,
    verbose=False,
):
    """
    Conduct evaluation rollouts per environment and summarize the results.

    Args:
        policy (RolloutPolicy instance): policy to use for rollouts
        envs (dict): maps env_name (str) to EnvBase instance
        horizon (int): maximum number of steps per rollout
        use_goals (bool): if True, provide goal observations from env
        num_episodes (int): number of rollout episodes per environment
        render (bool): if True, render to screen
        video_dir (str): dump rollout videos to this directory (one per env)
        video_path (str): dump a single rollout video for all environments
        epoch (int): epoch number (used for video naming)
        video_skip (int): write a video frame every video_skip steps
        terminate_on_success (bool): if True, terminate episode on success
        verbose (bool): if True, print per-episode results

    Returns:
        all_rollout_logs (OrderedDict): averaged rollout statistics per env
        video_paths (OrderedDict): path to rollout videos per env
    """

Import

import robomimic.utils.train_utils as TrainUtils

# Call as:
all_rollout_logs, video_paths = TrainUtils.rollout_with_stats(
    policy=rollout_policy, envs=envs, horizon=horizon, num_episodes=50
)

I/O Contract

Inputs

Name Type Required Description
policy RolloutPolicy Yes Trained policy wrapper for inference
envs dict Yes Maps environment names to EnvBase instances
horizon int Yes Maximum timesteps per episode
use_goals bool No Goal-conditioned evaluation. Default: False
num_episodes int No Number of episodes per environment
render bool No On-screen rendering. Default: False
video_dir str No Directory for per-environment video output
video_path str No Single video path for all environments
epoch int No Epoch number for video naming
video_skip int No Frame skip for video recording. Default: 5
terminate_on_success bool No Early termination on success. Default: False
verbose bool No Print per-episode results. Default: False

Outputs

Name Type Description
all_rollout_logs OrderedDict Per-environment averaged statistics: Return, Horizon, Success_Rate, Time_Episode (minutes)
video_paths OrderedDict Maps environment names to video file paths
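The per-environment averaging behind all_rollout_logs can be approximated as below. This is a sketch of the aggregation step only: robomimic averages per-episode metrics with numpy, and the field names here follow the table above.

```python
from collections import OrderedDict

def average_episode_stats(rollout_logs):
    # rollout_logs: list of per-episode metric dicts for one environment.
    # Returns one dict with each metric averaged across episodes.
    keys = rollout_logs[0].keys()
    return OrderedDict(
        (k, sum(ep[k] for ep in rollout_logs) / len(rollout_logs))
        for k in keys
    )

episodes = [
    {"Return": 10.0, "Horizon": 50, "Success_Rate": 1.0},
    {"Return": 4.0, "Horizon": 80, "Success_Rate": 0.0},
]
avg = average_episode_stats(episodes)
# avg -> {'Return': 7.0, 'Horizon': 65.0, 'Success_Rate': 0.5}
```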

Usage Examples

Training Evaluation

import robomimic.utils.train_utils as TrainUtils
from robomimic.algo import RolloutPolicy

# Wrap trained model as rollout policy
rollout_policy = RolloutPolicy(model, obs_normalization_stats=obs_normalization_stats)

# Evaluate across environments
all_rollout_logs, video_paths = TrainUtils.rollout_with_stats(
    policy=rollout_policy,
    envs=envs,
    horizon=config.experiment.rollout.horizon,
    use_goals=config.use_goals,
    num_episodes=config.experiment.rollout.n,
    render=False,
    video_dir=video_dir,
    epoch=epoch,
    video_skip=5,
    terminate_on_success=config.experiment.rollout.terminate_on_success,
)

# Log results
for env_name, logs in all_rollout_logs.items():
    print(f"{env_name}: Success={logs['Success_Rate']:.2f}, Return={logs['Return']:.2f}")
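A common follow-up is to use the returned success rates for model selection. The helper below is hypothetical (not part of robomimic); it simply averages Success_Rate across environments so the caller can decide whether to save a new best checkpoint.

```python
# Hypothetical model-selection helper: average per-environment success
# rates from the all_rollout_logs structure returned by rollout_with_stats.
def mean_success_rate(all_rollout_logs):
    rates = [logs["Success_Rate"] for logs in all_rollout_logs.values()]
    return sum(rates) / len(rates)

all_rollout_logs = {
    "Lift": {"Success_Rate": 0.9, "Return": 120.0},
    "Can": {"Success_Rate": 0.6, "Return": 80.0},
}
score = mean_success_rate(all_rollout_logs)  # 0.75
# e.g. save a checkpoint if score beats the previous best
```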

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
