Principle:ARISE Initiative Robomimic Rollout Execution Eval
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Evaluation, Simulation |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
A detailed rollout execution pattern for post-training policy evaluation that captures full trajectory data (states, actions, rewards, observations) and supports video recording and trajectory export.
Description
Rollout Execution for Evaluation extends the basic rollout concept from the training workflow with additional data capture capabilities. While training rollouts (via rollout_with_stats) focus on aggregate metrics, evaluation rollouts capture the complete trajectory including simulation states, actions, observations, and optionally video. This detailed data enables:
- Quantitative analysis: Compute success rate, return, and task-specific metrics
- Qualitative inspection: Record videos from multiple camera viewpoints
- Dataset generation: Export trajectories to HDF5 for further analysis or fine-tuning
- Reproducibility: Save initial states and simulation states for exact replay
This principle differs from the training rollout (Rollout_Evaluation) in that it returns the full trajectory dictionary rather than just aggregate statistics, and supports multi-camera video concatenation.
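As a minimal sketch of the quantitative-analysis use, the per-episode statistics returned by several evaluation rollouts can be averaged into summary metrics. The helper name `aggregate_rollout_stats`, the `Num_Success` key, and the example numbers are assumptions made for illustration, not part of the library's API.

```python
from statistics import mean


def aggregate_rollout_stats(per_episode_stats):
    """Average each metric over episodes and count successes.

    `per_episode_stats` is a list of flat dicts, one per rollout, e.g.
    {"Return": 1.0, "Horizon": 42, "Success_Rate": 1.0}.
    """
    if not per_episode_stats:
        raise ValueError("need at least one episode")
    keys = per_episode_stats[0].keys()
    summary = {k: mean(float(ep[k]) for ep in per_episode_stats) for k in keys}
    # Per-episode Success_Rate is 0.0 or 1.0, so the sum is the success count.
    summary["Num_Success"] = sum(ep["Success_Rate"] for ep in per_episode_stats)
    return summary


# Made-up statistics for four episodes (illustrative values only).
episodes = [
    {"Return": 1.0, "Horizon": 40, "Success_Rate": 1.0},
    {"Return": 0.0, "Horizon": 100, "Success_Rate": 0.0},
    {"Return": 1.0, "Horizon": 60, "Success_Rate": 1.0},
    {"Return": 1.0, "Horizon": 80, "Success_Rate": 1.0},
]
summary = aggregate_rollout_stats(episodes)
print(summary)
```

Averaging the binary per-episode `Success_Rate` values yields the overall success rate, which is why each rollout reports it as a float rather than a boolean.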
Usage
Use this principle in standalone evaluation scripts (such as run_trained_agent.py) after loading a checkpoint and reconstructing the environment. The rollout it describes is the core evaluation routine for post-training analysis.
Theoretical Basis
# Abstract evaluation rollout (simplified sketch, not the real implementation)
def rollout(policy, env, horizon):
    obs = env.reset()
    state = env.get_state()  # saved so the episode can be replayed exactly
    trajectory = {"actions": [], "rewards": [], "states": [], "initial_state": state}
    success = False
    for t in range(horizon):
        action = policy(obs)
        next_obs, reward, done, info = env.step(action)
        # Assumes is_success() returns a dict of flags with a "task" entry
        success = env.is_success()["task"]
        trajectory["actions"].append(action)
        trajectory["rewards"].append(reward)
        # Record the state the action was taken in, so states[i] pairs with actions[i]
        trajectory["states"].append(state["states"])
        if done or success:
            break
        obs = next_obs
        state = env.get_state()
    stats = {"Return": sum(trajectory["rewards"]),
             "Success_Rate": float(success)}
    return stats, trajectory
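To make the reproducibility claim concrete, the following self-contained sketch runs an evaluation rollout against a toy environment and then replays it from the saved initial state. `CountingEnv` and `replay` are hypothetical names introduced for illustration. The rollout is repeated in compact form so the block runs on its own, and it assumes that `is_success()` returns a dict with a `"task"` flag, that `reset_to()` restores a saved state, and that each stored state is the one the matching action was taken in.

```python
class CountingEnv:
    """Hypothetical 1-D toy environment: the state is an integer position."""

    def __init__(self, goal=3):
        self.goal = goal
        self.pos = 0

    def reset(self):
        self.pos = 0
        return {"pos": self.pos}

    def reset_to(self, state):
        # Restore a previously saved simulation state.
        self.pos = state["states"]
        return {"pos": self.pos}

    def get_state(self):
        return {"states": self.pos}

    def step(self, action):
        self.pos += action
        reward = 1.0 if self.pos == self.goal else 0.0
        return {"pos": self.pos}, reward, False, {}

    def is_success(self):
        return {"task": self.pos == self.goal}


def rollout(policy, env, horizon):
    obs = env.reset()
    state = env.get_state()
    traj = {"actions": [], "rewards": [], "states": [], "initial_state": state}
    success = False
    for _ in range(horizon):
        action = policy(obs)
        next_obs, reward, done, _info = env.step(action)
        success = env.is_success()["task"]
        traj["actions"].append(action)
        traj["rewards"].append(reward)
        traj["states"].append(state["states"])  # state the action was taken in
        if done or success:
            break
        obs, state = next_obs, env.get_state()
    stats = {"Return": sum(traj["rewards"]), "Horizon": len(traj["actions"]),
             "Success_Rate": float(success)}
    return stats, traj


def replay(env, traj):
    """Reset to the saved initial state and re-apply the recorded actions."""
    env.reset_to(traj["initial_state"])
    visited = []
    for action in traj["actions"]:
        visited.append(env.get_state()["states"])
        env.step(action)
    return visited


env = CountingEnv(goal=3)
stats, traj = rollout(lambda obs: 1, env, horizon=10)  # policy always moves +1
replayed_states = replay(env, traj)
print(stats)
print(replayed_states == traj["states"])
```

Because the environment here is deterministic, replaying the recorded actions from the saved initial state visits the same states as the original rollout. A real simulator needs the same property for exact replay to hold.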