Principle: ARISE Initiative Robomimic Training Loop Execution
| Knowledge Sources | |
|---|---|
| Domains | Robotics, Training, Optimization |
| Last Updated | 2026-02-15 08:00 GMT |
Overview
An epoch-based training loop pattern that iterates over batched demonstration data, performing forward and backward passes through an algorithm's networks while collecting training metrics.
Description
Training Loop Execution is the core optimization step in offline robot learning. It follows the standard supervised learning paradigm: iterate over mini-batches from a DataLoader, compute loss, and update model parameters. However, it abstracts the specific loss computation and parameter update logic into the algorithm's train_on_batch method, allowing the same loop to work with diverse algorithm families.
The training loop handles:
- Batch processing: Loads batches from the DataLoader, handles epoch boundaries with iterator reset
- Observation normalization: Optionally applies observation normalization statistics before training
- Algorithm-agnostic interface: Calls model.process_batch_for_training, model.train_on_batch, and model.log_info without knowing algorithm specifics
- Validation mode: The same loop supports validation by disabling gradient computation
- Fixed-step epochs: Optionally limits the number of gradient steps per epoch (useful when datasets are very large)
- Timing statistics: Tracks time spent in data loading, batch processing, training, and logging
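The epoch-boundary handling from the first bullet can be sketched in plain Python. This is a hedged illustration, not robomimic's actual code: `get_next_batch` is a hypothetical helper name, and any iterable stands in for the DataLoader.

```python
def get_next_batch(loader, loader_iter):
    """Fetch the next batch, resetting the iterator at epoch boundaries.

    `loader` can be any re-iterable (e.g. a torch DataLoader). When the
    current iterator is exhausted, a fresh one is created so that
    fixed-step epochs can keep drawing batches past a full dataset pass.
    """
    try:
        batch = next(loader_iter)
    except StopIteration:
        loader_iter = iter(loader)  # dataset exhausted: restart and continue
        batch = next(loader_iter)
    return batch, loader_iter
```

With a 3-batch loader and 5 fixed gradient steps, the loop wraps around and replays batches from the start of the dataset, which is exactly why fixed-step epochs are useful for very large datasets.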
Usage
Apply this principle in the inner loop of training: the loop runs once per epoch, for both the training pass and the validation pass. It requires a fully instantiated algorithm and a DataLoader wrapping a SequenceDataset.
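The once-per-epoch call pattern can be sketched as follows. `DummyAlgo` and `run_epoch` are illustrative stand-ins for the algorithm object and the epoch loop, not robomimic's actual interface; a real algorithm would run forward/backward passes and skip the gradient update when `validate=True`.

```python
class DummyAlgo:
    """Stand-in for a fully instantiated algorithm."""

    def train_on_batch(self, batch, epoch, validate=False):
        # A real algorithm computes losses here and only steps the
        # optimizer when validate is False.
        return {"loss": float(batch), "validate": validate}


def run_epoch(model, loader, epoch, validate=False):
    """Minimal inner loop: one pass over the DataLoader."""
    return [model.train_on_batch(b, epoch, validate=validate) for b in loader]


# Driver: the same loop serves both training and validation each epoch.
model = DummyAlgo()
for epoch in range(1, 3):
    train_infos = run_epoch(model, [1, 2, 3], epoch, validate=False)
    valid_infos = run_epoch(model, [4], epoch, validate=True)
```

The design point this illustrates: because loss computation lives inside `train_on_batch`, the driver never branches on the algorithm family, only on the train/validate flag.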
Theoretical Basis
The training loop implements the standard mini-batch stochastic gradient descent pattern adapted for offline learning:
```python
# Abstract training loop (not the real implementation)
for step in range(num_steps):
    batch = next(data_loader_iter)

    # Algorithm-specific batch preprocessing
    input_batch = model.process_batch_for_training(batch)
    input_batch = model.postprocess_batch_for_training(input_batch, obs_normalization_stats)

    # Forward + backward + optimizer step (all encapsulated)
    info = model.train_on_batch(input_batch, epoch, validate=False)

    # Learning rate scheduling
    model.on_gradient_step()

    # Collect metrics
    step_log = model.log_info(info)
    all_logs.append(step_log)

# Return averaged metrics
return {k: mean(v) for k, v in aggregate(all_logs)}
```
The key design decision is encapsulating the loss function and optimizer step inside train_on_batch, making the outer loop algorithm-agnostic.
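The final aggregation step in the abstract loop can be made concrete. This sketch assumes every per-step log is a dict of scalar metrics sharing the same keys; `average_step_logs` is a hypothetical helper name.

```python
from statistics import mean


def average_step_logs(all_logs):
    """Collapse a list of per-step metric dicts into one dict of means.

    Each entry in `all_logs` maps metric names (e.g. "Loss") to scalars;
    the result maps each name to its average across all steps in the epoch.
    """
    aggregated = {}
    for step_log in all_logs:
        for key, value in step_log.items():
            aggregated.setdefault(key, []).append(value)
    return {key: mean(values) for key, values in aggregated.items()}
```

Averaging over the epoch smooths per-batch noise, so the logged curves reflect epoch-level trends rather than individual mini-batch variance.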