
Principle:ARISE Initiative Robomimic Training Loop Execution

From Leeroopedia
Knowledge Sources
Domains Robotics, Training, Optimization
Last Updated 2026-02-15 08:00 GMT

Overview

An epoch-based training loop pattern that iterates over batched demonstration data, performing forward and backward passes through an algorithm's networks while collecting training metrics.

Description

Training Loop Execution is the core optimization step in offline robot learning. It follows the standard supervised learning paradigm: iterate over mini-batches from a DataLoader, compute loss, and update model parameters. However, it abstracts the specific loss computation and parameter update logic into the algorithm's train_on_batch method, allowing the same loop to work with diverse algorithm families.

The training loop handles:

  • Batch processing: Loads batches from the DataLoader, handles epoch boundaries with iterator reset
  • Observation normalization: Optionally applies observation normalization statistics before training
  • Algorithm-agnostic interface: Calls model.process_batch_for_training, model.train_on_batch, and model.log_info without knowing algorithm specifics
  • Validation mode: The same loop supports validation by disabling gradient computation
  • Fixed-step epochs: Optionally limits the number of gradient steps per epoch (useful when datasets are very large)
  • Timing statistics: Tracks time spent in data loading, batch processing, training, and logging
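The batch-processing behavior above, drawing a fixed number of batches and restarting the iterator at epoch boundaries, can be sketched as a small helper. This is a minimal illustration, not robomimic's actual API; the name `iter_fixed_steps` is invented for this example:

```python
def iter_fixed_steps(data_loader, num_steps, data_loader_iter=None):
    """Yield exactly num_steps batches, restarting the DataLoader
    iterator whenever the underlying dataset is exhausted."""
    if data_loader_iter is None:
        data_loader_iter = iter(data_loader)
    for _ in range(num_steps):
        try:
            batch = next(data_loader_iter)
        except StopIteration:
            # epoch boundary: recreate the iterator and keep going
            data_loader_iter = iter(data_loader)
            batch = next(data_loader_iter)
        yield batch

# a plain list stands in for a DataLoader here
batches = list(iter_fixed_steps([1, 2, 3], num_steps=5))
```

Decoupling the step count from the dataset length is what makes fixed-step epochs practical on very large datasets: epoch duration stays constant regardless of how many demonstrations are loaded.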

Usage

Use this principle in the inner loop of training, invoked once per epoch for both training and validation passes. It requires a fully instantiated algorithm and a DataLoader wrapping a SequenceDataset.
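The algorithm-agnostic contract can be sketched with a stub algorithm that implements only the three methods the loop calls. `StubAlgo` and the simplified `run_epoch` below are illustrative stand-ins, not robomimic's real classes or signatures:

```python
class StubAlgo:
    """Toy stand-in for a robomimic algorithm; only the methods the
    epoch loop relies on are implemented (signatures illustrative)."""

    def process_batch_for_training(self, batch):
        # real algorithms move tensors to device, slice sequence dims, etc.
        return batch

    def train_on_batch(self, batch, epoch, validate=False):
        # loss computation + backward pass + optimizer step live here;
        # validate=True would skip the gradient update
        return {"loss": sum(batch) / len(batch)}

    def log_info(self, info):
        return {"Loss": info["loss"]}


def run_epoch(model, data_loader, epoch, validate=False):
    """One pass over the loader; identical for training and validation."""
    step_logs = []
    for batch in data_loader:
        input_batch = model.process_batch_for_training(batch)
        info = model.train_on_batch(input_batch, epoch, validate=validate)
        step_logs.append(model.log_info(info))
    return step_logs


loader = [[1.0, 2.0], [3.0, 5.0]]   # stands in for a DataLoader
train_logs = run_epoch(StubAlgo(), loader, epoch=0)
valid_logs = run_epoch(StubAlgo(), loader, epoch=0, validate=True)
```

Because the loop never inspects what `train_on_batch` does internally, the same `run_epoch` serves behavior cloning, IQL, diffusion policies, or any other algorithm family that implements the interface.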

Theoretical Basis

The training loop implements the standard mini-batch stochastic gradient descent pattern adapted for offline learning:

# Abstract training loop (not real implementation)
for step in range(num_steps):
    batch = next(data_loader_iter)  # iterator reset at epoch boundaries omitted for brevity

    # Algorithm-specific batch preprocessing
    input_batch = model.process_batch_for_training(batch)
    input_batch = model.postprocess_batch_for_training(input_batch, obs_normalization_stats)

    # Forward + backward + optimizer step (all encapsulated)
    info = model.train_on_batch(input_batch, epoch, validate=False)

    # Learning rate scheduling
    model.on_gradient_step()

    # Collect metrics
    step_log = model.log_info(info)
    all_logs.append(step_log)

# Return metrics averaged over all steps in the epoch
return {k: mean(v) for k, v in aggregate(all_logs).items()}

The key design decision is encapsulating the loss function and optimizer step inside train_on_batch, making the outer loop algorithm-agnostic.
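The final aggregation step, averaging per-step logs into epoch-level metrics, can be made concrete as follows. `aggregate_step_logs` is an illustrative name for this sketch, not robomimic's API:

```python
from statistics import mean

def aggregate_step_logs(all_logs):
    """Average per-step scalar logs into one epoch-level dict,
    as in the final `return` line of the pseudocode above."""
    collected = {}
    for step_log in all_logs:
        for k, v in step_log.items():
            collected.setdefault(k, []).append(v)
    return {k: mean(vs) for k, vs in collected.items()}

epoch_log = aggregate_step_logs([{"Loss": 1.0, "Grad_Norm": 0.5},
                                 {"Loss": 3.0, "Grad_Norm": 1.5}])
```

Averaging over the whole epoch smooths out per-batch noise, which makes the logged curves more useful for comparing runs or scheduling decisions than raw per-step values.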

Related Pages

Implemented By
