
Principle:OpenGVLab InternVL Supervised Training Loop

From Leeroopedia


Knowledge Sources
Domains Training, Deep_Learning, Distributed_Computing
Last Updated 2026-02-07 00:00 GMT

Overview

A managed training loop that handles gradient computation, optimization, distributed training, checkpointing, and logging for supervised fine-tuning of vision-language models.

Description

The supervised training loop abstracts the boilerplate of training large models in distributed settings. Rather than writing a custom PyTorch training loop, the framework delegates to HuggingFace's Trainer class which provides:

  • Gradient accumulation: Simulates larger batch sizes across multiple forward passes
  • Distributed training: Integration with DeepSpeed ZeRO for memory-efficient multi-GPU training
  • Mixed precision: BF16/FP16 training for reduced memory and faster computation
  • Checkpointing: Periodic model saving with configurable strategies
  • Logging: Training metrics tracked via TensorBoard or Weights & Biases
  • Resume from checkpoint: Seamless training continuation after interruptions

The training loop operates on data produced by the data collator, which batches and pads variable-length multimodal sequences.
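As a minimal sketch of what such a collator does (illustrative only, not InternVL's actual collator; `PAD_TOKEN_ID` is a hypothetical placeholder for the tokenizer's real pad id, and `-100` is the standard HuggingFace ignore index):

```python
IGNORE_INDEX = -100  # label positions with this value are excluded from the loss
PAD_TOKEN_ID = 0     # hypothetical pad id; the real value comes from the tokenizer

def collate(batch):
    """Right-pad variable-length examples to the longest sequence in the batch."""
    max_len = max(len(ex["input_ids"]) for ex in batch)
    input_ids, labels, attention_mask = [], [], []
    for ex in batch:
        pad = max_len - len(ex["input_ids"])
        input_ids.append(ex["input_ids"] + [PAD_TOKEN_ID] * pad)
        # padded label positions get IGNORE_INDEX so they never contribute to loss
        labels.append(ex["labels"] + [IGNORE_INDEX] * pad)
        attention_mask.append([1] * len(ex["input_ids"]) + [0] * pad)
    return {"input_ids": input_ids, "labels": labels, "attention_mask": attention_mask}
```

The same padded batch is then handed to the Trainer's forward pass unchanged.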

Usage

Use this principle when performing supervised fine-tuning (full parameter or LoRA) on InternVL models. The Trainer handles all aspects of the training loop; the user only needs to configure the model, dataset, and training arguments.
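A hedged configuration sketch of that setup using the HuggingFace `Trainer` API (the argument values are illustrative, not InternVL's official recipe; `model`, `train_dataset`, `collator`, and the DeepSpeed config path `ds_zero2.json` are assumed to exist):

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="./checkpoints",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,   # simulates a larger effective batch size
    bf16=True,                       # mixed-precision training
    save_strategy="steps",
    save_steps=500,                  # periodic checkpointing
    logging_steps=10,
    report_to="tensorboard",
    deepspeed="ds_zero2.json",       # hypothetical DeepSpeed ZeRO config path
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    data_collator=collator,
)
trainer.train(resume_from_checkpoint=True)  # continue after an interruption
```

Everything listed in the Description section (accumulation, mixed precision, checkpointing, logging, resumption) is driven by these arguments rather than hand-written loop code.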

Theoretical Basis

The supervised training objective minimizes cross-entropy loss on the assistant's response tokens:

$$\mathcal{L}_{\mathrm{SFT}} = -\sum_{t} \mathbb{1}[t \in \text{assistant}] \, \log p_\theta(x_t \mid x_{<t})$$

Human turn tokens and image tokens are masked (label = -100) and excluded from loss computation.
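A minimal sketch of this masking rule (the role strings and token lists are hypothetical; the real implementation operates on tokenized conversation turns):

```python
IGNORE_INDEX = -100  # HuggingFace convention: labels of -100 are skipped by the loss

def build_labels(token_ids, roles):
    """Copy token ids into labels, masking everything that is not assistant output."""
    return [tid if role == "assistant" else IGNORE_INDEX
            for tid, role in zip(token_ids, roles)]
```

Only assistant-response positions retain their token id, so the cross-entropy sum above effectively runs over assistant tokens alone.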

The training loop with DeepSpeed ZeRO:

# Pseudo-code: Managed training loop
for step, batch in enumerate(dataloader, start=1):
    # Forward pass with mixed precision (BF16)
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        loss = model(input_ids, labels, pixel_values, image_flags).loss

    # Backward pass; scale loss so accumulated gradients average correctly
    (loss / gradient_accumulation_steps).backward()

    # Optimizer step only after the configured number of accumulation steps
    if step % gradient_accumulation_steps == 0:
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()

    # Periodic checkpointing
    if step % save_steps == 0:
        save_checkpoint()
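Under accumulation, the batch size seen by each optimizer step also scales with the number of data-parallel GPUs. A small worked example (the values are illustrative, not InternVL defaults):

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_gpus):
    """Number of samples contributing to each optimizer step
    under gradient accumulation and data parallelism."""
    return per_device_batch * grad_accum_steps * num_gpus

# e.g. 4 samples per GPU, 8 accumulation steps, 8 GPUs
#      -> 256 samples per optimizer step
```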

Related Pages

Implemented By

Uses Heuristic
