
Implementation:OpenGVLab InternVL Pretrain Main

From Leeroopedia


Knowledge Sources
Domains Pretraining, Training
Last Updated 2026-02-07 00:00 GMT

Overview

A concrete tool, provided by the InternVL training framework, for executing the multi-stage pretraining of InternVL models.

Description

The internvl_chat_pretrain.py script implements all three pretraining stages. It differs from the finetune script by:

  • Supporting Path B model loading (separate vision + LLM components)
  • Implementing rank-aware data distribution optimized for large-scale training
  • Always using packed training mode for efficient GPU utilization
  • Supporting stage-specific freeze configurations via command-line arguments

The same script is invoked for all three stages with different configuration arguments.

Usage

The script is launched three times in sequence, via different shell scripts for Stages 1, 1.5, and 2. Each stage loads the previous stage's output checkpoint.

Code Reference

Source Location

  • Repository: InternVL
  • File: internvl_chat/internvl/train/internvl_chat_pretrain.py
  • Lines: L842-1116

Signature

# Stage 1: MLP warmup
python internvl_chat_pretrain.py \
    --vision_path ./pretrained/InternViT-300M \
    --llm_path ./pretrained/internlm2_5-7b-chat \
    --freeze_backbone True --freeze_llm True --freeze_mlp False \
    --drop_path_rate 0.0 --learning_rate 2e-4 --max_steps 100000

# Stage 1.5: ViT incremental
python internvl_chat_pretrain.py \
    --model_name_or_path ./output/stage1 \
    --freeze_backbone False --freeze_llm True --freeze_mlp False \
    --drop_path_rate 0.1 --learning_rate 1e-5 --max_steps 100000

# Stage 2: Full instruction tuning
python internvl_chat_pretrain.py \
    --model_name_or_path ./output/stage1_5 \
    --freeze_backbone False --freeze_llm False --freeze_mlp False \
    --drop_path_rate 0.1 --learning_rate 4e-5 --max_steps 5500
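The per-stage settings above can be captured as data, which makes the freeze/learning-rate schedule easier to compare at a glance. This is a hedged summary of the flags shown in the commands; the dictionary keys and stage names are ours, not an API of the script.

```python
# Stage schedule distilled from the Signature commands above.
# Structure and key names are illustrative, not from the repo.
STAGES = [
    {"name": "stage1",   "freeze": {"backbone": True,  "llm": True,  "mlp": False},
     "drop_path": 0.0, "lr": 2e-4, "max_steps": 100_000},
    {"name": "stage1_5", "freeze": {"backbone": False, "llm": True,  "mlp": False},
     "drop_path": 0.1, "lr": 1e-5, "max_steps": 100_000},
    {"name": "stage2",   "freeze": {"backbone": False, "llm": False, "mlp": False},
     "drop_path": 0.1, "lr": 4e-5, "max_steps": 5_500},
]

# Note the pattern: each stage unfreezes strictly more of the model,
# while the learning rate drops sharply after the MLP warmup.
for s in STAGES:
    print(s["name"], s["freeze"], s["lr"])
```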

Import

cd internvl_chat
# Launched via deepspeed or torchrun with stage-specific shell scripts
bash shell/internvl2.5/stage1/internvl2_5_8b_internlm2_5_7b_dynamic_res_stage1.sh

I/O Contract

Inputs

Name                       Type  Required       Description
--vision_path              str   Stage 1 only   Path to pretrained InternViT
--llm_path                 str   Stage 1 only   Path to pretrained LLM
--model_name_or_path       str   Stages 1.5, 2  Previous stage output checkpoint
--freeze_backbone/llm/mlp  bool  Yes            Per-stage freeze configuration
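How the three freeze flags typically take effect can be sketched as requires_grad toggles over the model's parameter groups. This is illustrative only, not the repository's code; the module prefixes ("vision_model", "language_model", "mlp1") are assumptions about naming.

```python
# Illustrative freeze logic: flip requires_grad per sub-module based
# on the per-stage flags. A tiny Param stub stands in for a tensor.
class Param:
    def __init__(self):
        self.requires_grad = True

def apply_freeze(named_params, freeze_backbone, freeze_llm, freeze_mlp):
    for name, p in named_params.items():
        if name.startswith("vision_model"):      # assumed ViT prefix
            p.requires_grad = not freeze_backbone
        elif name.startswith("language_model"):  # assumed LLM prefix
            p.requires_grad = not freeze_llm
        elif name.startswith("mlp1"):            # assumed projector prefix
            p.requires_grad = not freeze_mlp

# Stage 1 configuration: only the MLP projector trains.
params = {n: Param() for n in ("vision_model.w", "language_model.w", "mlp1.w")}
apply_freeze(params, freeze_backbone=True, freeze_llm=True, freeze_mlp=False)
```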

Outputs

Name        Type       Description
Checkpoint  Directory  Trained model checkpoint for the next stage or deployment

Usage Examples

Complete 3-Stage Pipeline

cd internvl_chat

# Stage 1: MLP warmup (512 GPUs, ~100K steps)
bash shell/internvl2.5/stage1/internvl2_5_8b_internlm2_5_7b_dynamic_res_stage1.sh

# Stage 1.5: ViT incremental (512 GPUs, ~100K steps)
bash shell/internvl2.5/stage1.5/internvl2_5_8b_internlm2_5_7b_dynamic_res_stage1_5.sh

# Stage 2: Instruction tuning (512 GPUs, ~5.5K steps)
bash shell/internvl2.5/stage2/internvl2_5_8b_internlm2_5_7b_dynamic_res_stage2.sh
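Because each stage consumes the previous stage's checkpoint, the three launches above form a strict chain: a failed stage makes the later ones meaningless. A hypothetical driver (ours, not part of the repo) makes that dependency explicit by stopping at the first failure.

```python
# Hypothetical sequential driver for the three stage scripts.
# The `run` callable is injectable so the chaining logic can be
# exercised without actually launching a 512-GPU job.
import subprocess

STAGE_SCRIPTS = [
    "shell/internvl2.5/stage1/internvl2_5_8b_internlm2_5_7b_dynamic_res_stage1.sh",
    "shell/internvl2.5/stage1.5/internvl2_5_8b_internlm2_5_7b_dynamic_res_stage1_5.sh",
    "shell/internvl2.5/stage2/internvl2_5_8b_internlm2_5_7b_dynamic_res_stage2.sh",
]

def run_stages(scripts, run=lambda s: subprocess.run(["bash", s]).returncode):
    """Run stages in order; return the first failing script, or None."""
    for script in scripts:
        if run(script) != 0:
            return script  # abort: later stages need this checkpoint
    return None
```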

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
