Implementation:OpenGVLab InternVL Correctness Build Data

From Leeroopedia


Knowledge Sources
Domains: Alignment, Data_Engineering
Last Updated: 2026-02-07 00:00 GMT

Overview

A concrete tool for constructing correctness-based preference data, implemented as shell-script pipelines in the InternVL MPO data construction framework.

Description

The correctness_build_data.sh script orchestrates the preference data construction pipeline. It runs mmpr_data_pipeline_correctness_postprocess.py once per tile configuration (--max-num values of 1, 6, 12, 18, and 24) to generate MMPR-format preference pairs from model-generated reasoning samples.

A companion visualprm_build_data.sh constructs VisualPRM data using Monte Carlo step-level reward estimation.

Usage

Run these scripts before MPO training to generate the preference dataset. They require pre-generated model sampling outputs.
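
Because both scripts read pre-generated sampling outputs, a quick pre-flight check can catch a missing or empty input directory before the loop starts. The following is a minimal sketch, not part of InternVL: the function name check_sampling_outputs is hypothetical, and it assumes only that the sampling outputs are stored as files somewhere under a single directory (the one the scripts refer to as $INPUT_DIR).

```shell
# Hypothetical helper (not part of InternVL): verify that a directory of
# sampling outputs exists and contains at least one regular file.
check_sampling_outputs() {
    _dir=$1
    if [ ! -d "$_dir" ]; then
        echo "missing input directory: $_dir" >&2
        return 1
    fi
    # List regular files under the directory and keep only the first hit.
    _first=$(find "$_dir" -type f | head -n 1)
    if [ -z "$_first" ]; then
        echo "input directory contains no files: $_dir" >&2
        return 2
    fi
    return 0
}
```

One way to use it is to gate the build step on the check, for example `check_sampling_outputs "$INPUT_DIR" && bash shell/internvl3.0/mpo_data_construction/correctness_build_data.sh`.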

Code Reference

Source Location

  • Repository: InternVL
  • File: internvl_chat/shell/internvl3.0/mpo_data_construction/correctness_build_data.sh
  • Lines: L1-34

Signature

# correctness_build_data.sh
# Iterates over tile configurations and runs postprocessing.
# INPUT_DIR and OUTPUT_DIR must be set before the loop runs.

for MAX_NUM in 1 6 12 18 24; do
    python tools/reasoning_data_pipeline/mmpr_data_pipeline_correctness_postprocess.py \
        --num-pairs-per-key 15 \
        --max-lines 1200000 \
        --answer-fix \
        --max-num "$MAX_NUM" \
        --input-dir "$INPUT_DIR" \
        --output-dir "$OUTPUT_DIR"
done

# visualprm_build_data.sh
# Same tile-configuration loop, using the VisualPRM postprocessing script.

for MAX_NUM in 1 6 12 18 24; do
    python tools/reasoning_data_pipeline/visualprm_data_pipeline_postprocess.py \
        --mc-threshold 0.0 \
        --max-num "$MAX_NUM" \
        --input-dir "$INPUT_DIR" \
        --output-dir "$OUTPUT_DIR"
done
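
To see which commands the correctness loop expands to, the invocations can be printed instead of executed. This is a sketch only: print_correctness_commands is a hypothetical name, the flags and tile values are copied from the listing above, and the two arguments stand in for $INPUT_DIR and $OUTPUT_DIR, whose actual values the listing does not show.

```shell
# Hypothetical dry-run helper: print one command line per tile configuration
# without running anything. Flags and values mirror the listing above.
print_correctness_commands() {
    _input_dir=$1
    _output_dir=$2
    for _max_num in 1 6 12 18 24; do
        printf '%s\n' "python tools/reasoning_data_pipeline/mmpr_data_pipeline_correctness_postprocess.py --num-pairs-per-key 15 --max-lines 1200000 --answer-fix --max-num $_max_num --input-dir $_input_dir --output-dir $_output_dir"
    done
}
```

Calling the function with two directory paths prints five lines, one per --max-num value, which makes it easy to review the planned runs before committing to them.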

Import

cd internvl_chat
bash shell/internvl3.0/mpo_data_construction/correctness_build_data.sh
bash shell/internvl3.0/visualprm_data_construction/visualprm_build_data.sh

I/O Contract

Inputs

Name | Type | Required | Description
Model sampling outputs | Directory | Yes | Pre-generated reasoning samples from the model
--num-pairs-per-key | int | No | Max preference pairs per question (default 15)
--max-lines | int | No | Per-tile-config sample limit (default 1200000)
--mc-threshold | float | No | Monte Carlo threshold for VisualPRM (default 0.0)

Outputs

Name | Type | Description
MMPR JSONL | Files | Preference data with question, chosen, rejected fields and associated images
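
A cheap sanity check on a generated file is to count its lines and confirm that each one mentions the three fields named above. The sketch below is a text-level check with grep, not a JSON parser. It assumes one record per line and that the fields appear as the quoted keys "question", "chosen", and "rejected"; check_mmpr_jsonl is a hypothetical helper name.

```shell
# Hypothetical helper: print the number of non-empty lines in a JSONL file and
# return non-zero if any of them lacks one of the expected quoted keys.
# Assumption: one record per line with keys "question", "chosen", "rejected".
check_mmpr_jsonl() {
    _file=$1
    [ -f "$_file" ] || { echo "no such file: $_file" >&2; return 2; }
    # grep -c exits 1 when the count is 0, so "|| true" keeps the count usable.
    _total=$(grep -c . "$_file" || true)
    _status=0
    for _key in question chosen rejected; do
        _with_key=$(grep -c "\"$_key\"" "$_file" || true)
        if [ "$_with_key" -ne "$_total" ]; then
            echo "$_file: $_with_key of $_total lines contain \"$_key\"" >&2
            _status=1
        fi
    done
    echo "$_total"
    return "$_status"
}
```

Because the check only matches text, a key name that occurs inside a field value would also count. A proper JSON parser is the safer choice for anything beyond a quick smoke test.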

Usage Examples

Build Correctness Preference Data

cd internvl_chat

# Step 1: Generate model samples (prerequisite)
bash shell/internvl3.0/mpo_data_construction/correctness_mmpr_8b.sh

# Step 2: Build preference pairs from samples
bash shell/internvl3.0/mpo_data_construction/correctness_build_data.sh

# Step 3: Build VisualPRM data
bash shell/internvl3.0/visualprm_data_construction/visualprm_build_data.sh
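
Since each step consumes the previous step's output, it can help to stop at the first failure rather than continue with incomplete data. The following sketch wraps the steps above in a small runner; run_in_order is a hypothetical helper, not something shipped with InternVL.

```shell
# Hypothetical runner: execute each script with bash, in the order given,
# and stop as soon as one of them exits with a non-zero status.
run_in_order() {
    for _script in "$@"; do
        echo "running: $_script"
        bash "$_script" || { echo "failed: $_script" >&2; return 1; }
    done
}
```

From inside internvl_chat, the three scripts listed above would be passed as arguments in the same order, for example `run_in_order shell/internvl3.0/mpo_data_construction/correctness_mmpr_8b.sh shell/internvl3.0/mpo_data_construction/correctness_build_data.sh shell/internvl3.0/visualprm_data_construction/visualprm_build_data.sh`.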

Related Pages

Implements Principle

Requires Environment
