Implementation:OpenGVLab InternVL MMBench VQA Inference

Knowledge Sources	OpenGVLab_InternVL
Domains	Inference, Benchmark, Multiple_Choice
Last Updated	2026-02-07 14:00 GMT

Overview

This script generates model predictions for the MMBench multiple-choice benchmark, handling its TSV/pandas data format with base64-encoded images and option rotation.

Description

The model_vqa_mmbench.py script implements the inference pipeline specifically tailored for the MMBench benchmark. It handles MMBench's unique data format:

TSV input: Questions are loaded via pd.read_table rather than JSONL, with columns for index, question, hint, image (base64-encoded), and option columns A-D
Base64 image decoding: Images are decoded from base64 strings using load_image_from_base64
Option construction: The get_options function extracts valid options (A-D) stopping at the first None/NaN value, and is_none handles various null representations
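
The option-parsing step above can be sketched as follows. This is a minimal illustration rather than the script's exact code: the set of null representations checked here (None, float NaN, and the strings "nan" and "none") is an assumption, and a plain dict stands in for a pandas row.

```python
import math

def is_none(value):
    """Return True for None, float NaN, or the strings 'nan'/'none' (any case).

    The exact set of null spellings is an assumption made for this sketch.
    """
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    if isinstance(value, str) and value.lower() in ("nan", "none"):
        return True
    return False

def get_options(row, options):
    """Collect option values in order, stopping at the first missing one."""
    parsed = []
    for key in options:
        value = row[key]
        if is_none(value):
            break
        parsed.append(value)
    return parsed

# A dict stands in for one TSV row; option C is missing, so D is never reached.
row = {"A": "cat", "B": "dog", "C": float("nan"), "D": "bird"}
print(get_options(row, ["A", "B", "C", "D"]))  # ['cat', 'dog']
```

Because parsing stops at the first missing value, a question with only two or three options yields a correspondingly shorter list.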

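The inference loop supports multi-round evaluation with option rotation: when --all-rounds is enabled, the script runs one round per available option and cyclically shifts the option order between rounds to assess position bias (these are rotations, not all possible permutations). Each round rotates both the option text and option letters. Without the flag, a single round is run.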
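
The rotation can be sketched as a left shift applied to both lists after each round. The helper name rotation_rounds is hypothetical, and the assumption that the number of rounds equals the number of options is made for this sketch.

```python
def rotation_rounds(options, option_chars, all_rounds=True):
    """Return (round_id, options, option_chars) for each evaluation round.

    Hypothetical helper: both lists are shifted left by one position
    after every round, so each option visits every slot exactly once.
    """
    num_rounds = len(options) if all_rounds else 1
    cur_opts, cur_chars = list(options), list(option_chars)
    rounds = []
    for round_id in range(num_rounds):
        rounds.append((round_id, list(cur_opts), list(cur_chars)))
        cur_opts = cur_opts[1:] + cur_opts[:1]
        cur_chars = cur_chars[1:] + cur_chars[:1]
    return rounds

for round_id, opts, chars in rotation_rounds(["x", "y", "z"], ["A", "B", "C"]):
    print(round_id, opts, chars)
```

With three options this produces three rounds instead of the six that a full permutation sweep would require.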

Additional features include:

Single prediction prompt mode (--single-pred-prompt) that appends "Answer with the option's letter from the given choices directly"
Chinese language support via the --lang cn flag
Hint integration by prepending hint text to the question when available
Auto-detection of plain models for mmtag conversation mode switching
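
A rough sketch of how the hint, the options, and the optional direct-answer instruction might be assembled into one prompt. The function name build_prompt, the newline separators, the "A. text" option layout, and the Chinese instruction wording are all assumptions for illustration; only the English instruction is taken from the text above.

```python
import math

OPTION_LETTERS = ["A", "B", "C", "D"]

# "en" is the instruction quoted on this page; the "cn" wording is assumed.
DIRECT_ANSWER = {
    "en": "Answer with the option's letter from the given choices directly",
    "cn": "请直接回答选项字母。",
}

def _missing(value):
    """Treat None and float NaN (an empty TSV cell) as a missing hint."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def build_prompt(question, hint, options, single_pred_prompt=False, lang="en"):
    """Prepend the hint, list the lettered options, then add the instruction."""
    text = question
    if not _missing(hint):
        text = hint + "\n" + text
    for letter, option in zip(OPTION_LETTERS, options):
        text += "\n" + letter + ". " + option
    if single_pred_prompt:
        text += "\n" + DIRECT_ANSWER[lang]
    return text

print(build_prompt("What is shown?", "Look closely.", ["cat", "dog"],
                   single_pred_prompt=True))
```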

Usage

Use this script to generate predictions for MMBench submission. Each line of the output JSONL contains question_id, round_id, prompt, text, options, option_char, answer_id, model_id, and metadata.

Code Reference

Source Location

Repository: OpenGVLab_InternVL
File: internvl_chat_llava/llava/eval/model_vqa_mmbench.py
Lines: 1-170

Signature

def split_list(lst: list, n: int) -> list: ...

def get_chunk(lst: list, n: int, k: int) -> list: ...

def is_none(value) -> bool: ...

def get_options(row, options: list) -> list: ...

def eval_model(args: argparse.Namespace) -> None: ...

Import

from llava.eval.model_vqa_mmbench import eval_model

I/O Contract

Inputs

Name	Type	Required	Description
--model-path	str	Yes	Path to the pretrained LLaVA model
--model-base	str	No	Base model path for LoRA or projector-only models
--image-folder	str	No	Root directory for image files (not used; images are base64 in TSV)
--question-file	str	No	Path to TSV question file (default: tables/question.jsonl, a JSONL-style name, so pass the TSV path explicitly)
--answers-file	str	No	Path for output JSONL answers file (default: answer.jsonl)
--conv-mode	str	No	Conversation template name (default: llava_v1)
--num-chunks	int	No	Number of chunks for multi-GPU splitting (default: 1)
--chunk-idx	int	No	Index of the chunk to process (default: 0)
--temperature	float	No	Sampling temperature (default: 0.2)
--top_p	float	No	Top-p sampling parameter (default: None)
--num_beams	int	No	Number of beams for beam search (default: 1)
--all-rounds	flag	No	Enable multi-round evaluation with option rotation
--single-pred-prompt	flag	No	Append direct answer instruction to prompt
--lang	str	No	Language for instruction prompt (default: "en"; also supports "cn")
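
To illustrate how --num-chunks and --chunk-idx could divide the questions across workers, here is a minimal sketch of split_list and get_chunk. It assumes ceiling-sized contiguous chunks and a non-empty plain list; the script applies the same idea to the loaded question table.

```python
import math

def split_list(lst, n):
    """Split lst into at most n contiguous chunks of ceil(len(lst) / n) items."""
    chunk_size = math.ceil(len(lst) / n)
    return [lst[i:i + chunk_size] for i in range(0, len(lst), chunk_size)]

def get_chunk(lst, n, k):
    """Return the k-th chunk (0-based) out of n."""
    return split_list(lst, n)[k]

questions = list(range(10))
print(split_list(questions, 3))    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(get_chunk(questions, 3, 2))  # [8, 9]
```

Each worker would be launched with the same --num-chunks value and its own --chunk-idx, and the resulting answer files concatenated afterwards.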

Outputs

Name	Type	Description
answers file	JSONL	Each line contains question_id, round_id, prompt, text, options, option_char, answer_id, model_id, and metadata
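
As an illustration of the output format, the sketch below writes one answer record per line. The helper write_answer is hypothetical, the field values are made up, and uuid from the standard library is a stand-in for whatever ID generator the script uses for answer_id.

```python
import io
import json
import uuid

def write_answer(fh, question_id, round_id, prompt, text,
                 options, option_char, model_id):
    """Append one JSON object, followed by a newline, to an open file handle."""
    record = {
        "question_id": question_id,
        "round_id": round_id,
        "prompt": prompt,
        "text": text,                   # the model's raw answer
        "options": options,
        "option_char": option_char,
        "answer_id": uuid.uuid4().hex,  # stand-in ID generator (assumption)
        "model_id": model_id,
        "metadata": {},
    }
    fh.write(json.dumps(record) + "\n")
    return record

# An in-memory buffer stands in for the --answers-file path.
buf = io.StringIO()
write_answer(buf, 7, 0, "Q\nA. cat\nB. dog", "A",
             ["cat", "dog"], ["A", "B"], "demo-model")
parsed = json.loads(buf.getvalue())
print(sorted(parsed))
```

With --all-rounds enabled, each question contributes one such line per round, distinguished by round_id.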

Usage Examples

Basic Usage

# Command-line execution for MMBench inference
python internvl_chat_llava/llava/eval/model_vqa_mmbench.py \
    --model-path /path/to/llava-model \
    --question-file mmbench_test.tsv \
    --answers-file mmbench_answers.jsonl \
    --single-pred-prompt \
    --temperature 0
