Implementation:OpenGVLab InternVL ScienceQA Inference

Knowledge Sources	OpenGVLab_InternVL
Domains	Inference, Benchmark, Science_QA
Last Updated	2026-02-07 14:00 GMT

Overview

This script generates model predictions for the ScienceQA benchmark. It handles the benchmark's JSON question format, questions with or without images, and an optional two-pass answer-prompting strategy.

Description

The model_vqa_science.py script implements the inference pipeline tailored for ScienceQA. Key differences from the generic VQA script include:

JSON input format: Questions are loaded from a single JSON file (not JSONL) with a conversations-style format where the question is extracted from line['conversations'][0]['value']
Optional images: Not all ScienceQA questions have images; the script checks for the 'image' key and passes images=None for text-only questions
Single prediction prompt: When --single-pred-prompt is enabled, appends "Answer with the option's letter from the given choices directly" to focus the model on option selection
Answer prompter mode: When --answer-prompter is enabled, inference runs in two passes. The first pass generates the reasoning; the second pass re-prompts with that reasoning appended, followed by "###\nANSWER:", to extract just the answer letter. The final output joins the two as "reasoning \n The answer is X"
KeywordsStoppingCriteria: Used conditionally for v0 conversation templates
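
The prompt handling described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: extract_question and two_pass_answer are hypothetical helper names, and generate is a stand-in for whatever call runs the model on a prompt string. Only the field names and prompt strings come from the description above.

```python
from typing import Callable, Optional, Tuple

# Instruction appended when --single-pred-prompt is enabled (quoted from the description).
DIRECT_ANSWER_HINT = "Answer with the option's letter from the given choices directly."


def extract_question(line: dict, single_pred_prompt: bool = False) -> Tuple[str, Optional[str]]:
    """Return (question_text, image_name_or_None) for one ScienceQA entry.

    Hypothetical helper: the real script inlines this logic inside eval_model.
    """
    question = line["conversations"][0]["value"].strip()
    image = line.get("image")  # None means a text-only question
    if single_pred_prompt:
        question = question + "\n" + DIRECT_ANSWER_HINT
    return question, image


def two_pass_answer(prompt: str, generate: Callable[[str], str]) -> str:
    """Two-pass strategy: first generate reasoning, then extract the answer letter.

    `generate` is an assumed stand-in for the model call; it maps a prompt to text.
    """
    reasoning = generate(prompt)
    answer = generate(prompt + reasoning + " ###\nANSWER:")
    return reasoning + "\n The answer is " + answer
```

With a text-only entry, extract_question returns None for the image, which corresponds to the images=None case mentioned above.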

The script uses the standard split_list / get_chunk pattern for multi-GPU evaluation and writes JSONL output in which each record holds question_id, prompt, text, answer_id, and model_id, plus a metadata field.
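
For readers unfamiliar with the chunking pattern, the sketch below shows how split_list and get_chunk are commonly written in LLaVA-style evaluation scripts. It is an illustration of the pattern under that assumption; the file itself may differ in details.

```python
import math


def split_list(lst: list, n: int) -> list:
    """Split lst into at most n roughly equal, contiguous chunks."""
    chunk_size = math.ceil(len(lst) / n)  # size of every chunk except possibly the last
    return [lst[i:i + chunk_size] for i in range(0, len(lst), chunk_size)]


def get_chunk(lst: list, n: int, k: int) -> list:
    """Return the k-th chunk (0-indexed) out of n."""
    return split_list(lst, n)[k]
```

Each worker passes the same --num-chunks value and its own --chunk-idx, so every question is processed by exactly one worker. Note that this sketch does not guard against an empty question list.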

Usage

Use this script to generate predictions for ScienceQA evaluation. The output can then be processed by eval_science_qa.py to compute accuracy metrics.

Code Reference

Source Location

Repository: OpenGVLab_InternVL
File: internvl_chat_llava/llava/eval/model_vqa_science.py
Lines: 1-147

Signature

def split_list(lst: list, n: int) -> list: ...

def get_chunk(lst: list, n: int, k: int) -> list: ...

def eval_model(args: argparse.Namespace) -> None: ...

Import

from llava.eval.model_vqa_science import eval_model

I/O Contract

Inputs

Name	Type	Required	Description
--model-path	str	Yes	Path to the pretrained LLaVA model
--model-base	str	No	Base model path for LoRA or projector-only models
--image-folder	str	No	Root directory for image files
--question-file	str	No	Path to JSON question file (default: tables/question.json)
--answers-file	str	No	Path for output JSONL answers file (default: answer.jsonl)
--conv-mode	str	No	Conversation template name (default: llava_v0)
--num-chunks	int	No	Number of chunks for multi-GPU splitting (default: 1)
--chunk-idx	int	No	Index of the chunk to process (default: 0)
--temperature	float	No	Sampling temperature (default: 0.2)
--answer-prompter	flag	No	Enable two-pass inference with reasoning then answer extraction
--single-pred-prompt	flag	No	Append direct answer instruction to prompt

Outputs

Name	Type	Description
answers file	JSONL	Each line contains question_id, prompt, text, answer_id, model_id, and metadata
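
As an illustration of the record layout, the sketch below serializes one prediction into a JSONL line with the fields listed above. The helper name format_answer_record is hypothetical, and the use of uuid4 for answer_id and an empty dict for metadata are assumptions; the script may generate these values differently.

```python
import json
import uuid


def format_answer_record(question_id: str, prompt: str, text: str, model_id: str) -> str:
    """Serialize one prediction as a single JSONL line (without the trailing newline)."""
    record = {
        "question_id": question_id,
        "prompt": prompt,
        "text": text,
        "answer_id": uuid.uuid4().hex,  # assumption: any unique ID scheme works here
        "model_id": model_id,
        "metadata": {},  # assumption: left empty in this sketch
    }
    return json.dumps(record)
```

Writing one such line per question, each followed by a newline, yields a file that can be read back with json.loads line by line.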

Usage Examples

Basic Usage

# Command-line execution for ScienceQA inference
python internvl_chat_llava/llava/eval/model_vqa_science.py \
    --model-path /path/to/llava-model \
    --image-folder /path/to/ScienceQA/images \
    --question-file ScienceQA/test.json \
    --answers-file sqa_predictions.jsonl \
    --single-pred-prompt \
    --conv-mode llava_v0
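
Multi-Chunk Usage (sketch)

For multi-GPU runs, one process per chunk can be launched with matching --num-chunks and distinct --chunk-idx values. The helper below only builds the argument lists; build_chunk_commands and the per-chunk output naming (prefix_0.jsonl, prefix_1.jsonl, ...) are hypothetical conventions chosen for illustration, not something the script prescribes.

```python
from typing import List

SCRIPT = "internvl_chat_llava/llava/eval/model_vqa_science.py"


def build_chunk_commands(num_chunks: int, model_path: str, question_file: str,
                         image_folder: str, answers_prefix: str) -> List[List[str]]:
    """Build one command (as an argv list) per chunk; nothing is executed here."""
    commands = []
    for idx in range(num_chunks):
        commands.append([
            "python", SCRIPT,
            "--model-path", model_path,
            "--question-file", question_file,
            "--image-folder", image_folder,
            # Hypothetical naming: one answers file per chunk, merged afterwards.
            "--answers-file", f"{answers_prefix}_{idx}.jsonl",
            "--num-chunks", str(num_chunks),
            "--chunk-idx", str(idx),
        ])
    return commands
```

Each argv list could then be passed to subprocess.Popen, typically with a different GPU made visible to each process, and the per-chunk files concatenated before evaluation.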
