
Implementation:OpenGVLab InternVL TextVQAAccuracyEvaluator

From Leeroopedia


Knowledge Sources
Domains Evaluation, Metrics
Last Updated 2026-02-07 00:00 GMT

Overview

A concrete tool from the InternVL evaluation framework for computing VQA soft accuracy scores.

Description

The TextVQAAccuracyEvaluator class computes VQA soft accuracy by:

  1. Normalizing predictions and ground-truth answers using EvalAIAnswerProcessor
  2. Computing a soft accuracy score per question: min(1, matches / 3), averaged over leave-one-out subsets of the 10 ground-truth answers
  3. Averaging the per-question scores across all questions

The companion STVQAANLSEvaluator uses Levenshtein edit distance for ANLS scoring.
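For context, ANLS scoring can be sketched as below. This is an illustrative implementation of the conventional Average Normalized Levenshtein Similarity formula, not the actual STVQAANLSEvaluator code; the 0.5 cutoff is the commonly used default threshold, and lowercasing stands in for the evaluator's real answer normalization.

```python
def levenshtein(a, b):
    """Classic single-row dynamic-programming edit distance."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,          # deletion
                        dp[j - 1] + 1,       # insertion
                        prev + (a[i - 1] != b[j - 1]))  # substitution
            prev = cur
    return dp[n]

def anls_score(pred, gt_answers, tau=0.5):
    """Best normalized similarity against any GT answer, zeroed below tau."""
    best = 0.0
    for gt in gt_answers:
        dist = levenshtein(pred.lower(), gt.lower())
        nl = dist / max(len(pred), len(gt), 1)
        best = max(best, 1.0 - nl)
    return best if best >= tau else 0.0
```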

Usage

Used by evaluate_vqa.py to compute benchmark scores after distributed inference is complete.
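A post-inference aggregation step might look like the sketch below. The file format, record keys, and function name are assumptions for illustration only, not the actual evaluate_vqa.py code; the evaluator is passed in so the sketch is framework-agnostic.

```python
import json

def score_results(results_file, evaluator):
    """Load merged predictions and compute the benchmark score.

    Assumes a JSON list of records with 'answer' (model prediction) and
    'annotation' (list of ground-truth answers) -- illustrative keys only.
    """
    with open(results_file) as f:
        results = json.load(f)
    # Reshape into the dict format eval_pred_list expects
    pred_list = [
        {'pred_answer': r['answer'], 'gt_answers': r['annotation']}
        for r in results
    ]
    return evaluator.eval_pred_list(pred_list)
```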

Code Reference

Source Location

  • Repository: InternVL
  • File: internvl_chat/eval/vqa/textvqa_eval.py
  • Lines: L222-258

Signature

class TextVQAAccuracyEvaluator:
    def __init__(self):
        self.answer_processor = EvalAIAnswerProcessor()

    def _compute_answer_scores(self, raw_answers):
        """Compute soft accuracy scores from 10 ground truth answers."""

    def eval_pred_list(self, pred_list, disable_tqdm=False):
        """
        Evaluate a list of predictions against ground truth.

        Args:
            pred_list: List[Dict] with keys 'pred_answer' and 'gt_answers'
            disable_tqdm: bool - Disable progress bar

        Returns:
            float - Average VQA accuracy across all predictions
        """

Import

from eval.vqa.textvqa_eval import TextVQAAccuracyEvaluator

I/O Contract

Inputs

Name Type Required Description
pred_list List[Dict] Yes List of dicts with 'pred_answer' (str) and 'gt_answers' (List[str], 10 answers)

Outputs

Name Type Description
accuracy float Average VQA soft accuracy (0.0 to 1.0)

Usage Examples

Compute VQA Accuracy

from eval.vqa.textvqa_eval import TextVQAAccuracyEvaluator

evaluator = TextVQAAccuracyEvaluator()

pred_list = [
    {
        'pred_answer': 'golden gate bridge',
        'gt_answers': ['golden gate bridge'] * 8 + ['golden gate', 'bridge']
    },
    {
        'pred_answer': 'cat',
        'gt_answers': ['cat'] * 5 + ['kitten'] * 3 + ['feline', 'kitty']
    },
]

accuracy = evaluator.eval_pred_list(pred_list)
print(f'VQA Accuracy: {accuracy:.4f}')

Related Pages

Implements Principle

Requires Environment
