Implementation:Open compass VLMEvalKit VQA Eval

Field	Value
source	VLMEvalKit
domain	Vision, Evaluation, Visual Question Answering

Overview

Implements the standard VQA evaluation metric from the VQA challenge, providing answer normalization and accuracy computation.

Description

This module implements the VQA evaluation protocol from the original GT-Vision-Lab evaluation code, adapted for VLMEvalKit. The _process_digit_article function normalizes answer text: it removes articles, converts the number words zero through ten to digits via a manualMap lookup table, and normalizes English contractions via a detailed contractions dictionary. Punctuation handling is part of the same normalization pipeline. Accuracy follows the standard VQA metric: a prediction earns full credit when at least 3 of the 10 annotators gave the same answer, and partial credit in proportion to annotator agreement otherwise, i.e. min(matches / 3, 1).
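The normalization steps described above can be illustrated with a minimal sketch. This is not the module's code: normalize_answer and the three abbreviated tables are hypothetical stand-ins, and the contraction entries assume, as in the original VQA evaluation script, that the table restores missing apostrophes (for example "dont" to "don't").

```python
# Hypothetical, abbreviated lookup tables; the real module ships larger ones.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}
ARTICLES = {"a", "an", "the"}
# Assumption: contractions map apostrophe-less spellings to the standard form.
CONTRACTIONS = {"dont": "don't", "isnt": "isn't", "whats": "what's"}


def normalize_answer(text: str) -> str:
    """Lowercase, convert number words to digits, drop articles, fix contractions."""
    kept = []
    for word in text.lower().split():
        word = NUMBER_WORDS.get(word, word)  # number word -> digit string
        if word in ARTICLES:                 # skip "a", "an", "the"
            continue
        kept.append(CONTRACTIONS.get(word, word))
    return " ".join(kept)


if __name__ == "__main__":
    print(normalize_answer("There are three cats"))
    print(normalize_answer("I dont see the dog"))
```

Normalizing both the prediction and every annotator answer the same way keeps superficial differences such as "three" versus "3" from counting as a mismatch.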

Usage

Called internally by VQA-type dataset classes during answer evaluation.

Code Reference

Source: vlmeval/dataset/utils/vqa_eval.py, Lines: L1-363
Import: from vlmeval.dataset.utils.vqa_eval import _process_digit_article

Key Functions:

def _process_digit_article(inText): ...
def _process_punctuation(inText): ...
def compute_vqa_accuracy(prediction, ground_truth_list): ...

I/O Contract

Direction	Description
Inputs	Predicted answer string; list of ground-truth answer strings from multiple annotators
Outputs	Normalized answer strings; VQA accuracy score between 0.0 and 1.0 based on annotator agreement
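As a rough illustration of this contract, the sketch below scores one prediction against a list of annotator answers using the commonly cited form of the VQA metric, min(matches / 3, 1). The name vqa_accuracy is hypothetical, the inputs are assumed to be normalized already, and the module's own implementation may differ in detail (the original VQA script, for instance, averages over subsets of annotators).

```python
from typing import Sequence


def vqa_accuracy(prediction: str, ground_truths: Sequence[str]) -> float:
    """Score a prediction by annotator agreement, capped at 1.0.

    Assumes the prediction and all ground-truth answers were normalized
    beforehand (same casing, digits, articles, punctuation).
    """
    matches = sum(1 for answer in ground_truths if answer == prediction)
    # Three or more agreeing annotators give full credit; fewer give
    # proportional partial credit.
    return min(1.0, matches / 3.0)


if __name__ == "__main__":
    annotators = ["2", "2", "2", "3", "3", "2", "4", "2", "2", "2"]
    print(vqa_accuracy("2", annotators))  # seven matches, capped at full credit
    print(vqa_accuracy("4", annotators))  # one match, partial credit
    print(vqa_accuracy("5", annotators))  # no match, zero credit
```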

Usage Examples

# Internal usage example
from vlmeval.dataset.utils.vqa_eval import _process_digit_article
normalized = _process_digit_article("there are three cats")  # Normalizes "three" to "3"

Related Pages

Principle:Open_compass_VLMEvalKit_Benchmark_Dataset_Construction
