Implementation:Open compass VLMEvalKit VQA Eval
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Visual Question Answering |
Overview
Implements the standard VQA evaluation metric from the VQA challenge, providing answer normalization and accuracy computation.
Description
This module implements the VQA evaluation protocol, originally from GT-Vision-Lab and adapted for VLMEvalKit. The _process_digit_article function performs text normalization including article removal, number-word-to-digit conversion (zero through ten), contraction normalization, and punctuation handling. It relies on a contractions dictionary for English contractions and a manualMap for number-word conversion. Accuracy follows the standard VQA metric: an answer receives full credit if at least 3 of the 10 annotators gave the same answer, and partial credit in proportion to the number of agreeing annotators otherwise.
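The agreement-based scoring described above is commonly summarized as min(matches / 3, 1). The following is a minimal sketch of that simplified form, not the module's actual code: the function name vqa_accuracy_sketch is hypothetical, answers are assumed to be normalized already, and reference implementations may additionally average the score over subsets of annotators.

```python
def vqa_accuracy_sketch(prediction, ground_truths):
    """Simplified VQA accuracy (hypothetical helper, not part of VLMEvalKit).

    Counts how many annotator answers exactly match the prediction and
    caps the score at 1.0 once three or more annotators agree.
    """
    matches = sum(1 for gt in ground_truths if gt == prediction)
    return min(matches / 3.0, 1.0)


# Ten annotator answers: two said "2", eight said "3".
answers = ["2"] * 2 + ["3"] * 8

partial = vqa_accuracy_sketch("2", answers)  # two matches: partial credit
full = vqa_accuracy_sketch("3", answers)     # eight matches: capped at full credit
none = vqa_accuracy_sketch("4", answers)     # no matches: no credit
```

Under this simplification, a prediction matched by two annotators scores two thirds, and any prediction matched by three or more scores 1.0.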
Usage
Called internally by VQA-type dataset classes during answer evaluation.
Code Reference
- Source: vlmeval/dataset/utils/vqa_eval.py, Lines: L1-363
- Import: from vlmeval.dataset.utils.vqa_eval import _process_digit_article
Key Functions:
def _process_digit_article(inText): ...
def _process_punctuation(inText): ...
def compute_vqa_accuracy(prediction, ground_truth_list): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Predicted answer string; list of ground-truth answer strings from multiple annotators |
| Outputs | Normalized answer strings; VQA accuracy score between 0.0 and 1.0 based on annotator agreement |
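To illustrate the string-in, normalized-string-out side of this contract, here is a standalone sketch of the kind of normalization described for _process_digit_article. It is an assumption-laden approximation rather than the module's code: the name normalize_answer_sketch is hypothetical, the contractions table is a tiny illustrative subset, and mapping apostrophe-less forms to their apostrophized spelling is assumed from the original VQA evaluation tooling.

```python
# Illustrative tables; the real module's dictionaries are larger.
ARTICLES = {"a", "an", "the"}
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}
# Assumed direction: restore the missing apostrophe.
CONTRACTIONS = {"dont": "don't", "cant": "can't", "isnt": "isn't"}


def normalize_answer_sketch(text):
    """Lowercase, map number words to digits, drop articles, fix contractions."""
    words = []
    for word in text.lower().split():
        word = NUMBER_WORDS.get(word, word)  # "three" -> "3"
        if word not in ARTICLES:             # drop "a", "an", "the"
            words.append(word)
    words = [CONTRACTIONS.get(w, w) for w in words]
    return " ".join(words)


normalized = normalize_answer_sketch("There are three cats on the sofa")
```

Applying the same normalization to both the prediction and every ground-truth answer before comparison keeps superficial differences such as casing or "three" versus "3" from lowering the score.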
Usage Examples
# Internal usage example
from vlmeval.dataset.utils.vqa_eval import _process_digit_article
normalized = _process_digit_article("there are three cats") # Normalizes "three" to "3"