Implementation:Open compass VLMEvalKit VQA Eval

From Leeroopedia
Revision as of 13:33, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Open_compass_VLMEvalKit_VQA_Eval.md)
  • Source: VLMEvalKit
  • Domain: Vision, Evaluation, Visual Question Answering

Overview

Implements the standard VQA evaluation metric from the VQA challenge, providing answer normalization and accuracy computation.

Description

This module implements the VQA evaluation protocol, originally from GT-Vision-Lab and adapted for VLMEvalKit. The _process_digit_article function normalizes answer text: it removes articles, converts number words (zero through ten) to digits using the manualMap dictionary, and normalizes English contractions using a detailed contractions dictionary. Punctuation handling is provided by the companion _process_punctuation function.

Accuracy follows the standard VQA metric, which compares a prediction against the answers of 10 human annotators. A prediction earns full credit if at least 3 annotators gave the same answer; otherwise it earns partial credit in proportion to annotator agreement, i.e. min(n / 3, 1) for n matching annotators.
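The normalization steps described above can be illustrated with a minimal, self-contained sketch. It does not reproduce the VLMEvalKit source: the function name normalize_answer_sketch and the three-entry CONTRACTIONS table are hypothetical, and the table assumes the convention of the original VQA evaluation script, where apostrophe-less spellings are mapped to their canonical contracted form.

```python
# Illustrative sketch only; names and the tiny contraction table are assumptions,
# not the actual contents of vlmeval/dataset/utils/vqa_eval.py.
ARTICLES = {"a", "an", "the"}

NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
    "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10",
}

# Hypothetical subset: maps apostrophe-less spellings to a canonical form.
CONTRACTIONS = {"dont": "don't", "isnt": "isn't", "cant": "can't"}


def normalize_answer_sketch(text):
    """Lowercase, map number words to digits, drop articles, fix contractions."""
    kept = []
    for word in text.lower().split():
        word = NUMBER_WORDS.get(word, word)  # "three" -> "3"
        if word not in ARTICLES:             # drop "a", "an", "the"
            kept.append(word)
    kept = [CONTRACTIONS.get(word, word) for word in kept]
    return " ".join(kept)


print(normalize_answer_sketch("There are three cats"))
print(normalize_answer_sketch("The dog isnt a cat"))
```

Normalizing both the prediction and every annotator answer in the same way means that superficial differences such as "three" versus "3" do not count as disagreements when answers are compared.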

Usage

Called internally by VQA-type dataset classes during answer evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/vqa_eval.py, Lines: L1-363
  • Import: from vlmeval.dataset.utils.vqa_eval import _process_digit_article

Key Functions:

def _process_digit_article(inText): ...
def _process_punctuation(inText): ...
def compute_vqa_accuracy(prediction, ground_truth_list): ...

I/O Contract

  • Inputs: Predicted answer string; list of ground-truth answer strings from multiple annotators
  • Outputs: Normalized answer strings; VQA accuracy score between 0.0 and 1.0 based on annotator agreement
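As a rough illustration of this contract, the sketch below scores one already-normalized prediction against a list of annotator answers. The function names are hypothetical and are not taken from VLMEvalKit. The first function is the commonly quoted form of the metric, min(matches / 3, 1); the second averages that score over every leave-one-annotator-out subset, which is how the original VQA evaluation script computes it. Whether this module uses the averaged variant is an assumption here.

```python
# Illustrative sketch only; function names are hypothetical.
def vqa_accuracy_sketch(prediction, ground_truths):
    """Simple form: full credit once 3 or more annotators agree."""
    matches = sum(1 for answer in ground_truths if answer == prediction)
    return min(matches / 3.0, 1.0)


def vqa_accuracy_leave_one_out_sketch(prediction, ground_truths):
    """Average the simple score over all subsets that omit one annotator."""
    if not ground_truths:
        return 0.0
    scores = []
    for i in range(len(ground_truths)):
        others = ground_truths[:i] + ground_truths[i + 1:]
        scores.append(vqa_accuracy_sketch(prediction, others))
    return sum(scores) / len(scores)


# Ten annotators: two answered "cat", eight answered "dog".
annotator_answers = ["cat"] * 2 + ["dog"] * 8
print(vqa_accuracy_sketch("dog", annotator_answers))
print(vqa_accuracy_sketch("cat", annotator_answers))
print(vqa_accuracy_leave_one_out_sketch("cat", annotator_answers))
```

The two variants agree when a prediction matches many annotators or none, and differ slightly for answers backed by only one or two annotators.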

Usage Examples

# Internal usage example
from vlmeval.dataset.utils.vqa_eval import _process_digit_article
normalized = _process_digit_article("there are three cats")  # Normalizes "three" to "3"
