Implementation:Open compass VLMEvalKit TableVQABench Utils
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Table Understanding, Visual Question Answering |
Overview
Provides evaluation utilities and vision prompts for the TableVQABench benchmark, covering table-based visual question answering tasks including WikiTableQuestions, TabFact, and FinTabNetQA.
Description
This module implements task-specific prompt templates (VWTQ_PROMPT, VTABFACT_PROMPT, FINTABNETQA_PROMPT) with few-shot examples for the different table VQA tasks. The evaluate_tabfact function computes accuracy for true/false fact checking on table images. The module also includes answer normalization and comparison utilities adapted from AllenNLP-SemParse and NAVER AI TableVQABench; these handle Unicode normalization, number and infinity detection, and special-value processing so that table answers are compared robustly.
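To make the normalization step concrete, the following is a minimal sketch of Unicode folding plus number and infinity detection. It is not the module's code: the helper names normalize_answer, to_number, and answers_match are hypothetical, and rejecting inf/nan and using a small absolute tolerance are assumptions made for illustration.

```python
import math
import re
import unicodedata


def normalize_answer(text: str) -> str:
    """Hypothetical normalizer: fold Unicode, strip accents, collapse spaces, lowercase."""
    # NFKD splits accented characters into a base letter plus combining marks
    # and maps compatibility forms (e.g. full-width digits) to plain ones.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so that "café" and "cafe" compare equal.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", stripped).strip().lower()


def to_number(text: str):
    """Return a finite float if text parses as one, else None (inf and nan are rejected)."""
    try:
        value = float(text.replace(",", ""))
    except ValueError:
        return None
    return value if math.isfinite(value) else None


def answers_match(prediction: str, target: str) -> bool:
    """Compare numerically when both sides are numbers, otherwise as normalized strings."""
    pred, gold = normalize_answer(prediction), normalize_answer(target)
    pred_num, gold_num = to_number(pred), to_number(gold)
    if pred_num is not None and gold_num is not None:
        # Assumed tolerance; the real utilities may compare differently.
        return math.isclose(pred_num, gold_num, abs_tol=1e-6)
    return pred == gold
```

With this sketch, "1,000" and "1000.0" are treated as the same answer, while strings such as "inf" are never accepted as numbers and fall back to plain string comparison.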
Usage
Called internally by the TableVQABench dataset class during table-based VQA evaluation.
Code Reference
- Source: vlmeval/dataset/utils/tablevqabench.py, Lines: L1-500
- Import: from vlmeval.dataset.utils.tablevqabench import evaluate_tabfact, VWTQ_PROMPT
Key Functions:
VWTQ_PROMPT = '...' # WikiTableQuestions few-shot prompt
VTABFACT_PROMPT = '...' # TabFact true/false prompt
FINTABNETQA_PROMPT = '...' # FinTabNetQA few-shot prompt
def evaluate_tabfact(data, score_keys): ...
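As an illustration of how a few-shot prompt constant with a question slot can be filled, here is a small sketch. DEMO_PROMPT and build_prompt are hypothetical stand-ins; the wording of the real VWTQ_PROMPT is not reproduced, and only the {question} placeholder is taken from the usage shown on this page.

```python
# Hypothetical template; the real VWTQ_PROMPT text is not reproduced here.
DEMO_PROMPT = (
    "Answer the question about the table image with a short phrase.\n"
    "Example:\n"
    "Question: how many rows list a gold medal?\n"
    "Answer: 3\n"
    "YOUR TURN:\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(question: str) -> str:
    """Fill the single {question} slot after trimming surrounding whitespace."""
    return DEMO_PROMPT.format(question=question.strip())


print(build_prompt(" What year did sales peak? "))
```

Ending the template with "Answer:" leaves the model to complete only the answer, matching the pattern of the worked example above it.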
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Table image-based question data with predictions and ground-truth answers; score key column names |
| Outputs | Accuracy metrics (correct count, total, percentage); formatted prompts for model input |
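The contract above can be illustrated with a small true/false accuracy sketch. This is not the body of evaluate_tabfact: the function name tabfact_accuracy, the record fields "prediction" and "label", and the choice to count ambiguous answers as wrong are all assumptions made for the example.

```python
def tabfact_accuracy(records):
    """Hypothetical scorer: count predictions whose true/false verdict matches the label."""
    correct = 0
    for record in records:
        prediction = (record.get("prediction") or "").lower()
        says_true = "true" in prediction
        says_false = "false" in prediction
        # An answer containing both verdicts (or neither) is ambiguous;
        # this sketch counts it as wrong.
        if says_true == says_false:
            continue
        if says_true == bool(record["label"]):
            correct += 1
    total = len(records)
    return {
        "correct": correct,
        "total": total,
        "accuracy": 100.0 * correct / total if total else 0.0,
    }


sample = [
    {"prediction": "True", "label": 1},
    {"prediction": "The statement is false.", "label": 1},
    {"prediction": "false", "label": 0},
    {"prediction": "true or false", "label": 0},
]
print(tabfact_accuracy(sample))  # {'correct': 2, 'total': 4, 'accuracy': 50.0}
```

The returned dictionary mirrors the three output quantities listed in the table: correct count, total, and percentage.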
Usage Examples
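# Internal usage example
from vlmeval.dataset.utils.tablevqabench import evaluate_tabfact, VWTQ_PROMPT
# Fill the question slot of the few-shot prompt
prompt = VWTQ_PROMPT.format(question="What year did sales peak?")
# scored_data (not defined here) holds the predictions and ground-truth answers
results = evaluate_tabfact(scored_data, ['score'])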