Implementation:Open compass VLMEvalKit TableVQABench Utils

From Leeroopedia
Field | Value
source | VLMEvalKit
domain | Vision, Evaluation, Table Understanding, Visual Question Answering

Overview

Provides evaluation utilities and vision prompts for the TableVQABench benchmark, covering table-based visual question answering tasks including WikiTableQuestions, TabFact, and FinTabNetQA.

Description

This module implements task-specific prompt templates (VWTQ_PROMPT, VTABFACT_PROMPT, FINTABNETQA_PROMPT) with few-shot examples for different table VQA tasks. The evaluate_tabfact function computes accuracy for true/false fact checking on table images. The module also includes answer normalization and comparison utilities adapted from AllenNLP-SemParse and NAVER AI TabVQABench, handling Unicode normalization, number/infinity detection, and special value processing for robust table answer evaluation.
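The normalization and comparison step described above can be illustrated with a minimal sketch. It is not the module's code: the helper names normalize_answer, to_number, and answers_match are hypothetical, and the choices to strip accents, drop thousands separators, and treat NaN and infinity as non-numeric are assumptions made for illustration.

```python
import math
import re
import unicodedata


def normalize_answer(text):
    """Hypothetical helper: strip accents, collapse whitespace, lowercase."""
    # NFKD splits an accented character into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"\s+", " ", stripped).strip().lower()


def to_number(text):
    """Hypothetical helper: return a finite float, or None otherwise."""
    try:
        # Assumption: commas are thousands separators and can be dropped.
        value = float(text.replace(",", ""))
    except ValueError:
        return None
    # Assumption: NaN and infinity are compared as strings, not as numbers.
    if math.isnan(value) or math.isinf(value):
        return None
    return value


def answers_match(prediction, target):
    """Compare numerically when both sides parse as numbers, else as text."""
    pred_num, target_num = to_number(prediction), to_number(target)
    if pred_num is not None and target_num is not None:
        return math.isclose(pred_num, target_num, abs_tol=1e-6)
    return normalize_answer(prediction) == normalize_answer(target)
```

With this sketch, answers_match("1,234", "1234.0") and answers_match("Café", "cafe") both count as matches, while answers_match("12", "13") does not.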

Usage

Called internally by the TableVQABench dataset class during table-based VQA evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/tablevqabench.py, Lines: L1-500
  • Import: from vlmeval.dataset.utils.tablevqabench import evaluate_tabfact, VWTQ_PROMPT

Key Functions:

VWTQ_PROMPT = '...'        # WikiTableQuestions few-shot prompt
VTABFACT_PROMPT = '...'    # TabFact true/false prompt
FINTABNETQA_PROMPT = '...' # FinTabNetQA few-shot prompt

def evaluate_tabfact(data, score_keys): ...

I/O Contract

Direction | Description
Inputs | Table image-based question data with predictions and ground-truth answers; score key column names
Outputs | Accuracy metrics (correct count, total, percentage); formatted prompts for model input
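To make the contract above concrete, here is a rough sketch of how a true/false accuracy summary could be computed. It is not the evaluate_tabfact implementation: the function name tabfact_accuracy, the record keys "prediction" and "answer", the "true"/"false" gold labels, and the rule that a response naming both labels (or neither) scores as incorrect are all assumptions.

```python
def tabfact_accuracy(records):
    """Hypothetical sketch of true/false accuracy; not the VLMEvalKit code."""
    correct = 0
    for record in records:
        # Assumed record layout: {"prediction": <model text>, "answer": "true" | "false"}
        prediction = str(record["prediction"]).strip().lower()
        answer = str(record["answer"]).strip().lower()
        says_true = "true" in prediction
        says_false = "false" in prediction
        # Assumption: a response naming both labels, or neither, is scored as wrong.
        if says_true == says_false:
            continue
        predicted = "true" if says_true else "false"
        correct += predicted == answer
    total = len(records)
    percentage = 100.0 * correct / total if total else 0.0
    return {"correct": correct, "total": total, "accuracy": percentage}
```

The returned dictionary mirrors the three outputs listed in the contract (correct count, total, percentage); the real function may use a different return shape.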

Usage Examples

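# Internal usage example (requires VLMEvalKit to be installed)
from vlmeval.dataset.utils.tablevqabench import evaluate_tabfact, VWTQ_PROMPT
# Fill the few-shot template with the question to ask about the table image.
prompt = VWTQ_PROMPT.format(question="What year did sales peak?")
# scored_data is a placeholder for the prediction records prepared by the
# TableVQABench dataset class; it is not defined in this snippet.
results = evaluate_tabfact(scored_data, ['score'])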
