Implementation:Open compass VLMEvalKit Omni Verifier

| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, VQA, LLM Judge |

Overview

Provides a GPT-based semantic verification system for evaluating model responses against ground truth answers in visual question-answering tasks.

Description

This module implements an evaluation template (`EVAL_TMPL`) that instructs a GPT judge to determine semantic equivalence between model responses and ground truth answers. The verification considers a response correct if it conveys the same meaning (even with different phrasing) or provides additional relevant details, and incorrect if it contradicts the ground truth or includes incorrect information. The `_process_digit_article` function normalizes text by converting word-form numbers to digits and removing articles.
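The normalization step described above can be sketched as follows. This is a minimal illustration, not the code in `omni_verifier.py`: the function name `normalize_digits_articles` and both lookup tables are hypothetical, and the real `_process_digit_article` may handle more words or extra cases such as contractions.

```python
# Hypothetical lookup tables; the real module may cover more words.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}
ARTICLES = {"a", "an", "the"}


def normalize_digits_articles(text: str) -> str:
    """Lowercase the text, map number words to digits, drop English articles."""
    kept = []
    for word in text.lower().split():
        if word in ARTICLES:
            continue  # articles carry no answer content
        kept.append(NUMBER_WORDS.get(word, word))
    return " ".join(kept)


print(normalize_digits_articles("The two dogs and a cat"))  # prints "2 dogs and cat"
```

Normalizing both the response and the ground truth this way lets answers such as "two dogs" and "2 dogs" compare as equal before any judge call is made.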

Usage

Called internally by the corresponding dataset class during evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/omni_verifier.py, Lines: L1-220
  • Import: from vlmeval.dataset.utils.omni_verifier import EVAL_TMPL, _process_digit_article

Key Functions:

EVAL_TMPL = """..."""
def _process_digit_article(inText): ...

I/O Contract

| Direction | Description |
|---|---|
| Inputs | Model response string and ground truth answer string for semantic comparison |
| Outputs | "yes" or "no" string indicating semantic correctness |
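A judge model may wrap its verdict in extra words, so a caller typically reduces the raw reply to the "yes" or "no" contract before scoring. The sketch below shows one way to do that. The helpers `parse_verdict` and `accuracy` are hypothetical and not part of VLMEvalKit, and treating a reply with no explicit verdict as "no" is an assumption.

```python
import re


def parse_verdict(judge_reply: str) -> str:
    """Return the first standalone "yes" or "no" in the reply (hypothetical helper)."""
    match = re.search(r"\b(yes|no)\b", judge_reply.lower())
    # Assumption: a reply without an explicit verdict counts as incorrect.
    return match.group(1) if match else "no"


def accuracy(judge_replies: list[str]) -> float:
    """Fraction of replies whose parsed verdict is "yes"."""
    verdicts = [parse_verdict(reply) for reply in judge_replies]
    return verdicts.count("yes") / len(verdicts) if verdicts else 0.0


replies = ["Yes.", "No, the response contradicts the ground truth.", "YES"]
print([parse_verdict(r) for r in replies])  # prints ['yes', 'no', 'yes']
```

The word-boundary pattern keeps words such as "not" or "cannot" from being read as a "no" verdict.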

Usage Examples

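from vlmeval.dataset.utils.omni_verifier import EVAL_TMPL

# Fill the judge prompt with the model's answer and the reference answer.
# The keyword names must match the placeholders defined in EVAL_TMPL.
prompt = EVAL_TMPL.format(response="A red car", ground_truth="A red vehicle")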
