Implementation:Open compass VLMEvalKit MEGABench General Numerical Match

From Leeroopedia
Field    Value
source   VLMEvalKit
domain   Vision, Evaluation, Numerical Comparison, Mathematics

Overview

Provides numerical matching and comparison utilities for the MEGA-Bench evaluation framework, including LaTeX parsing and list comparison.

Description

This module implements numerical comparison functions adapted from TIGER-AI-Lab's MAmmoTH project. The `eval_with_timeout` function evaluates a mathematical expression in a separate process via multiprocessing and gives up once the timeout elapses, so a slow or non-terminating expression does not block evaluation. The `compare_two_list` function handles list-to-list numerical comparison. The module also uses sympy's LaTeX parser to evaluate symbolic expressions and includes a `TimeoutException` handler for long-running computations. The `SimpleStrMatch` class serves as a fallback for non-numerical comparisons.
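
The subprocess-with-timeout pattern described above can be sketched as follows. This is a minimal illustration and not the module's code: `eval_in_subprocess` and `_worker` are hypothetical names, the `fork` start method assumes a POSIX system, and returning `None` on timeout or error is a choice made for this example only.

```python
import multiprocessing as mp
import queue


def _worker(expression, results):
    """Evaluate the expression in the child process and report the outcome."""
    try:
        # Empty builtins limit the demo to plain arithmetic; this is not a sandbox.
        results.put(("ok", eval(expression, {"__builtins__": {}}, {})))
    except Exception as exc:
        results.put(("error", repr(exc)))


def eval_in_subprocess(expression, timeout=5.0):
    """Hypothetical helper: value of `expression`, or None on timeout or error."""
    ctx = mp.get_context("fork")  # assumption: POSIX platform
    results = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(expression, results))
    proc.start()
    try:
        # Wait for the child's answer, but no longer than `timeout` seconds.
        status, value = results.get(timeout=timeout)
    except queue.Empty:
        status, value = "timeout", None
    if proc.is_alive():
        proc.terminate()  # stop a runaway (or slow-to-exit) child
    proc.join()
    return value if status == "ok" else None
```

With this sketch, `eval_in_subprocess("2 + 3")` returns `5`, while an expression that raises or runs past the timeout yields `None`. Reading from the queue before joining the process avoids waiting on a child that is still flushing its result.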

Usage

Called internally by the corresponding dataset class during evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/megabench/scoring/general_numerical_match.py, Lines: L1-253
  • Import: from vlmeval.dataset.utils.megabench.scoring.general_numerical_match import eval_with_timeout, compare_two_list

Key Functions:

def eval_with_timeout(expression, timeout=5): ...
def compare_two_list(pred, gt): ...
def run_eval(expression, output): ...

I/O Contract

Direction   Description
Inputs      Mathematical expression strings or numerical values/lists for comparison
Outputs     Evaluation results or boolean/float match scores
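
To make the list-comparison side of this contract concrete, here is a hedged sketch of a tolerance-based list match that returns a boolean. It is not the body of `compare_two_list`: the name `lists_match`, the `rel_tol` parameter, the fixed absolute tolerance, and the order-insensitive comparison are all assumptions made for illustration.

```python
import math
from numbers import Real


def lists_match(pred, gt, rel_tol=1e-2):
    """Hypothetical helper: True when both lists hold the same numbers within rel_tol."""
    if not isinstance(pred, list) or not isinstance(gt, list):
        return False
    if len(pred) != len(gt):
        return False
    # Reject non-numeric entries (bool counts as a Real, so exclude it explicitly).
    if any(isinstance(v, bool) or not isinstance(v, Real) for v in pred + gt):
        return False
    # Assumption: order does not matter, so compare the sorted sequences pairwise.
    return all(
        math.isclose(p, g, rel_tol=rel_tol, abs_tol=1e-9)
        for p, g in zip(sorted(pred), sorted(gt))
    )
```

For example, `lists_match([1.0, 2.0], [2.0, 1.001])` is `True` under the default 1% relative tolerance, whereas lists of different lengths or lists containing strings are rejected outright.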

Usage Examples

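from vlmeval.dataset.utils.megabench.scoring.general_numerical_match import eval_with_timeout

# eval_with_timeout uses multiprocessing, so guard the call: platforms that
# spawn worker processes re-import the main module.
if __name__ == "__main__":
    result = eval_with_timeout("2 + 3", timeout=5)
    # result = 5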
