Implementation:Open compass VLMEvalKit MEGABench General Numerical Match
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Numerical Comparison, Mathematics |
Overview
Provides numerical matching and comparison utilities for the MEGA-Bench evaluation framework, including LaTeX parsing and list comparison.
Description
This module implements numerical comparison functions adapted from TIGER-AI-Lab's MAmmoTH project. The `eval_with_timeout` function evaluates a mathematical expression in a separate process (via multiprocessing) so that a long-running evaluation can be cut off once the timeout expires. The `compare_two_list` function handles list-to-list numerical comparison. The module also integrates with sympy's LaTeX parser for symbolic expression evaluation and includes a `TimeoutException` handler for long-running computations. The `SimpleStrMatch` class serves as a fallback for non-numerical comparisons.
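The timeout pattern described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the names `evaluate_with_deadline` and `_worker`, the `(status, value)` return shape, the use of `eval` with stripped builtins, and the `fork` start method (which assumes a POSIX host) are all assumptions made for the example.

```python
import multiprocessing as mp
import queue


def _worker(expression, output):
    """Hypothetical child-process entry point: evaluate and report the outcome."""
    try:
        # Assumption: plain eval() with builtins removed. This is NOT a real
        # sandbox; it only keeps the sketch focused on arithmetic.
        value = eval(expression, {"__builtins__": {}}, {})
        output.put(("ok", value))
    except Exception as exc:
        output.put(("error", repr(exc)))


def evaluate_with_deadline(expression, timeout=5.0):
    """Return (status, value), where status is 'ok', 'error', or 'timeout'."""
    ctx = mp.get_context("fork")  # assumption: POSIX host with fork available
    output = ctx.Queue()
    proc = ctx.Process(target=_worker, args=(expression, output))
    proc.start()
    try:
        # Wait for a result for at most `timeout` seconds.
        result = output.get(timeout=timeout)
    except queue.Empty:
        result = ("timeout", None)
    finally:
        # Stop the child if it is still running, then reap it.
        if proc.is_alive():
            proc.terminate()
        proc.join()
    return result


if __name__ == "__main__":
    print(evaluate_with_deadline("2 + 3"))  # ('ok', 5)
```

Reading from the queue before joining the process avoids waiting on a child that is itself blocked on handing over its result. Returning an explicit status keeps a timeout distinguishable from a legitimate result.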
Usage
These utilities are called internally by the MEGA-Bench dataset class during evaluation.
Code Reference
- Source: `vlmeval/dataset/utils/megabench/scoring/general_numerical_match.py`, Lines: L1-253
- Import: `from vlmeval.dataset.utils.megabench.scoring.general_numerical_match import eval_with_timeout, compare_two_list`
Key Functions:
```python
def eval_with_timeout(expression, timeout=5): ...
def compare_two_list(pred, gt): ...
def run_eval(expression, output): ...
```
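To make list-to-list numerical comparison concrete, here is a small sketch of one way such a check could work. It does not reproduce `compare_two_list`: the function name `lists_match`, the order-insensitive matching, and the 1% relative tolerance are assumptions chosen for illustration.

```python
from numbers import Real


def lists_match(pred, gt, rel_tol=0.01):
    """Hypothetical check: same length, all numeric, element-wise close."""
    # Reject predictions that are not lists or have the wrong length.
    if not isinstance(pred, list) or len(pred) != len(gt):
        return False
    # Reject non-numeric entries (bool is excluded although it subclasses int).
    if not all(isinstance(x, Real) and not isinstance(x, bool) for x in pred):
        return False
    # Assumption: order does not matter, so compare the sorted values pairwise
    # using a tolerance relative to the ground-truth magnitude.
    return all(
        abs(p - g) <= rel_tol * abs(g)
        for p, g in zip(sorted(pred), sorted(gt))
    )
```

A call such as `lists_match([2.0, 1.0], [1.0, 2.0])` returns `True` under these assumptions, while a length mismatch or a non-list prediction returns `False`.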
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Mathematical expression strings or numerical values/lists for comparison |
| Outputs | Evaluation results or boolean/float match scores |
Usage Examples
```python
from vlmeval.dataset.utils.megabench.scoring.general_numerical_match import eval_with_timeout

result = eval_with_timeout("2 + 3", timeout=5)
# result = 5
```