Implementation:Open compass VLMEvalKit OlympiadBench Utils
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Mathematics, Olympiad Problems |
Overview
Provides answer extraction and mathematical equivalence checking utilities for the OlympiadBench competition-level math evaluation benchmark.
Description
This module implements answer extraction using GPT-4-style in-context examples: get_gpt4_extract_ICE provides six demonstrations covering various math answer formats. To decide whether two LaTeX expressions are equivalent, it compares them symbolically with SymPy, using parse_latex, simplify, and Eq. The module handles diverse mathematical notations, including decimal numbers, fractions, intervals, domains/ranges, and complex expressions. Because symbolic simplification can be computationally expensive, those operations are guarded with timeout_decorator.
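The timeout protection mentioned above can be illustrated with a minimal sketch. It is a standard-library stand-in, not the timeout_decorator package and not the module's actual code: `with_timeout`, `slow_check`, and `safe_check` are hypothetical names, the 0.2 second limit is arbitrary, and the approach assumes a Unix platform with the call made on the main thread, since it relies on SIGALRM.

```python
import functools
import signal
import time


def with_timeout(seconds):
    """Hypothetical decorator: raise TimeoutError if the call runs too long.

    Assumption: Unix only, and the decorated function is called on the
    main thread, because the implementation relies on SIGALRM.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            def on_alarm(signum, frame):
                raise TimeoutError(f"{func.__name__} exceeded {seconds}s")

            previous = signal.signal(signal.SIGALRM, on_alarm)
            signal.setitimer(signal.ITIMER_REAL, seconds)
            try:
                return func(*args, **kwargs)
            finally:
                signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending alarm
                signal.signal(signal.SIGALRM, previous)  # restore the old handler
        return wrapper
    return decorator


@with_timeout(0.2)
def slow_check(delay):
    # Stand-in for an expensive symbolic simplification.
    time.sleep(delay)
    return True


def safe_check(delay):
    """Treat a timed-out comparison as 'not equivalent' instead of hanging."""
    try:
        return slow_check(delay)
    except TimeoutError:
        return False
```

With this guard, a single pathological expression costs at most the time limit rather than stalling the whole evaluation run. Whether a timeout should count as "not equivalent" is a design choice made here for illustration only.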
Usage
Called internally by the OlympiadBench dataset class during answer evaluation.
Code Reference
- Source: vlmeval/dataset/utils/olympiadbench.py, Lines: L1-703
- Import: from vlmeval.dataset.utils.olympiadbench import get_gpt4_extract_ICE
Key Functions:
def get_gpt4_extract_ICE(): ...
def is_equiv(str1, str2): ...
def normalize_answer(answer): ...
def evaluate_answer(prediction, reference): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Model response string for answer extraction; predicted and reference answer strings for equivalence checking |
| Outputs | Extracted answer strings; boolean equivalence results for mathematical expressions |
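To make the equivalence half of this contract concrete (two answer strings in, one boolean out), here is a minimal sketch restricted to decimals and simple fractions. It uses only the standard library rather than SymPy, so it is not how the module itself compares expressions. The names `to_fraction` and `numbers_equivalent` and the 1e-8 tolerance are assumptions made for illustration.

```python
import re
from fractions import Fraction

# Matches a simple LaTeX fraction such as \frac{3}{4}, \dfrac{3}{4}, or -\frac{3}{4}.
_FRAC = re.compile(r"^(-?)\\d?frac\{(-?\d+)\}\{(-?\d+)\}$")


def to_fraction(text):
    """Parse a decimal, 'a/b', or simple LaTeX fraction; return None if unsupported."""
    cleaned = text.strip().strip("$").replace(" ", "")
    match = _FRAC.match(cleaned)
    if match:
        sign, num, den = match.groups()
        if int(den) == 0:
            return None
        value = Fraction(int(num), int(den))
        return -value if sign else value
    try:
        return Fraction(cleaned)  # handles strings like "0.75", "3/4", "-2"
    except (ValueError, ZeroDivisionError):
        return None


def numbers_equivalent(prediction, reference, tolerance=Fraction(1, 10**8)):
    """Return True when both strings denote the same number within tolerance."""
    a, b = to_fraction(prediction), to_fraction(reference)
    if a is None or b is None:
        # Outside this sketch's scope: fall back to exact string comparison.
        return prediction.strip() == reference.strip()
    return abs(a - b) <= tolerance


print(numbers_equivalent("0.75", r"\frac{3}{4}"))
print(numbers_equivalent("1/3", "0.333"))
```

Intervals, domains/ranges, and general expressions are deliberately left out; covering them is where a symbolic comparison like the SymPy-based one described earlier becomes necessary.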
Usage Examples
# Internal usage example
from vlmeval.dataset.utils.olympiadbench import get_gpt4_extract_ICE
ice_examples = get_gpt4_extract_ICE()