Implementation:Open compass VLMEvalKit OlympiadBench Utils

Field	Value
source	VLMEvalKit
domain	Vision, Evaluation, Mathematics, Olympiad Problems

Overview

Provides answer extraction and mathematical equivalence checking utilities for the OlympiadBench competition-level math evaluation benchmark.

Description

This module supports answer extraction with GPT-4-style in-context examples: get_gpt4_extract_ICE provides six demonstration examples covering various math answer formats. To determine whether two LaTeX expressions are equivalent, it compares them symbolically with SymPy, using parse_latex, simplify, and Eq. The module handles diverse mathematical notation, including decimal numbers, fractions, intervals, domains/ranges, and complex expressions. Because symbolic simplification can be computationally expensive, those operations are protected with timeout_decorator.
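The timeout protection mentioned above can be illustrated with a minimal sketch. It uses only the Python standard library rather than the timeout_decorator package the module relies on, and the names run_with_timeout and slow_check are hypothetical, chosen for illustration only. The idea is the same: if a comparison does not finish in time, treat the result as "not equivalent" instead of letting the evaluation hang.

```python
import concurrent.futures
import time


def run_with_timeout(func, *args, timeout_s=1.0, default=False):
    """Run func(*args); return `default` if no result arrives within timeout_s.

    Hypothetical helper for illustration. It is not part of VLMEvalKit and is
    not how timeout_decorator works internally.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(func, *args)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        # The comparison took too long: fall back to the default verdict.
        return default
    finally:
        # Do not block here; a worker thread cannot be killed, only abandoned.
        executor.shutdown(wait=False)


def slow_check():
    """Stand-in for an expensive symbolic simplification."""
    time.sleep(0.3)
    return True


print(run_with_timeout(lambda: True))               # fast call finishes in time
print(run_with_timeout(slow_check, timeout_s=0.05))  # slow call hits the timeout
```

One design caveat of this sketch: the abandoned worker thread keeps running until its function returns, so it bounds the caller's waiting time but not the total CPU time spent.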

Usage

Called internally by the OlympiadBench dataset class during answer evaluation.

Code Reference

Source: vlmeval/dataset/utils/olympiadbench.py, Lines: L1-703
Import: from vlmeval.dataset.utils.olympiadbench import get_gpt4_extract_ICE

Key Functions:

def get_gpt4_extract_ICE(): ...
def is_equiv(str1, str2): ...
def normalize_answer(answer): ...
def evaluate_answer(prediction, reference): ...

I/O Contract

Direction	Description
Inputs	Model response string for answer extraction; predicted and reference answer strings for equivalence checking
Outputs	Extracted answer strings; boolean equivalence results for mathematical expressions
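
As a rough illustration of the contract above (two answer strings in, one boolean out), the following sketch compares decimal and fraction answers numerically. It is an assumption-laden stand-in, not the module's implementation: the real code compares expressions symbolically with SymPy, while the hypothetical helpers to_fraction and numerically_equal below use only the standard library and understand plain decimals, a/b strings, and simple \frac{a}{b} forms.

```python
import re
from fractions import Fraction

# Matches \frac{a}{b} or \dfrac{a}{b} with integer numerator and denominator.
_FRAC = re.compile(r"^\\d?frac\{(-?\d+)\}\{(-?\d+)\}$")


def to_fraction(text):
    """Parse a decimal, an 'a/b' string, or a simple LaTeX fraction.

    Returns None when the text is not a plain number (hypothetical helper).
    """
    s = text.strip().strip("$").replace(" ", "")
    sign = 1
    if s.startswith("-"):
        sign, s = -1, s[1:]
    match = _FRAC.match(s)
    try:
        if match:
            return sign * Fraction(int(match.group(1)), int(match.group(2)))
        return sign * Fraction(s)
    except (ValueError, ZeroDivisionError):
        return None


def numerically_equal(pred, ref, tol=1e-8):
    """Return True if both answers parse to numbers within `tol` of each other.

    Non-numeric answers fall back to an exact string comparison.
    """
    a, b = to_fraction(pred), to_fraction(ref)
    if a is None or b is None:
        return pred.strip() == ref.strip()
    return abs(a - b) <= Fraction(tol)


print(numerically_equal("0.5", r"\frac{1}{2}"))  # decimal vs. LaTeX fraction
print(numerically_equal("0.33", "1/3"))          # close, but outside tolerance
```

The string fallback is deliberately strict: "x+1" and "1+x" are not treated as equal here, which is exactly the gap that symbolic comparison is meant to close.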

Usage Examples

# Internal usage example
from vlmeval.dataset.utils.olympiadbench import get_gpt4_extract_ICE
ice_examples = get_gpt4_extract_ICE()

Related Pages

Principle:Open_compass_VLMEvalKit_Benchmark_Dataset_Construction
