Implementation:Open compass VLMEvalKit OlympiadBench Utils

From Leeroopedia
Revision as of 13:31, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Open_compass_VLMEvalKit_OlympiadBench_Utils.md)
  • Source: VLMEvalKit
  • Domain: Vision, Evaluation, Mathematics, Olympiad Problems

Overview

Provides answer extraction and mathematical equivalence checking utilities for the OlympiadBench competition-level math evaluation benchmark.

Description

This module supports answer extraction through GPT-4-style in-context examples: get_gpt4_extract_ICE supplies six demonstrations covering a range of math answer formats. For equivalence checking it relies on SymPy, using parse_latex to parse LaTeX expressions and simplify and Eq to decide whether two expressions are equivalent. It handles diverse mathematical notation, including decimal numbers, fractions, intervals, domains and ranges, and complex expressions. Because symbolic simplification can be computationally expensive, those operations are guarded with timeout_decorator.
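The timeout-guarded comparison described above can be illustrated with a small standard-library sketch. It is not the module's code: a daemon thread stands in for timeout_decorator, a Fraction-based numeric check stands in for the SymPy comparison, and the names run_with_timeout, numeric_equal, and safe_equal, the 1e-8 tolerance, and the 2-second budget are all assumptions made for illustration.

```python
import threading
from fractions import Fraction


class ComparisonTimeout(Exception):
    """Raised when a comparison exceeds its time budget."""


def run_with_timeout(func, seconds, *args):
    """Run func(*args) in a daemon thread and give up after `seconds`.

    Thread-based stand-in for timeout_decorator (an assumption, not the
    module's mechanism). An abandoned worker keeps running in the
    background until it finishes or the process exits.
    """
    outcome = {}

    def worker():
        try:
            outcome["value"] = func(*args)
        except Exception as exc:  # forwarded to the caller below
            outcome["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(seconds)
    if thread.is_alive():
        raise ComparisonTimeout()
    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def numeric_equal(pred, ref, tol=1e-8):
    """Compare plain numeric strings such as '0.5' and '1/2' (hypothetical helper)."""
    return abs(Fraction(pred) - Fraction(ref)) <= Fraction(tol)


def safe_equal(pred, ref, compare=numeric_equal, seconds=2.0):
    """Return False, rather than hanging or raising, when the comparison fails."""
    if pred.strip() == ref.strip():
        return True  # identical strings need no further work
    try:
        return bool(run_with_timeout(compare, seconds, pred, ref))
    except (ComparisonTimeout, ValueError, ZeroDivisionError):
        return False
```

Treating a timeout or parse failure as "not equivalent" keeps a single pathological expression from stalling an entire evaluation run, at the cost of occasionally scoring a correct answer as wrong.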

Usage

Called internally by the OlympiadBench dataset class during answer evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/olympiadbench.py, Lines: L1-703
  • Import: from vlmeval.dataset.utils.olympiadbench import get_gpt4_extract_ICE

Key Functions:

def get_gpt4_extract_ICE(): ...
def is_equiv(str1, str2): ...
def normalize_answer(answer): ...
def evaluate_answer(prediction, reference): ...

I/O Contract

  • Inputs: Model response string for answer extraction; predicted and reference answer strings for equivalence checking
  • Outputs: Extracted answer strings; boolean equivalence results for mathematical expressions

Usage Examples

# Internal usage example: fetch the in-context examples used for answer extraction
from vlmeval.dataset.utils.olympiadbench import get_gpt4_extract_ICE
ice_examples = get_gpt4_extract_ICE()
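As a rough illustration of how such in-context examples might feed an extraction prompt, the sketch below joins a list of demonstration strings with a new question and model response. It assumes get_gpt4_extract_ICE returns a list of strings, which this page does not confirm. build_extraction_prompt, the task description text, and the two demo examples are hypothetical stand-ins, not the module's own prompt.

```python
def build_extraction_prompt(ice_examples, question, response,
                            task_description="Extract the final answer from the model response.\n"):
    """Assemble a few-shot prompt: instructions, demonstrations, then the new case.

    Hypothetical helper; the real module may format its prompt differently.
    """
    parts = [task_description]
    # Each demonstration is assumed to be a self-contained text block.
    parts.extend(example.strip() + "\n" for example in ice_examples)
    # End on an open "Extracted answer:" slot for the model to complete.
    parts.append(
        f"Question: {question}\nModel response: {response}\nExtracted answer:"
    )
    return "\n".join(parts)


# Stand-in demonstrations (invented for illustration only).
demo_examples = [
    "Question: What is 1/2 + 1/4?\n"
    "Model response: Adding the fractions gives 3/4.\n"
    "Extracted answer: 3/4",
    "Question: Solve x^2 <= 4.\n"
    "Model response: So x lies between -2 and 2 inclusive.\n"
    "Extracted answer: [-2, 2]",
]

prompt = build_extraction_prompt(
    demo_examples,
    question="Solve x + 2 = 5.",
    response="Subtracting 2 from both sides, x = 3.",
)
```

In real use, the list returned by get_gpt4_extract_ICE() would take the place of demo_examples, provided it has the assumed shape.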
