Principle:Iamhankai Forest of Thought Answer Equivalence Checking

Knowledge Sources	SymPy; Measuring Mathematical Problem Solving with MATH
Domains	Evaluation, Mathematics
Last Updated	2026-02-14 03:00 GMT

Overview

A multi-layer comparison strategy for determining whether a predicted mathematical answer is equivalent to the ground truth, handling diverse formats including numeric, symbolic, and LaTeX representations.

Description

Answer Equivalence Checking addresses a fundamental challenge in evaluating math reasoning: the same answer can be written in many ways (e.g., "1/2", "0.5", "\frac{1}{2}", "50%"). The pattern implements a cascade of increasingly sophisticated comparison methods:

Direct string match: Simple string equality after normalization
Numeric comparison: Float conversion with tolerance
LaTeX parsing: Extract and compare boxed/formatted answers
Symbolic equivalence: SymPy-based algebraic simplification and comparison
Vector/set comparison: Parse and compare mathematical structures

This is critical for accurate benchmark evaluation, where naive string matching would undercount correct answers.
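As a rough illustration of the numeric layer, the sketch below parses the example forms from the Description ("1/2", "0.5", "50%") into floats and compares them with a tolerance. The helper names `to_number` and `numeric_equal` and the default tolerance of 1e-6 are assumptions made for this example; they are not taken from the FoT code.

```python
import math
from fractions import Fraction

def to_number(text):
    """Parse '0.5', '1/2', or '50%' into a float; return None if not numeric."""
    s = text.strip()
    scale = 1.0
    if s.endswith("%"):
        # Percentages are converted to their base form (50% -> 0.5).
        s, scale = s[:-1], 0.01
    try:
        # Fraction accepts both decimal strings and 'a/b' strings.
        return float(Fraction(s)) * scale
    except (ValueError, ZeroDivisionError):
        return None

def numeric_equal(gt, predicted, tol=1e-6):
    """Compare two answer strings numerically (tol is an assumed default)."""
    a, b = to_number(gt), to_number(predicted)
    if a is None or b is None:
        return False  # not numeric; a later layer would have to decide
    return math.isclose(a, b, rel_tol=tol, abs_tol=tol)

print(numeric_equal("1/2", "0.5"))
print(numeric_equal("50%", "0.5"))
```

Returning False for non-numeric input lets a caller fall through to the LaTeX or symbolic layers rather than raising an exception.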

Usage

Used throughout Forest of Thought (FoT) to evaluate predictions against ground truth. It is called by the result logging step in benchmark evaluation and by the CGDM post-processing pipeline for accuracy reporting.

Theoretical Basis

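The equivalence check is a cascade ordered from strictest to most permissive: each layer runs only if the previous one failed to establish equality, and the first success returns True. The pseudo-code shows three of the five layers: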

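# Sketch of the answer equivalence cascade; normalize() is an assumed helper
import math
import sympy

def check(gt, predicted, tol=1e-6):  # tol is an assumed default
    gt, predicted = normalize(gt), normalize(predicted)
    if gt == predicted:  # 1. string match after normalization
        return True
    try:  # 2. numeric comparison with tolerance
        if math.isclose(float(gt), float(predicted), rel_tol=tol, abs_tol=tol):
            return True
    except ValueError:
        pass  # not plain numbers; fall through to the symbolic layer
    try:  # 3. symbolic equivalence: the difference simplifies to zero
        return sympy.simplify(sympy.sympify(gt) - sympy.sympify(predicted)) == 0
    except (sympy.SympifyError, TypeError):
        return False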
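
The pseudo-code above leaves out the vector/set layer listed in the Description. A minimal sketch of that layer follows, assuming answers arrive as flat, comma-separated numeric elements inside matching brackets. The names `split_elements` and `structure_equal` are hypothetical, and nested structures or symbolic elements are deliberately not handled.

```python
import math

def split_elements(text):
    """Split '(1, 2)' or '{1, 2}' into (opening bracket, element strings)."""
    s = text.strip()
    if len(s) >= 2 and (s[0], s[-1]) in {("(", ")"), ("[", "]"), ("{", "}")}:
        return s[0], [part.strip() for part in s[1:-1].split(",")]
    return None  # not a bracketed structure

def structure_equal(gt, predicted, tol=1e-6):
    """Compare two flat numeric tuples, lists, or sets element by element."""
    a, b = split_elements(gt), split_elements(predicted)
    if a is None or b is None or a[0] != b[0] or len(a[1]) != len(b[1]):
        return False
    try:
        xs = [float(x) for x in a[1]]
        ys = [float(y) for y in b[1]]
    except ValueError:
        return False  # non-numeric elements are out of scope for this sketch
    if a[0] == "{":
        # Sets are unordered, so compare them in sorted order.
        xs, ys = sorted(xs), sorted(ys)
    return all(math.isclose(x, y, rel_tol=tol, abs_tol=tol) for x, y in zip(xs, ys))

print(structure_equal("(1, 2.0)", "(1.0,2)"))
print(structure_equal("{2, 1}", "{1, 2}"))
```

Order matters for parenthesized and square-bracketed answers (vectors, points, intervals) but not for brace-delimited sets, which is why only the set case is sorted before comparison.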

Key normalization steps include:

Removing LaTeX formatting (\text{}, \mathrm{}, etc.)
Standardizing fractions, square roots, and operators
Converting units and percentages to base form
Handling multiple answer formats (boxed, inline, ####-delimited)
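
Some of the normalization steps above can be sketched with regular expressions. This is an illustrative approximation under stated assumptions, not the project's actual rules: the function names `extract_boxed`, `normalize`, and `extract_answer` are chosen for the example, only a few LaTeX commands are covered, and unit conversion is left out.

```python
import re

def extract_boxed(text):
    """Return the contents of the last \\boxed{...}, honoring nested braces."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth, out = 1, []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(out)
        out.append(ch)
    return None  # unbalanced braces

def normalize(answer):
    """Reduce a LaTeX-flavored answer string to a plain canonical form."""
    s = answer.strip().strip("$")
    # Unwrap \text{...} and \mathrm{...}, keeping their contents.
    s = re.sub(r"\\(?:text|mathrm)\{([^{}]*)\}", r"\1", s)
    # Standardize \frac, \dfrac, \tfrac to (a)/(b) and \sqrt{x} to sqrt(x).
    s = re.sub(r"\\[dt]?frac\{([^{}]+)\}\{([^{}]+)\}", r"(\1)/(\2)", s)
    s = re.sub(r"\\sqrt\{([^{}]+)\}", r"sqrt(\1)", s)
    # Standardize operators and the escaped percent sign.
    s = s.replace("\\%", "%").replace("\\cdot", "*").replace("\\times", "*")
    return re.sub(r"\s+", "", s)  # drop all whitespace

def extract_answer(response):
    """Handle boxed, ####-delimited, and inline answers, then normalize."""
    boxed = extract_boxed(response)
    if boxed is not None:
        return normalize(boxed)
    if "####" in response:
        return normalize(response.rsplit("####", 1)[1])
    return normalize(response)

print(extract_answer("So the result is \\boxed{\\frac{1}{2}}."))
print(extract_answer("3 + 4 = 7 #### 7"))
```

The brace-counting loop is used instead of a single regular expression because boxed answers often contain nested braces, as in \boxed{\frac{1}{2}}.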

Related Pages

Implemented By

Implementation:Iamhankai_Forest_of_Thought_Check
