Implementation:Iamhankai Forest of Thought Check
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Mathematics |
| Last Updated | 2026-02-14 03:00 GMT |
Overview
A concrete tool, provided by the Forest-of-Thought repository, for verifying predicted answers against ground truth.
Description
The check function implements multi-layer answer comparison. It first classifies the answer type (digit, option, yes/no, or formula), then applies the comparison logic appropriate to that type. For mathematical expressions, it delegates to is_equiv(), which normalizes both strings with strip_string() and then compares them symbolically with SymPy. The function also handles edge cases such as LaTeX vector parsing, percentage conversion, and multi-format answer extraction.
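The classify-then-compare flow described above can be illustrated with a minimal, standard-library-only sketch. It is not the repository's implementation: the names classify_answer, to_number, and sketch_check are hypothetical, the regular expressions are assumptions, and the formula branch uses a whitespace-insensitive string match as a stand-in for the is_equiv() and SymPy comparison.

```python
import re
from fractions import Fraction


def classify_answer(text: str) -> str:
    """Hypothetical classifier: returns 'yesno', 'option', 'digit', or 'formula'."""
    s = text.strip()
    if s.lower() in {"yes", "no"}:
        return "yesno"
    if re.fullmatch(r"\(?[A-E]\)?", s):
        return "option"
    if re.fullmatch(r"-?\d+(\.\d+)?%?", s.replace(",", "")):
        return "digit"
    return "formula"


def to_number(text: str) -> Fraction:
    """Parse a plain or percentage number into an exact Fraction."""
    s = text.strip().replace(",", "")
    if s.endswith("%"):
        return Fraction(s[:-1]) / 100
    return Fraction(s)


def sketch_check(gt: str, ans: str) -> bool:
    """Dispatch on the ground-truth type, then compare with type-specific logic."""
    kind = classify_answer(gt)
    if kind == "yesno":
        return gt.strip().lower() == ans.strip().lower()
    if kind == "option":
        return gt.strip("() ").upper() == ans.strip("() ").upper()
    if kind == "digit":
        try:
            return to_number(gt) == to_number(ans)
        except (ValueError, ZeroDivisionError):
            # The prediction is not a parseable number.
            return False
    # Formula branch: assumption, a string match replaces symbolic comparison.
    return re.sub(r"\s+", "", gt) == re.sub(r"\s+", "", ans)
```

Dispatching on the ground-truth type keeps each comparison rule small, and a malformed prediction simply fails the check instead of raising.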
Usage
The check function is called in the result-logging step of FoT benchmark evaluation and in CGDM post-processing accuracy evaluation. It is central to all accuracy reporting across the framework.
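As a rough illustration of how a checker with this signature feeds accuracy reporting, the sketch below averages per-sample results. The accuracy helper and the exact_match stand-in are hypothetical and are not taken from the repository; in practice the repository's check would be passed as the checker argument.

```python
from typing import Callable, Iterable, Tuple


def accuracy(
    pairs: Iterable[Tuple[str, str]],
    checker: Callable[[str, str, str], bool],
    data_name: str,
) -> float:
    """Fraction of (ground truth, prediction) pairs that the checker accepts."""
    results = [bool(checker(gt, ans, data_name)) for gt, ans in pairs]
    # An empty evaluation set yields 0.0 instead of dividing by zero.
    return sum(results) / len(results) if results else 0.0


def exact_match(gt: str, ans: str, data_name: str) -> bool:
    """Stand-in checker (assumption): plain string equality after trimming."""
    return gt.strip() == ans.strip()


samples = [("42", "42"), ("7", "8"), ("yes", " yes")]
score = accuracy(samples, exact_match, "gsm8k")
```

Passing the checker as a parameter keeps the aggregation independent of any particular comparison logic.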
Code Reference
Source Location
- Repository: Forest-of-Thought
- File: utils/utils.py
- Lines: L238-299
Signature
def check(gt, ans, DATA_NAME):
    """
    Verify if predicted answer matches ground truth.

    Args:
        gt (str): Ground truth answer string.
        ans (str): Predicted answer string.
        DATA_NAME (str): Dataset identifier affecting comparison logic.

    Returns:
        bool: True if answers are equivalent.
    """
Import
from utils.utils import check
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| gt | str | Yes | Ground truth answer |
| ans | str | Yes | Predicted answer from the model |
| DATA_NAME | str | Yes | Dataset name (gsm8k, math, aime) for type-aware comparison |
Outputs
| Name | Type | Description |
|---|---|---|
| is_correct | bool | True if predicted answer matches ground truth |
Usage Examples
from utils.utils import check
# Numeric comparison
assert check("42", "42", "gsm8k")

# LaTeX equivalence
assert check("\\frac{1}{2}", "0.5", "math")

# Boxed answer extraction
assert check("\\boxed{7}", "7", "math")

# Formula equivalence via SymPy
assert check("x^2 + 2x + 1", "(x+1)^2", "math")