Principle:Sail sg LongSpec Math Answer Extraction
| Knowledge Sources | |
|---|---|
| Domains | NLP, Evaluation, Mathematics |
| Last Updated | 2026-02-14 05:00 GMT |
Overview
Algorithmic principle for extracting and normalizing mathematical answers from free-form model-generated text using multi-strategy parsing with LaTeX normalization.
Description
Math Answer Extraction addresses the challenge of comparing model outputs to ground truth answers in mathematical reasoning benchmarks. Model outputs are typically free-form text that mixes LaTeX, natural language, and code, so the final answer must first be extracted and then normalized into a canonical form. Extraction uses a priority cascade in which the first strategy that finds an answer wins: (1) \\boxed{} extraction via brace matching, (2) pattern matching for "the answer is" / "answer is" phrases, (3) program output extraction from code blocks, and (4) fallback to the last number in the text. After extraction, LaTeX normalization unifies fraction notation (\\frac, \\dfrac, \\tfrac), expands square-root shorthand, removes units, and cleans up whitespace.
Usage
Apply this principle when building evaluation pipelines for math reasoning benchmarks (MATH, GSM8K, MathScale, etc.) where model outputs need to be parsed into comparable answer strings before equivalence checking.
Theoretical Basis
The extraction follows a priority cascade:
# Abstract algorithm (NOT real implementation)
def extract(text):
    if has_boxed(text):
        return extract_boxed(text)  # Brace-matching extraction
    elif has_pattern(text, "the answer is"):
        return extract_after_pattern(text)
    elif has_code_output(text):
        return extract_code_output(text)
    else:
        return extract_last_number(text)  # Regex fallback
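The cascade can be made concrete with a small, runnable sketch. This is an illustration under stated assumptions, not the project's actual implementation: the regular expressions, the choice to take the last match at each stage, the empty-string result when nothing is found, and the convention that program output appears in a fenced block tagged `output` are all hypothetical choices made for this example.

```python
import re

def extract_boxed(text):
    # Find the last \boxed{ and walk forward, tracking brace depth, so that
    # nested braces such as \boxed{\frac{1}{2}} are captured whole.
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    begin = start + len(marker)
    depth = 1
    for j in range(begin, len(text)):
        if text[j] == "{":
            depth += 1
        elif text[j] == "}":
            depth -= 1
            if depth == 0:
                return text[begin:j]
    return None  # unbalanced braces: treat as "no boxed answer"

# Assumed patterns (illustrative only).
ANSWER_RE = re.compile(r"(?:the\s+)?answer\s+is[:\s]*(.+)", re.IGNORECASE)
FENCE = "`" * 3  # built programmatically to avoid a literal fence in this listing
CODE_OUTPUT_RE = re.compile(FENCE + r"output\s*\n(.*?)" + FENCE, re.DOTALL)
NUMBER_RE = re.compile(r"-?\d[\d,]*(?:\.\d+)?")

def extract(text):
    # 1. \boxed{...} via brace matching
    boxed = extract_boxed(text)
    if boxed is not None:
        return boxed.strip()
    # 2. "the answer is ..." / "answer is ..." (rest of the line)
    phrases = ANSWER_RE.findall(text)
    if phrases:
        return phrases[-1].strip().rstrip(".").strip("$ ")
    # 3. Program output block (assumed fence tagged "output")
    outputs = CODE_OUTPUT_RE.findall(text)
    if outputs:
        return outputs[-1].strip()
    # 4. Fallback: last number in the text, thousands separators removed
    numbers = NUMBER_RE.findall(text)
    if numbers:
        return numbers[-1].replace(",", "")
    return ""
```

Brace matching is used for the first stage because a plain regular expression cannot reliably find the closing brace when the boxed content itself contains braces. The later stages are progressively less reliable, which is why they are consulted only when the earlier ones find nothing.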
LaTeX normalization applies a sequence of string transformations:
- Replace \\dfrac, \\tfrac, \\cfrac with \\frac
- Fix shorthand: \\frac12 to \\frac{1}{2}
- Fix shorthand: \\sqrt3 to \\sqrt{3}
- Convert a/b to \\frac{a}{b} for integer fractions
- Remove units, dollar signs, whitespace
- Strip \\left, \\right, \\text{} wrappers
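The transformations listed above can be sketched as a short chain of string rewrites. This is a minimal sketch, not the project's actual code: the function name `normalize_latex`, the order of the steps, and the rule that a trailing `\text{...}` following a digit counts as a unit are assumptions made for illustration. Plain-text units such as "5 cm" are not handled here.

```python
import re

def normalize_latex(ans):
    # Remove dollar signs (escaped or math-mode delimiters).
    s = ans.replace("\\$", "").replace("$", "").strip()
    # Unify \dfrac, \tfrac, \cfrac to \frac.
    s = re.sub(r"\\[dtc]frac", r"\\frac", s)
    # Strip \left and \right, leaving \leftarrow / \rightarrow untouched.
    s = re.sub(r"\\(left|right)(?![a-zA-Z])", "", s)
    # Assumption: a trailing \text{...} after a digit is a unit, so drop it.
    s = re.sub(r"(?<=\d)\s*\\text\{[^{}]*\}$", "", s)
    # Unwrap any remaining \text{...}, keeping its contents.
    s = re.sub(r"\\text\{([^{}]*)\}", r"\1", s)
    # Remove all whitespace.
    s = re.sub(r"\s+", "", s)
    # Expand shorthand: \frac12 -> \frac{1}{2}, \sqrt3 -> \sqrt{3}.
    s = re.sub(r"\\frac(\d)(\d)", r"\\frac{\1}{\2}", s)
    s = re.sub(r"\\sqrt([0-9a-zA-Z])", r"\\sqrt{\1}", s)
    # Convert a bare integer fraction a/b to \frac{a}{b}.
    m = re.fullmatch(r"(-?)(\d+)/(\d+)", s)
    if m:
        s = m.group(1) + "\\frac{" + m.group(2) + "}{" + m.group(3) + "}"
    return s

# Different surface forms of one half collapse to the same string.
forms = [r"\dfrac{1}{2}", r"\frac12", "1/2", r"$\tfrac 1 2$"]
canonical = {normalize_latex(f) for f in forms}
```

Whitespace is removed before the shorthand expansion so that spaced forms such as `\frac 1 2` are caught by the same rule as `\frac12`. After normalization, a simple string comparison is often enough as a first-pass equivalence check, with a symbolic comparison reserved for the cases where the strings still differ.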