

Implementation:Sail sg LongSpec LaTeX Normalization Utils

From Leeroopedia
Knowledge Sources
Domains NLP, Evaluation, Mathematics
Last Updated 2026-02-14 05:00 GMT

Overview

A concrete tool for normalizing LaTeX math strings and checking them for equivalence. It provides boxed-answer extraction and string-based math comparison, both adapted from MetaMath.

Description

The math_util.py module provides LaTeX string normalization utilities that originate in the MetaMath repository. Its key functions are last_boxed_only_string, which extracts the last \boxed{} or \fbox{} span from a solution string; strip_string, which performs the full LaTeX normalization (fraction fixing, sqrt normalization, unit removal, whitespace cleanup); is_equiv, which checks two strings for equality after normalization; and _clean_numbers, which formats large numbers with comma separators.
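The extraction step can be pictured as brace matching from the last box command onward. The following is a minimal sketch of that idea, not the module's code: the name extract_last_boxed is hypothetical, and the sketch assumes the command is always followed directly by an opening brace.

```python
from typing import Optional


def extract_last_boxed(text: str) -> Optional[str]:
    """Return the last \\boxed{...} or \\fbox{...} span, wrapper included."""
    # Locate the right-most occurrence of either command (hypothetical helper,
    # not the module's last_boxed_only_string).
    start = max(text.rfind("\\boxed{"), text.rfind("\\fbox{"))
    if start < 0:
        return None

    depth = 0
    for i in range(start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace found: return the whole span.
                return text[start : i + 1]
    # Braces never balanced, so there is no well-formed span to return.
    return None
```

Counting brace depth rather than searching for the next closing brace is what lets nested arguments such as \frac{1}{2} survive inside the box.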

Usage

Import these utilities when you need to compare LaTeX math expressions by string normalization. The module serves as the equivalence engine for MetaMath-style evaluation in OpenAIMATHCallBack, and math_gold_answer_extractor uses it to extract gold answers.

Code Reference

Source Location

Signature

from typing import Optional

def last_boxed_only_string(string: str) -> Optional[str]:
    """Find the last \\boxed{} or \\fbox{} in string and return the whole span, wrapper included."""

def strip_string(string: str) -> str:
    """Normalize LaTeX string: remove units, fix fracs/sqrt, standardize formatting."""

def is_equiv(str1: str, str2: str, verbose: bool = False) -> bool:
    """Check if two math strings are equivalent after normalization."""

def fix_fracs(string: str) -> str:
    """Convert \\frac1b -> \\frac{1}{b} shorthand notation."""

def fix_a_slash_b(string: str) -> str:
    """Convert a/b -> \\frac{a}{b} for simple integer fractions."""

def fix_sqrt(string: str) -> str:
    """Convert \\sqrt3 -> \\sqrt{3} shorthand notation."""

class NotEqual:
    """Sentinel object that is never equal to anything."""
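The shorthand fixes named in the docstrings above can be illustrated with two regular expressions. This is a simplified sketch under stated assumptions, not the module's implementation: the function names are hypothetical, only single-character arguments are handled, and mixed forms such as \frac1{b} are left untouched.

```python
import re


def brace_frac_shorthand(s: str) -> str:
    """Rewrite \\frac12 or \\frac1b as \\frac{1}{2} or \\frac{1}{b}."""
    # Two single alphanumeric arguments directly after \frac, without braces.
    return re.sub(r"\\frac([0-9a-zA-Z])([0-9a-zA-Z])", r"\\frac{\1}{\2}", s)


def brace_sqrt_shorthand(s: str) -> str:
    """Rewrite \\sqrt3 as \\sqrt{3}."""
    # Only the first character after \sqrt is taken as the argument.
    return re.sub(r"\\sqrt([0-9a-zA-Z])", r"\\sqrt{\1}", s)


def normalize_shorthand(s: str) -> str:
    """Apply the shorthand rewrites in sequence (illustrative only)."""
    # \dfrac is treated as \frac before the brace fixes run.
    s = s.replace("dfrac", "frac")
    return brace_sqrt_shorthand(brace_frac_shorthand(s))
```

Already-braced input passes through unchanged, so the rewrite is safe to apply to both sides of a comparison.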

Import

from data.math_util import is_equiv, last_boxed_only_string, strip_string

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| string / str1 / str2 | str | Yes | LaTeX math expression string |
| verbose | bool | No | Whether to print normalized strings for debugging |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| result | bool | Whether two strings are equivalent after normalization |
| boxed_content | str or None | Last \boxed{...} span, wrapper included (None if not found) |
| normalized | str | Normalized LaTeX string |
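Because extraction can return None, callers need a way to make a missing answer count as wrong. One option is a never-equal sentinel in the spirit of the NotEqual class. The sketch below uses a hypothetical stand-in named NeverEqual and a hypothetical helper answer_or_sentinel; how the module itself uses NotEqual is not shown in this page.

```python
from typing import Optional, Union


class NeverEqual:
    """Stand-in sentinel: every equality comparison against it is False."""

    def __eq__(self, other: object) -> bool:
        return False


def answer_or_sentinel(extracted: Optional[str]) -> Union[str, NeverEqual]:
    """Replace a failed extraction (None) with a value that matches nothing."""
    # Hypothetical helper: guarantees that None == None can never score
    # a missing prediction as correct against a missing reference.
    return NeverEqual() if extracted is None else extracted
```

With this pattern a plain == check between prediction and reference stays safe even when both extractions fail.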

Usage Examples

from data.math_util import is_equiv, last_boxed_only_string, strip_string

# Extract boxed answer
boxed = last_boxed_only_string("The answer is \\boxed{\\frac{1}{2}}")
# boxed = "\\boxed{\\frac{1}{2}}"

# Normalize LaTeX
normalized = strip_string("\\dfrac12")
# normalized = "\\frac{1}{2}"

# Equivalence check
assert is_equiv("\\frac{1}{2}", "\\frac12")  # shorthand is normalized before comparing
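# String-based only: decimals are never evaluated numerically.
# Upstream MATH-style normalizers special-case the exact string "0.5", so another decimal is used here.
assert not is_equiv("0.25", "\\frac{1}{4}")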
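Since the extracted span keeps its \boxed{...} wrapper, a comparison against a bare gold answer usually needs the wrapper removed first. A minimal sketch of that step follows; unwrap_boxed is a hypothetical helper written for illustration, not a function documented on this page.

```python
from typing import Optional


def unwrap_boxed(span: Optional[str]) -> Optional[str]:
    """Strip a leading \\boxed{ or \\fbox{ and the final } from a span."""
    if span is None:
        return None
    for prefix in ("\\boxed{", "\\fbox{"):
        if span.startswith(prefix) and span.endswith("}"):
            # Drop the command, its opening brace, and the closing brace.
            return span[len(prefix):-1]
    # Not a boxed span: signal failure rather than guess.
    return None
```

The unwrapped string can then be passed to the normalizer or the equivalence check alongside the reference answer.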
