Principle:Princeton nlp SimPO Benchmark Decontamination

Knowledge Sources	SimPO Contamination in LLM Benchmarks
Domains	Data_Quality, Evaluation, NLP
Last Updated	2026-02-08 04:30 GMT

Overview

A data filtering technique that removes benchmark evaluation content from training datasets to prevent artificially inflated evaluation scores.

Description

Benchmark Decontamination is a data quality practice that detects and removes training samples containing content from evaluation benchmarks. When training data overlaps with evaluation benchmarks (e.g., HumanEval, MBPP), the model can memorize evaluation answers, producing inflated scores that do not reflect genuine capability. Decontamination addresses this by scanning training text for substring matches against known benchmark content (docstrings, prompts, canonical solutions) and excluding any sample that contains a match. A whitelist of trivially simple patterns (e.g., return x + y) is exempt from matching, which avoids over-filtering common idioms that appear in both benchmarks and legitimate training data.

Usage

Apply this principle when curating training data for supervised fine-tuning or preference optimization where the model will later be evaluated on code generation benchmarks such as HumanEval. It is essential for any training pipeline that draws from broad web-scraped or code-based corpora where benchmark content may inadvertently appear.
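As a rough illustration of applying the principle to preference-optimization data, the sketch below checks every text field of a record and drops the record if any field contains a benchmark string. The field names (prompt, chosen, rejected), the helper names canon and record_is_clean, and the reference string are all assumptions made for this example; they are not taken from the SimPO code or from a real benchmark.

```python
from typing import Dict, List

# Hypothetical field names for a preference-optimization record.
TEXT_FIELDS = ("prompt", "chosen", "rejected")


def canon(text: str) -> str:
    """Lowercase and collapse runs of whitespace to a single space."""
    return " ".join(text.lower().split())


def record_is_clean(record: Dict[str, str], benchmark_refs: List[str]) -> bool:
    """Return False if any text field contains a normalized benchmark string."""
    for field in TEXT_FIELDS:
        text = canon(record.get(field, ""))
        if any(ref in text for ref in benchmark_refs):
            return False
    return True


# Illustrative reference string, not an actual benchmark item.
benchmark_refs = [canon("Return the sum of two numbers x and y.")]

records = [
    {
        "prompt": "Write a function that reverses a string.",
        "chosen": "def rev(s):\n    return s[::-1]",
        "rejected": "def rev(s):\n    return s",
    },
    {
        "prompt": "Implement add.",
        "chosen": 'def add(x, y):\n    """Return the sum of two numbers X and Y."""\n    return x + y',
        "rejected": "def add(x, y):\n    return x - y",
    },
]

# The whole record is dropped when any one field is contaminated.
clean_records = [r for r in records if record_is_clean(r, benchmark_refs)]
```

Dropping the whole record, rather than only the offending field, keeps each preference pair intact; whether that is the right granularity depends on the pipeline.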

Theoretical Basis

The core mechanism is substring containment checking with normalization:

Pseudo-code Logic:

# Abstract decontamination algorithm
benchmark_strings = load_benchmark_docstrings() + load_benchmark_solutions()
trivial_strings = define_trivial_patterns()

# Normalize each non-trivial reference string once, outside the sample loop
references = [
    normalize(ref_string.lower())
    for ref_string in benchmark_strings
    if ref_string not in trivial_strings
]

def decontaminate(training_data):
    for sample in training_data:
        normalized_sample = normalize(sample.lower())
        # Keep only samples that contain no benchmark reference string
        if not any(ref in normalized_sample for ref in references):
            yield sample

Key design decisions:

Case-insensitive matching reduces false negatives caused by capitalization differences
Whitespace normalization collapses formatting variations into a canonical form
Trivial pattern whitelist prevents removal of ubiquitous code idioms that happen to appear in benchmarks
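The three design decisions above can be sketched in runnable Python. This is a minimal illustration under stated assumptions, not the repository's implementation: the function names normalize, build_references, and decontaminate, the choice to collapse whitespace to single spaces, and all sample strings are hypothetical.

```python
from typing import Iterable, Iterator, List


def normalize(text: str) -> str:
    """Lowercase, then collapse every run of whitespace to one space."""
    return " ".join(text.lower().split())


def build_references(benchmark_strings: Iterable[str],
                     trivial_strings: Iterable[str]) -> List[str]:
    """Normalize benchmark strings once and drop whitelisted trivial ones."""
    trivial = {normalize(t) for t in trivial_strings}
    references = []
    for ref in benchmark_strings:
        normalized = normalize(ref)
        # Skip empty strings: "" is a substring of every sample.
        if normalized and normalized not in trivial:
            references.append(normalized)
    return references


def decontaminate(samples: Iterable[str],
                  references: List[str]) -> Iterator[str]:
    """Yield only samples that contain no normalized reference string."""
    for sample in samples:
        normalized = normalize(sample)
        if not any(ref in normalized for ref in references):
            yield sample


# Illustrative strings, not actual benchmark content.
benchmark_strings = ["Return the sum of two numbers x and y.", "return x + y"]
trivial_strings = ["return x + y"]

samples = [
    # Contains only the whitelisted idiom, so it is kept.
    "def add(x, y):\n    return x + y",
    # Contains the docstring with different case and line breaks, so it is dropped.
    'def add(x, y):\n    """Return the  SUM of two\n    numbers x and y."""\n    return x + y',
    "print('hello')",
]

references = build_references(benchmark_strings, trivial_strings)
clean = list(decontaminate(samples, references))
```

Without the whitelist, the first sample would be removed as well, because the idiom return x + y also appears in the benchmark strings.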

Related Pages

Implementation:Princeton_nlp_SimPO_Decontaminate_Humaneval
