Principle:ChenghaoMou Text dedup Benchmark Evaluation

From Leeroopedia
Knowledge Sources
Domains: Evaluation, Deduplication
Last Updated: 2026-02-14 21:00 GMT

Overview

A systematic evaluation methodology that measures deduplication algorithm quality using pairwise precision/recall/F1 on labeled datasets and clustering quality via adjusted Rand index.

Description

Benchmark Evaluation provides a rigorous framework for comparing deduplication algorithms against ground truth. Two evaluation methodologies are used:

CORE dataset evaluation (pairwise): For each document, the ground truth specifies which other documents are its duplicates. The algorithm's predictions are compared against it document by document, and each document receives one of four labels: True Positive (TP) if it has ground-truth duplicates and the predicted duplicates contain all of them; False Negative (FN) if it has ground-truth duplicates that the predictions fail to cover; False Positive (FP) if it has no ground-truth duplicates but some are predicted; True Negative (TN) if it has no ground-truth duplicates and none are predicted. Precision, recall, and macro F1 are computed from the counts of these labels.

NEWS-COPY dataset evaluation (clustering): Ground truth provides cluster labels. The Adjusted Rand Index (ARI) measures agreement between predicted and ground truth clusterings, adjusted for chance.

Usage

Use this principle when evaluating deduplication algorithm quality on labeled benchmark datasets.

Theoretical Basis

Pairwise classification for CORE:

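# Abstract evaluation logic (NOT the real implementation)
def classify(gt_dups: set, pred_dups: set) -> str:
    """Label one document given its ground-truth and predicted duplicate sets."""
    if gt_dups:
        # TP: has dups AND the predictions contain all of them
        # FN: has dups BUT at least one was missed
        return "TP" if gt_dups <= pred_dups else "FN"
    # FP: no dups BUT predicted some
    # TN: no dups AND predicted none
    return "FP" if pred_dups else "TN"

def evaluate(ground_truth: dict, predictions: dict) -> dict:
    """Map each document ID to its TP/FP/FN/TN label."""
    return {doc: classify(set(gt), set(predictions.get(doc, ())))
            for doc, gt in ground_truth.items()}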
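
The per-document labels can then be aggregated into the reported metrics. The following is a minimal sketch, not the project's actual code: the function name `summarize` is hypothetical, and it assumes that macro F1 is the unweighted mean of the F1 scores of the duplicate class and the non-duplicate class.

```python
from collections import Counter


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def _f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def summarize(labels):
    """Aggregate per-document TP/FP/FN/TN labels into summary metrics.

    Assumption: macro F1 averages the F1 of the duplicate class and
    the F1 of the non-duplicate class with equal weight.
    """
    counts = Counter(labels)
    tp, fp, fn, tn = (counts[k] for k in ("TP", "FP", "FN", "TN"))

    # Duplicate class: positives are documents that have duplicates.
    dup_precision = _ratio(tp, tp + fp)
    dup_recall = _ratio(tp, tp + fn)

    # Non-duplicate class: positives are documents without duplicates.
    non_precision = _ratio(tn, tn + fn)
    non_recall = _ratio(tn, tn + fp)

    macro_f1 = (_f1(dup_precision, dup_recall)
                + _f1(non_precision, non_recall)) / 2
    return {
        "precision": dup_precision,
        "recall": dup_recall,
        "macro_f1": macro_f1,
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
    }


# Example: 3 TP, 1 FP, 1 FN, 5 TN
metrics = summarize(["TP"] * 3 + ["FP", "FN"] + ["TN"] * 5)
print(metrics)
```

Counting the non-duplicate class separately matters on datasets where most documents have no duplicates, since a method that predicts nothing would otherwise look deceptively accurate.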

Adjusted Rand Index:

$$\mathrm{ARI} = \frac{\mathrm{RI} - \mathbb{E}[\mathrm{RI}]}{\max(\mathrm{RI}) - \mathbb{E}[\mathrm{RI}]}$$

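Here RI is the Rand Index measuring agreement between two clusterings, E[RI] is its expected value under random cluster assignment, and max(RI) is its maximum attainable value. ARI equals 1 for identical clusterings and is close to 0 for agreement no better than chance.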
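
As an illustration, the ARI can be computed with the standard library alone using the common pair-counting form, in which the index counts document pairs placed together in both clusterings. This is a sketch under that assumption, and the function name `adjusted_rand_index` is hypothetical. In practice `sklearn.metrics.adjusted_rand_score` provides the same measure.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Pair-counting ARI between two flat clusterings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: nothing to disagree on

    # Pairs co-clustered in both partitions (from the contingency table).
    pairs_both = sum(comb(c, 2)
                     for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs co-clustered within each partition on its own.
    pairs_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    pairs_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = pairs_true * pairs_pred / total_pairs  # E[RI] under chance
    max_index = (pairs_true + pairs_pred) / 2         # max(RI)

    if max_index == expected:
        # Degenerate case: both clusterings are all singletons or both
        # are a single cluster; treated here as perfect agreement.
        return 1.0
    return (pairs_both - expected) / (max_index - expected)


# Cluster IDs are arbitrary: relabeling does not change the score.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # → 1.0
```

Because the score depends only on which documents share a cluster, predicted cluster IDs do not need to match the ground-truth label values.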

Related Pages

Implemented By
