Implementation: Cleanlab get_label_quality_scores
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Data_Quality |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
A concrete tool from the Cleanlab library for computing per-example label quality scores that quantify how likely each given label is to be correct.
Description
This function takes noisy labels and out-of-sample predicted probabilities and returns a numeric quality score for each example. The score is between 0 and 1, where lower values indicate labels that are more likely to be incorrect. Three scoring methods are available via the method parameter: self_confidence (predicted probability of the given label), normalized_margin (gap between given label probability and the next best class), and confidence_weighted_entropy (uncertainty-weighted confidence). An optional adjust_pred_probs parameter can be used to modify the predicted probabilities to account for class imbalance before scoring.
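The intuition behind the first two methods can be sketched with plain NumPy. This is an illustrative toy example (the data and the margin formula here are assumptions, not the library's exact implementation; in particular, cleanlab rescales its margin into [0, 1]):

```python
import numpy as np

# Toy data (illustrative only): 2 examples, 2 classes
labels = np.array([0, 1])
pred_probs = np.array([
    [0.9, 0.1],  # given label 0; model agrees -> high quality score
    [0.7, 0.3],  # given label 1; model prefers class 0 -> low quality score
])

# self_confidence: the predicted probability assigned to the given label
self_confidence = pred_probs[np.arange(len(labels)), labels]
print(self_confidence)  # [0.9 0.3]

# Raw margin between the given label and the best competing class;
# cleanlab's "normalized_margin" rescales a quantity like this into [0, 1]
masked = pred_probs.copy()
masked[np.arange(len(labels)), labels] = -np.inf
margin = self_confidence - masked.max(axis=1)
print(margin)  # [ 0.8 -0.4]
```

Both quantities drop sharply for the second example, where the model disagrees with the given label.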
Usage
Import and use this function when you need continuous quality scores for all examples in your dataset. This is useful for ranking examples by label quality, setting custom thresholds for flagging issues, or providing scores to downstream functions like order_label_issues. This function is commonly used after or alongside find_label_issues to provide complementary information.
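Because the scores are continuous, setting a custom threshold is a one-liner. A minimal sketch, using a hypothetical scores array standing in for the function's output:

```python
import numpy as np

# Hypothetical scores, as would be returned by get_label_quality_scores
scores = np.array([0.9, 0.2, 0.8, 0.1, 0.8, 0.9])

# Flag every example whose quality score falls below a chosen threshold
threshold = 0.5
flagged = np.where(scores < threshold)[0]
print(flagged)  # indices of examples to review
```

The threshold value is a project-specific choice; lower thresholds flag fewer, more suspicious examples.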
Code Reference
Source Location
- Repository: cleanlab
- File: cleanlab/rank.py
- Lines: 33-117
Signature
def get_label_quality_scores(
labels: np.ndarray,
pred_probs: np.ndarray,
*,
method: str = "self_confidence",
adjust_pred_probs: bool = False,
) -> np.ndarray
Import
from cleanlab.rank import get_label_quality_scores
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| labels | np.ndarray | Yes | Array of noisy class labels of shape (N,) with integer values 0..K-1. |
| pred_probs | np.ndarray | Yes | Out-of-sample predicted probability matrix of shape (N, K). Each row sums to 1. |
| method | str | No | Scoring method to use. One of "self_confidence" (default), "normalized_margin", or "confidence_weighted_entropy". |
| adjust_pred_probs | bool | No | If True, adjust predicted probabilities to account for class imbalance before computing scores. Defaults to False. |
Outputs
| Name | Type | Description |
|---|---|---|
| label_quality_scores | np.ndarray | Array of shape (N,) with quality scores between 0 and 1 for each example. Lower scores indicate labels more likely to be incorrect. |
Usage Examples
Basic Usage
import numpy as np
from cleanlab.rank import get_label_quality_scores
labels = np.array([0, 0, 1, 1, 2, 2])
pred_probs = np.array([
[0.9, 0.05, 0.05],
[0.2, 0.7, 0.1], # labeled 0 but model thinks 1
[0.1, 0.8, 0.1],
[0.05, 0.1, 0.85], # labeled 1 but model thinks 2
[0.1, 0.1, 0.8],
[0.05, 0.05, 0.9],
])
# Default: self_confidence
scores = get_label_quality_scores(labels, pred_probs)
print("Quality scores:", scores)
# Example output: [0.9, 0.2, 0.8, 0.1, 0.8, 0.9]
# Lower scores for examples 1 and 3 (likely mislabeled)
Comparing Scoring Methods
from cleanlab.rank import get_label_quality_scores
# labels and pred_probs as defined in the Basic Usage example above
# Self-confidence: P(given_label | x)
scores_sc = get_label_quality_scores(labels, pred_probs, method="self_confidence")
# Normalized margin: P(given_label) - P(next best class)
scores_nm = get_label_quality_scores(labels, pred_probs, method="normalized_margin")
# Confidence-weighted entropy
scores_cwe = get_label_quality_scores(
labels, pred_probs, method="confidence_weighted_entropy"
)
# All methods rank examples similarly, but with different score distributions
for i in range(len(labels)):
print(f"Example {i}: SC={scores_sc[i]:.3f}, NM={scores_nm[i]:.3f}, CWE={scores_cwe[i]:.3f}")
Identifying Worst Labels
import numpy as np
from cleanlab.rank import get_label_quality_scores
# labels and pred_probs as defined in the Basic Usage example above
scores = get_label_quality_scores(labels, pred_probs)
# Get the 3 worst-scoring examples
worst_indices = np.argsort(scores)[:3]
print("Worst labels at indices:", worst_indices)
print("Their scores:", scores[worst_indices])
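The scores also combine naturally with a boolean issue mask, such as one produced by find_label_issues. The sketch below uses a hypothetical mask and scores and ranks the flagged examples worst-first, which is conceptually what downstream helpers like order_label_issues provide:

```python
import numpy as np

# Hypothetical inputs: a boolean issue mask (e.g. from find_label_issues)
# and quality scores (e.g. from get_label_quality_scores)
issue_mask = np.array([False, True, False, True, False, False])
scores = np.array([0.9, 0.2, 0.8, 0.1, 0.8, 0.9])

# Restrict to flagged examples, then sort them from worst to best score
issue_indices = np.flatnonzero(issue_mask)
ranked = issue_indices[np.argsort(scores[issue_indices])]
print(ranked)  # flagged examples, worst label quality first
```

Reviewing flagged examples in this order front-loads the labels most likely to be wrong.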