Implementation:Cleanlab Cleanlab Segmentation Get Label Quality Scores

Knowledge Sources	Cleanlab
Domains	Data Quality, Machine Learning, Computer Vision, Semantic Segmentation
Last Updated	2026-02-09 00:00 GMT

Overview

Computes per-image and per-pixel label quality scores for semantic segmentation datasets, and provides a utility to convert scores into binary issue masks.

Description

The cleanlab/segmentation/rank.py module contains two public functions:

get_label_quality_scores computes continuous quality scores at both the image level and the pixel level. It supports two scoring methods. The default "softmin" method takes, for each pixel, the predicted probability of that pixel's given class and aggregates these per-pixel scores into an image-level score with a temperature-controlled softmin function. The "num_pixel_issues" method delegates to find_label_issues to flag pixel-level issues and sets each image score to the fraction of pixels that were not flagged.

issues_from_scores converts the scores produced by get_label_quality_scores into binary issue masks by applying a user-specified threshold. Pixels with quality scores below the threshold are marked as issues. If only image-level scores are provided (pixel_scores is None), it instead returns the indices of images whose scores fall below the threshold, sorted by score.
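The thresholding behavior described above can be sketched in plain NumPy. This is a minimal illustration, not cleanlab's implementation: threshold_scores is a hypothetical helper, the arrays are toy values, and the ascending sort order for the image-level case is an assumption.

```python
import numpy as np

def threshold_scores(image_scores, pixel_scores=None, threshold=0.1):
    """Hypothetical stand-in for the thresholding step; not cleanlab's code."""
    if pixel_scores is not None:
        # Boolean mask: True where a pixel's quality score is below the cutoff.
        return pixel_scores < threshold
    # Image-level fallback: indices of low-scoring images, lowest score first
    # (ascending order is an assumption of this sketch).
    order = np.argsort(image_scores)
    return order[image_scores[order] < threshold]

# Toy scores for N=4 images of size 1x2.
image_scores = np.array([0.90, 0.05, 0.40, 0.08])
pixel_scores = np.array([
    [[0.95, 0.85]],
    [[0.02, 0.08]],
    [[0.50, 0.30]],
    [[0.09, 0.07]],
])

pixel_mask = threshold_scores(image_scores, pixel_scores, threshold=0.1)
flagged_images = threshold_scores(image_scores, threshold=0.1)
print(pixel_mask.shape)   # same shape as pixel_scores
print(flagged_images)     # indices of images scoring below 0.1
```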

Usage

Import get_label_quality_scores when you need continuous quality scores for ranking and prioritizing which images or pixels to review. Import issues_from_scores when you want to convert those scores into a binary issue mask using a custom threshold, especially for use with visualization utilities like display_issues.

Code Reference

Source Location

Repository: Cleanlab
File: cleanlab/segmentation/rank.py
Lines: 14-131 (get_label_quality_scores), 133-186 (issues_from_scores)

Signature (get_label_quality_scores)

def get_label_quality_scores(
    labels: np.ndarray,
    pred_probs: np.ndarray,
    *,
    method: str = "softmin",
    batch_size: Optional[int] = None,
    n_jobs: Optional[int] = None,
    verbose: bool = True,
    **kwargs,
) -> Tuple[np.ndarray, np.ndarray]:

Signature (issues_from_scores)

def issues_from_scores(
    image_scores: np.ndarray,
    pixel_scores: Optional[np.ndarray] = None,
    threshold: float = 0.1,
) -> np.ndarray:

Import

from cleanlab.segmentation.rank import get_label_quality_scores, issues_from_scores

I/O Contract

Inputs (get_label_quality_scores)

Name	Type	Required	Description
labels	np.ndarray	Yes	Discrete array of shape (N, H, W) of integer class labels for each pixel, with values in 0, 1, ..., K-1.
pred_probs	np.ndarray	Yes	Array of shape (N, K, H, W) of model-predicted class probabilities for each pixel.
method	str	No	Scoring method: "softmin" (default) extracts per-pixel predicted probabilities and aggregates with softmin; "num_pixel_issues" counts detected issues per image via find_label_issues.
batch_size	Optional[int]	No	Mini-batch size for the "num_pixel_issues" method. No effect on "softmin".
n_jobs	Optional[int]	No	Number of processes for multiprocessing (Linux only). Only used with "num_pixel_issues".
verbose	bool	No	If True (default), displays progress bars.
temperature	float (via kwargs)	No	Temperature parameter for the softmin aggregation. Default is 0.1. Lower values emphasize the worst pixel; higher values approach the mean.
downsample	int (via kwargs)	No	Downsampling factor for "num_pixel_issues" method. Default is 1.
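The shape requirements in the table above can be checked before scoring. The sketch below uses a hypothetical helper, check_segmentation_inputs, which is not part of cleanlab; the requirement that probabilities sum to 1 across the class axis is an assumption of this example, not something the table states.

```python
import numpy as np

def check_segmentation_inputs(labels, pred_probs):
    """Hypothetical validation helper; not part of cleanlab."""
    if labels.ndim != 3:
        raise ValueError("labels must have shape (N, H, W)")
    if pred_probs.ndim != 4:
        raise ValueError("pred_probs must have shape (N, K, H, W)")
    n, k, h, w = pred_probs.shape
    if labels.shape != (n, h, w):
        raise ValueError("labels and pred_probs disagree on N, H, or W")
    if not np.issubdtype(labels.dtype, np.integer):
        raise ValueError("labels must be integer class indices")
    if labels.min() < 0 or labels.max() >= k:
        raise ValueError("labels must lie in 0, 1, ..., K-1")
    # Assumption of this sketch: class probabilities sum to 1 at every pixel.
    if not np.allclose(pred_probs.sum(axis=1), 1.0, atol=1e-4):
        raise ValueError("pred_probs should sum to 1 over the class axis")
    return n, k, h, w

rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=(2, 4, 4))
# dirichlet returns (N, H, W, K); move the class axis to position 1.
pred_probs = rng.dirichlet([1, 1, 1], size=(2, 4, 4)).transpose(0, 3, 1, 2)
shape = check_segmentation_inputs(labels, pred_probs)
print(shape)
```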

Outputs (get_label_quality_scores)

Name	Type	Description
image_scores	np.ndarray	Array of shape (N,) with scores between 0 and 1, one per image. Lower scores indicate images more likely to contain label issues.
pixel_scores	np.ndarray	Array of shape (N, H, W) with scores between 0 and 1, one per pixel. Lower scores indicate pixels more likely to be mislabeled.
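Because lower scores indicate likelier label issues, the two outputs can be used directly to build a review queue. The sketch below uses hand-written toy scores in place of real output from get_label_quality_scores.

```python
import numpy as np

# Toy stand-ins for the two returned arrays (N=3 images of size 2x2).
image_scores = np.array([0.82, 0.15, 0.47])
pixel_scores = np.array([
    [[0.90, 0.80], [0.70, 0.90]],
    [[0.10, 0.60], [0.05, 0.30]],
    [[0.50, 0.40], [0.60, 0.45]],
])

# Review images from lowest to highest quality score.
review_order = np.argsort(image_scores)

# Within the lowest-scoring image, locate the lowest-scoring pixel.
worst_image = review_order[0]
worst_flat = np.argmin(pixel_scores[worst_image])
worst_pixel = np.unravel_index(worst_flat, pixel_scores[worst_image].shape)

print(review_order)               # image indices, worst first
print(worst_image, worst_pixel)   # (row, col) of the worst pixel in that image
```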

Inputs (issues_from_scores)

Name	Type	Required	Description
image_scores	np.ndarray	Yes	Array of shape (N,) of per-image quality scores.
pixel_scores	Optional[np.ndarray]	No	Array of shape (N, H, W) of per-pixel quality scores. If provided, returns a boolean mask; otherwise, returns sorted image indices.
threshold	float	No	Quality score cutoff (default 0.1). Pixels or images with scores below this value are marked as issues.

Outputs (issues_from_scores)

Name	Type	Description
issues	np.ndarray	If pixel_scores is provided: boolean mask of shape (N, H, W) where True indicates an issue. If pixel_scores is None: array of integer indices of images with scores below the threshold, sorted by score.

Scoring Methods Detail

Softmin Method (Default)

For each image, the per-pixel score is the model's predicted probability for the given label class at that pixel: pixel_score[i, h, w] = pred_probs[i, labels[i, h, w], h, w]. The image-level score is computed by applying a softmin aggregation (inner product of pixel scores with softmax(1 - pixel_scores)) controlled by a temperature parameter. Lower temperature values cause the image score to be dominated by the worst pixel, while higher temperatures yield scores closer to the average.
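The two steps above can be sketched in NumPy as follows. This is an illustrative reimplementation under stated assumptions, not cleanlab's code: the helper names are hypothetical, and the sketch assumes the temperature divides the softmax inputs, i.e. the weights are softmax((1 - pixel_scores) / temperature).

```python
import numpy as np

def pixel_scores_from_probs(labels, pred_probs):
    """Pick pred_probs[i, labels[i, h, w], h, w] for every pixel (hypothetical helper)."""
    picked = np.take_along_axis(pred_probs, labels[:, None, :, :], axis=1)
    return picked[:, 0]  # drop the singleton class axis -> (N, H, W)

def softmin_image_score(pixel_scores, temperature=0.1):
    """Softmin aggregation of one image's pixel scores (hypothetical helper)."""
    flat = pixel_scores.ravel()
    # Assumption: temperature scales the softmax inputs.
    logits = (1.0 - flat) / temperature
    logits -= logits.max()  # shift for numerical stability
    weights = np.exp(logits)
    weights /= weights.sum()
    return float(np.dot(flat, weights))

# One image, K=2 classes, 1x2 pixels.
pred_probs = np.array([[[[0.8, 0.3]], [[0.2, 0.7]]]])  # shape (1, 2, 1, 2)
labels = np.array([[[0, 1]]])                           # shape (1, 1, 2)
scores = pixel_scores_from_probs(labels, pred_probs)

# Temperature effect: three well-labeled pixels and one poorly labeled pixel.
toy = np.array([0.9, 0.9, 0.9, 0.1])
low_t = softmin_image_score(toy, temperature=0.01)    # close to the worst pixel
mid_t = softmin_image_score(toy, temperature=0.1)
high_t = softmin_image_score(toy, temperature=100.0)  # close to the mean
print(scores, low_t, mid_t, high_t)
```

With a small temperature the weight concentrates on the lowest-scoring pixel, so a single bad pixel pulls the image score down; with a large temperature the weights become nearly uniform and the score approaches the mean pixel score.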

Num Pixel Issues Method

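Per-pixel scores are computed by masking pred_probs to extract the probability for each pixel's given class, so they are the same values as the pixel scores of the softmin method; only the image-level score differs. The image-level score is 1 - mean(issue_mask), where issue_mask is the boolean mask returned by find_label_issues and the mean is taken over the pixels of each image. Images with more flagged pixels receive lower image-level scores.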
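The masking step and the image-level formula can be sketched as follows. The helper names are hypothetical, and the issue mask is written by hand here; in the real method it would come from find_label_issues, which this sketch does not call.

```python
import numpy as np

def masked_pixel_scores(labels, pred_probs):
    """Zero out all classes except the given label, then sum over classes."""
    k = pred_probs.shape[1]
    # mask[i, c, h, w] is True exactly where c == labels[i, h, w].
    mask = np.arange(k)[None, :, None, None] == labels[:, None, :, :]
    return np.where(mask, pred_probs, 0.0).sum(axis=1)

def image_scores_from_issue_mask(issue_mask):
    """Image score = fraction of pixels in each image that were not flagged."""
    return 1.0 - issue_mask.mean(axis=(1, 2))

# One image, K=2 classes, 1x2 pixels.
pred_probs = np.array([[[[0.8, 0.3]], [[0.2, 0.7]]]])  # shape (1, 2, 1, 2)
labels = np.array([[[0, 1]]])                           # shape (1, 1, 2)
pixel_scores = masked_pixel_scores(labels, pred_probs)

# Hand-written stand-in for a find_label_issues result (N=2 images, 2x2 pixels).
issue_mask = np.array([
    [[True, False], [False, False]],   # 1 of 4 pixels flagged
    [[False, False], [False, False]],  # nothing flagged
])
image_scores = image_scores_from_issue_mask(issue_mask)
print(pixel_scores, image_scores)
```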

Usage Examples

Basic Usage

import numpy as np
from cleanlab.segmentation.rank import get_label_quality_scores

# N=5 images, K=3 classes, H=32, W=32
labels = np.random.randint(0, 3, size=(5, 32, 32))
pred_probs = np.random.dirichlet([1, 1, 1], size=(5, 32, 32))
pred_probs = np.transpose(pred_probs, (0, 3, 1, 2))

image_scores, pixel_scores = get_label_quality_scores(
    labels, pred_probs, verbose=False
)
print(f"Image scores shape: {image_scores.shape}")   # (5,)
print(f"Pixel scores shape: {pixel_scores.shape}")    # (5, 32, 32)

Converting Scores to Issues

from cleanlab.segmentation.rank import get_label_quality_scores, issues_from_scores

# labels and pred_probs are the arrays built in the Basic Usage example
image_scores, pixel_scores = get_label_quality_scores(
    labels, pred_probs, verbose=False
)

# Get boolean mask of pixel-level issues with threshold 0.1
issue_mask = issues_from_scores(image_scores, pixel_scores, threshold=0.1)
print(f"Total pixel issues: {issue_mask.sum()}")

# Get indices of problematic images (without pixel scores)
problem_images = issues_from_scores(image_scores, threshold=0.2)
print(f"Problem image indices: {problem_images}")

Related Pages

Principle:Cleanlab_Cleanlab_Segmentation_Label_Quality_Scoring
