Implementation: Cleanlab Rank Classes By Label Quality
| Knowledge Sources | Details |
|---|---|
| Domains | Machine_Learning, Data_Quality |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
Concrete tool, provided by the Cleanlab library, for ranking all classes in a dataset by their label quality in order to identify the most problematic classes.
Description
This function computes per-class label quality metrics from the joint distribution of (given_label, true_label) and returns a pandas DataFrame with all classes ranked by their label quality score in ascending order. For each class, it reports the number of label issues (examples mislabeled as this class), inverse label issues (true members of this class that were mislabeled), label noise rate, inverse label noise rate, and an overall quality score. The function accepts either labels and pred_probs (from which it computes the joint internally) or a pre-computed joint/confident_joint matrix.
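The per-class metrics described above follow directly from the joint matrix. A minimal numpy sketch of the arithmetic (the variable names and toy joint here are illustrative, not cleanlab internals):

```python
import numpy as np

# Toy joint distribution P(given_label = i, true_label = j) for 2 classes.
# Rows index the given (noisy) label; columns index the estimated true label.
joint = np.array([
    [0.35, 0.15],  # labeled 0: 0.35 truly class 0, 0.15 truly class 1
    [0.05, 0.45],  # labeled 1: 0.05 truly class 0, 0.45 truly class 1
])

# Label noise per class: fraction of examples *labeled* as class k that are wrong
# (off-diagonal mass in row k divided by the row sum).
label_noise = 1 - np.diag(joint) / joint.sum(axis=1)

# Inverse label noise per class: fraction of *true* members of class k that were
# mislabeled (off-diagonal mass in column k divided by the column sum).
inverse_label_noise = 1 - np.diag(joint) / joint.sum(axis=0)

# The overall quality score reported per class: 1 - label noise (higher is better).
quality_score = 1 - label_noise

print(label_noise)          # class 0: 0.15/0.50 = 0.30; class 1: 0.05/0.50 = 0.10
print(inverse_label_noise)  # class 0: 0.05/0.40 = 0.125; class 1: 0.15/0.60 = 0.25
```

With these numbers, class 0 would rank worst (quality score 0.70 vs. 0.90), so it would appear first in the returned DataFrame.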
Usage
Import and use this function when you need to understand which classes in your dataset are the most problematic in terms of label quality. The returned DataFrame allows you to quickly identify classes that need better annotation guidelines, additional annotator training, or focused review. This function is also called internally by health_summary to produce the classes_by_label_quality component.
Code Reference
Source Location
- Repository: cleanlab
- File: cleanlab/dataset.py
- Lines: 16-108
Signature
def rank_classes_by_label_quality(
labels=None,
pred_probs=None,
*,
class_names=None,
num_examples=None,
joint=None,
confident_joint=None,
multi_label=False,
) -> pd.DataFrame
Import
from cleanlab.dataset import rank_classes_by_label_quality
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| labels | Optional[LabelLike] | No | Array of noisy class labels of shape (N,). Required if joint and confident_joint are not provided. |
| pred_probs | Optional[np.ndarray] | No | Out-of-sample predicted probability matrix of shape (N, K). Required if joint and confident_joint are not provided. |
| class_names | Optional[Iterable[str]] | No | Human-readable names for the classes. If provided, the DataFrame index uses these names instead of integer indices. |
| num_examples | Optional[int] | No | Total number of examples in the dataset. Used when providing a pre-computed joint without labels. |
| joint | Optional[np.ndarray] | No | Pre-computed joint distribution matrix of shape (K, K). If provided, labels and pred_probs are not needed. |
| confident_joint | Optional[np.ndarray] | No | Pre-computed confident joint matrix. Used to compute the joint if joint is not directly provided. |
| multi_label | bool | No | If True, handle multi-label classification. Defaults to False. |
Outputs
| Name | Type | Description |
|---|---|---|
| class_quality_df | pd.DataFrame | DataFrame with columns: "Class Index" (int, the class identifier), "Label Issues" (int, count of examples mislabeled as this class), "Inverse Label Issues" (int, count of true members of this class that were mislabeled), "Label Noise" (float, fraction of examples labeled as this class that are wrong), "Inverse Label Noise" (float, fraction of true class members that were mislabeled), "Label Quality Score" (float, 1 minus label noise). Sorted by Label Quality Score ascending (worst classes first). |
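To make this contract concrete, here is a hypothetical DataFrame with the documented columns, showing the ascending sort and how to pull out the worst class. All values are made up for illustration; only the column names and ordering reflect the contract above.

```python
import pandas as pd

# Hypothetical result with the documented columns (values are illustrative;
# note Label Quality Score = 1 - Label Noise, sorted ascending).
df = pd.DataFrame({
    "Class Index": [2, 0, 1],
    "Label Issues": [3, 1, 0],
    "Inverse Label Issues": [1, 2, 1],
    "Label Noise": [0.30, 0.10, 0.00],
    "Inverse Label Noise": [0.12, 0.25, 0.08],
    "Label Quality Score": [0.70, 0.90, 1.00],
})

# Worst class first: the row with the lowest Label Quality Score.
worst = df.iloc[0]
print(int(worst["Class Index"]))  # -> 2

# Triage: classes whose quality falls below a review threshold.
needs_review = df[df["Label Quality Score"] < 0.95]["Class Index"].tolist()
print(needs_review)  # -> [2, 0]
```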
Usage Examples
Basic Usage
import numpy as np
from cleanlab.dataset import rank_classes_by_label_quality
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
pred_probs = np.array([
[0.9, 0.05, 0.05],
[0.3, 0.6, 0.1], # labeled 0 but model thinks 1
[0.85, 0.1, 0.05],
[0.1, 0.8, 0.1],
[0.05, 0.1, 0.85], # labeled 1 but model thinks 2
[0.1, 0.7, 0.2],
[0.1, 0.1, 0.8],
[0.05, 0.05, 0.9],
[0.7, 0.1, 0.2], # labeled 2 but model thinks 0
[0.0, 0.15, 0.85],
])
df = rank_classes_by_label_quality(labels, pred_probs)
print(df)
# Classes ranked by quality (worst first)
# Shows which classes have the most label issues
With Class Names
from cleanlab.dataset import rank_classes_by_label_quality
df = rank_classes_by_label_quality(
labels, pred_probs,
class_names=["cat", "dog", "bird"],
)
print(df)
# DataFrame uses "cat", "dog", "bird" instead of 0, 1, 2
From Pre-Computed Confident Joint
from cleanlab.count import compute_confident_joint
from cleanlab.dataset import rank_classes_by_label_quality
cj = compute_confident_joint(labels, pred_probs)
df = rank_classes_by_label_quality(
confident_joint=cj,
num_examples=len(labels),
)
print(df)
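When only a confident joint is supplied, the joint distribution is derived from it. Roughly, this amounts to normalizing the count matrix so its entries sum to 1; cleanlab additionally calibrates the matrix against observed label counts when labels are available, a step omitted in this simplified numpy sketch (the count values are illustrative):

```python
import numpy as np

# A toy confident joint: counts of (given label, estimated true label) pairs
# that the confident-learning procedure is confident about (illustrative values).
confident_joint = np.array([
    [28,  2,  0],
    [ 1, 25,  4],
    [ 3,  0, 37],
])

# Simplified joint estimate: normalize the counts to a probability distribution.
# (cleanlab also calibrates row sums to the observed label counts when labels
# are available; that calibration is not shown here.)
joint_estimate = confident_joint / confident_joint.sum()

print(joint_estimate.sum())  # sums to 1 (up to float rounding)
```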