Implementation: Cleanlab Rank Classes By Label Quality
| Knowledge Sources | Details |
|---|---|
| Domains | Machine_Learning, Data_Quality |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
Concrete tool, provided by the Cleanlab library, for ranking all classes in a dataset by their label quality in order to identify the most problematic classes.
Description
This function computes per-class label quality metrics from the joint distribution of (given_label, true_label) and returns a pandas DataFrame with all classes ranked by their label quality score in ascending order. For each class, it reports the number of label issues (examples mislabeled as this class), inverse label issues (true members of this class that were mislabeled), label noise rate, inverse label noise rate, and an overall quality score. The function accepts either labels and pred_probs (from which it computes the joint internally) or a pre-computed joint/confident_joint matrix.
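The per-class metrics described above follow directly from the joint matrix. A minimal numpy sketch of the arithmetic (the variable names and toy joint here are illustrative, not cleanlab internals):

```python
import numpy as np

# Toy joint distribution P(given_label = i, true_label = j) for 2 classes.
# Rows index the given (noisy) label; columns index the estimated true label.
joint = np.array([
    [0.35, 0.15],  # labeled 0: 0.35 truly class 0, 0.15 truly class 1
    [0.05, 0.45],  # labeled 1: 0.05 truly class 0, 0.45 truly class 1
])

# Label noise per class: fraction of examples *labeled* as class k that are wrong
# (off-diagonal mass in row k divided by the row sum).
label_noise = 1 - np.diag(joint) / joint.sum(axis=1)

# Inverse label noise per class: fraction of *true* members of class k that were
# mislabeled (off-diagonal mass in column k divided by the column sum).
inverse_label_noise = 1 - np.diag(joint) / joint.sum(axis=0)

# The overall quality score reported per class: 1 - label noise (higher is better).
quality_score = 1 - label_noise

print(label_noise)          # class 0: 0.15/0.50 = 0.30; class 1: 0.05/0.50 = 0.10
print(inverse_label_noise)  # class 0: 0.05/0.40 = 0.125; class 1: 0.15/0.60 = 0.25
```

With these numbers, class 0 would rank worst (quality score 0.70 vs. 0.90), so it would appear first in the returned DataFrame.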
Usage
Import and use this function when you need to understand which classes in your dataset are the most problematic in terms of label quality. The returned DataFrame allows you to quickly identify classes that need better annotation guidelines, additional annotator training, or focused review. This function is also called internally by health_summary to produce the classes_by_label_quality component.
Code Reference
Source Location
- Repository: cleanlab
- File: cleanlab/dataset.py
- Lines: 16-108
Signature
def rank_classes_by_label_quality(
labels=None,
pred_probs=None,
*,
class_names=None,
num_examples=None,
joint=None,
confident_joint=None,
multi_label=False,
) -> pd.DataFrame
Import
from cleanlab.dataset import rank_classes_by_label_quality
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| labels | Optional[LabelLike] | No | Array of noisy class labels of shape (N,). Required if joint and confident_joint are not provided. |
| pred_probs | Optional[np.ndarray] | No | Out-of-sample predicted probability matrix of shape (N, K). Required if joint and confident_joint are not provided. |
| class_names | Optional[Iterable[str]] | No | Human-readable names for the classes. If provided, the DataFrame index uses these names instead of integer indices. |
| num_examples | Optional[int] | No | Total number of examples in the dataset. Used when providing a pre-computed joint without labels. |
| joint | Optional[np.ndarray] | No | Pre-computed joint distribution matrix of shape (K, K). If provided, labels and pred_probs are not needed. |
| confident_joint | Optional[np.ndarray] | No | Pre-computed confident joint matrix. Used to compute the joint if joint is not directly provided. |
| multi_label | bool | No | If True, handle multi-label classification. Defaults to False. |
Outputs
| Name | Type | Description |
|---|---|---|
| class_quality_df | pd.DataFrame | DataFrame with columns: "Class Index" (int, the class identifier), "Label Issues" (int, count of examples mislabeled as this class), "Inverse Label Issues" (int, count of true members of this class that were mislabeled), "Label Noise" (float, fraction of examples labeled as this class that are wrong), "Inverse Label Noise" (float, fraction of true class members that were mislabeled), "Label Quality Score" (float, 1 minus label noise). Sorted by Label Quality Score ascending (worst classes first). |
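To make this contract concrete, here is a hypothetical DataFrame with the documented columns, showing the ascending sort and how to pull out the worst class. All values are made up for illustration; only the column names and ordering reflect the contract above.

```python
import pandas as pd

# Hypothetical result with the documented columns (values are illustrative;
# note Label Quality Score = 1 - Label Noise, sorted ascending).
df = pd.DataFrame({
    "Class Index": [2, 0, 1],
    "Label Issues": [3, 1, 0],
    "Inverse Label Issues": [1, 2, 1],
    "Label Noise": [0.30, 0.10, 0.00],
    "Inverse Label Noise": [0.12, 0.25, 0.08],
    "Label Quality Score": [0.70, 0.90, 1.00],
})

# Worst class first: the row with the lowest Label Quality Score.
worst = df.iloc[0]
print(int(worst["Class Index"]))  # -> 2

# Triage: classes whose quality falls below a review threshold.
needs_review = df[df["Label Quality Score"] < 0.95]["Class Index"].tolist()
print(needs_review)  # -> [2, 0]
```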
Usage Examples
Basic Usage
import numpy as np
from cleanlab.dataset import rank_classes_by_label_quality
labels = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
pred_probs = np.array([
[0.9, 0.05, 0.05],
[0.3, 0.6, 0.1], # labeled 0 but model thinks 1
[0.85, 0.1, 0.05],
[0.1, 0.8, 0.1],
[0.05, 0.1, 0.85], # labeled 1 but model thinks 2
[0.1, 0.7, 0.2],
[0.1, 0.1, 0.8],
[0.05, 0.05, 0.9],
[0.7, 0.1, 0.2], # labeled 2 but model thinks 0
[0.0, 0.15, 0.85],
])
df = rank_classes_by_label_quality(labels, pred_probs)
print(df)
# Classes ranked by quality (worst first)
# Shows which classes have the most label issues
With Class Names
from cleanlab.dataset import rank_classes_by_label_quality
df = rank_classes_by_label_quality(
labels, pred_probs,
class_names=["cat", "dog", "bird"],
)
print(df)
# DataFrame uses "cat", "dog", "bird" instead of 0, 1, 2
From Pre-Computed Confident Joint
from cleanlab.count import compute_confident_joint
from cleanlab.dataset import rank_classes_by_label_quality
cj = compute_confident_joint(labels, pred_probs)
df = rank_classes_by_label_quality(
confident_joint=cj,
num_examples=len(labels),
)
print(df)
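When only a confident joint is supplied, the joint distribution is derived from it. Roughly, this amounts to normalizing the count matrix so its entries sum to 1; cleanlab additionally calibrates the matrix against observed label counts when labels are available, a step omitted in this simplified numpy sketch (the count values are illustrative):

```python
import numpy as np

# A toy confident joint: counts of (given label, estimated true label) pairs
# that the confident-learning procedure is confident about (illustrative values).
confident_joint = np.array([
    [28,  2,  0],
    [ 1, 25,  4],
    [ 3,  0, 37],
])

# Simplified joint estimate: normalize the counts to a probability distribution.
# (cleanlab also calibrates row sums to the observed label counts when labels
# are available; that calibration is not shown here.)
joint_estimate = confident_joint / confident_joint.sum()

print(joint_estimate.sum())  # sums to 1 (up to float rounding)
```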