
Principle:Cleanlab Class Level Quality Ranking

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Data_Quality
Last Updated 2026-02-09 19:00 GMT

Overview

Method for ranking all classes in a dataset by their label quality to identify which classes suffer from the most systematic labeling errors.

Description

Class-level quality ranking computes per-class label quality metrics from the joint distribution matrix. For each class, it estimates several quantities that characterize different aspects of label quality:

  • Label issues -- the number of examples labeled as this class that are estimated to actually belong to a different class (false positives for this class).
  • Inverse label issues -- the number of examples that truly belong to this class but are labeled as a different class (false negatives for this class).
  • Label noise -- the fraction of examples labeled as this class that are estimated to be incorrect.
  • Inverse label noise -- the fraction of true members of this class that received an incorrect label.
  • Label quality score -- a summary quality metric equal to 1 minus the label noise rate.

Classes are ranked by quality score in ascending order so the most problematic classes appear first, enabling dataset curators to focus annotation correction efforts on the classes that will yield the most improvement.

Usage

Use to identify which classes in your dataset have the most labeling problems, so you can focus annotation correction efforts on the most impactful classes. This is particularly valuable for datasets with many classes where some classes may be inherently more ambiguous or harder to annotate correctly.

Theoretical Basis

Given the estimated joint distribution J of shape (K, K) where J[i][j] represents the proportion of examples with given label i and true label j:

Label issues for class k:

label_issues[k] = sum(J[k, :]) - J[k, k]

This is the estimated proportion of examples labeled as class k whose true label is a different class; multiplying by the dataset size N converts it to a count of label issues.

Inverse label issues for class k:

inverse_label_issues[k] = sum(J[:, k]) - J[k, k]

This is the estimated proportion of examples whose true label is class k but that were given a different label; multiplying by the dataset size N converts it to a count.

Label noise for class k:

label_noise[k] = label_issues[k] / sum(J[k, :])

The fraction of examples labeled as class k that are estimated to be mislabeled.

Inverse label noise for class k:

inverse_label_noise[k] = inverse_label_issues[k] / sum(J[:, k])

The fraction of true class k examples that received an incorrect label.

Label quality score for class k:

quality_score[k] = 1 - label_noise[k] = J[k, k] / sum(J[k, :])

Classes are sorted by quality_score ascending, so the most problematic classes appear first.
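The formulas above can be sketched in a few lines of Python. The joint matrix below is a hypothetical 3-class example (not real data), and the function names are illustrative, not part of any library API:

```python
# Class-level label quality ranking from an estimated joint distribution
# J (K x K), where J[i][j] is the proportion of examples with given
# label i and estimated true label j. Hypothetical example matrix:
J = [
    [0.30, 0.03, 0.02],  # class 0: mostly correct
    [0.01, 0.20, 0.09],  # class 1: often confused with class 2
    [0.02, 0.03, 0.30],  # class 2: mostly correct
]

def class_quality(J):
    K = len(J)
    rows = [sum(J[k]) for k in range(K)]                    # P(given label = k)
    cols = [sum(J[i][k] for i in range(K)) for k in range(K)]  # P(true label = k)
    report = []
    for k in range(K):
        label_issues = rows[k] - J[k][k]       # labeled k, truly another class
        inv_label_issues = cols[k] - J[k][k]   # truly k, labeled another class
        label_noise = label_issues / rows[k]
        inv_label_noise = inv_label_issues / cols[k]
        quality = 1 - label_noise              # = J[k][k] / rows[k]
        report.append({
            "class": k,
            "label_issues": label_issues,
            "inverse_label_issues": inv_label_issues,
            "label_noise": label_noise,
            "inverse_label_noise": inv_label_noise,
            "quality_score": quality,
        })
    # Sort ascending by quality score: most problematic classes first.
    return sorted(report, key=lambda r: r["quality_score"])

for r in class_quality(J):
    print(r["class"], round(r["quality_score"], 3))
```

Here class 1 ranks first (quality 0.20 / 0.30 ≈ 0.667) because much of its row mass sits off the diagonal, flagging it as the class most in need of annotation review.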

Related Pages

Implemented By
