Workflow: Cleanlab CleanLearning Robust Training
| Knowledge Sources | |
|---|---|
| Domains | Data_Centric_AI, Classification, Robust_Training |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
End-to-end process for training a robust classifier on noisy labeled data using cleanlab's CleanLearning wrapper.
Description
This workflow uses cleanlab's CleanLearning class to wrap any scikit-learn-compatible classifier and automate the entire label-cleaning pipeline: cross-validation for out-of-sample predictions, label issue detection, removal of mislabeled examples, and retraining on the cleaned dataset. CleanLearning extends sklearn's BaseEstimator interface, so it integrates seamlessly into existing ML pipelines. The result is a model whose performance approaches that of one trained on correctly labeled data, without requiring manual data cleaning.
Usage
Execute this workflow when you have a classification dataset with potentially noisy labels and want to train a model that is robust to label errors with minimal effort. This is appropriate when you want a single high-level API that handles the entire process (detect issues, clean data, retrain) rather than manually orchestrating the low-level count/filter/rank pipeline. Your classifier must follow the scikit-learn estimator API (fit, predict, predict_proba, score). For non-sklearn models, use adapter libraries like skorch (PyTorch) or SciKeras (Keras).
Execution Steps
Step 1: Prepare Classifier and Data
Select an sklearn-compatible classifier and prepare your feature matrix X and noisy label array y. The classifier must implement fit, predict, predict_proba, and score methods. Ensure the classifier is properly clonable via sklearn.base.clone, as CleanLearning creates multiple instances internally during cross-validation.
Key considerations:
- Labels must be integers in 0, 1, ..., K-1 where K is the number of classes
- The classifier should support sample_weight in its fit method for optimal results (optional but recommended)
- For PyTorch models, use skorch to wrap them as sklearn estimators
- For Keras models, use SciKeras to wrap them as sklearn estimators
- Neural network weights should be initialized inside fit(), not __init__()
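The checks above can be exercised with plain scikit-learn before involving cleanlab; the RandomForestClassifier and synthetic data here are arbitrary choices for illustration:

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Feature matrix X and integer labels in 0..K-1
X, y = make_classification(n_samples=300, n_classes=3, n_informative=4, random_state=0)
labels = y.astype(int)

clf = RandomForestClassifier(random_state=0)

# CleanLearning clones the estimator internally during cross-validation,
# so the classifier must survive sklearn.base.clone
clf2 = clone(clf)

# The four methods CleanLearning relies on
assert all(hasattr(clf, m) for m in ("fit", "predict", "predict_proba", "score"))
```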
Step 2: Initialize CleanLearning
Create a CleanLearning instance by passing your base classifier. Optionally configure parameters such as the cross-validation strategy (cv_n_folds), the filtering method for label issue detection, and the label quality scoring method.
Key considerations:
- Default cross-validation uses 5-fold stratified splitting
- The seed parameter controls reproducibility of cross-validation splits
- Verbose mode provides progress information during the pipeline
- The find_label_issues_kwargs parameter allows fine-tuning of the issue detection stage
Step 3: Find Label Issues
Call find_label_issues with your data and labels to identify mislabeled examples. Internally, this runs cross-validation to produce out-of-sample predicted probabilities, estimates the confident joint, and applies the configured filtering strategy. The call returns a DataFrame with per-example label quality scores and issue flags.
Key considerations:
- You can optionally pass pre-computed pred_probs to skip cross-validation
- Per-class probability thresholds or a pre-computed noise matrix can also be supplied via keyword arguments
- The returned DataFrame contains columns for predicted labels, label quality scores, and issue indicators
- This step does not modify the model or data; it only identifies issues
Step 4: Fit on Cleaned Data
Call fit with your data and labels. This method internally runs find_label_issues (if not already done), removes the detected mislabeled examples from the training set, and retrains the classifier on the cleaned subset. The resulting model should perform better than one trained on the full noisy dataset.
Key considerations:
- fit accepts pre-computed issues via its label_issues argument; after the flagged examples are pruned, the remaining examples are reweighted per class (via sample_weight, when the classifier supports it), and the boolean mask of removed examples is stored in the label_issues_mask attribute
- The confident joint and noise matrices are stored as attributes after fitting
- The cleaned model is accessible via standard sklearn predict/predict_proba/score methods
- Dataset-level statistics (number of issues, noise rates) are stored for inspection
Step 5: Evaluate the Robust Model
Use the trained CleanLearning model to make predictions on test data. Compare performance against a baseline model trained on the uncleaned data. Inspect stored attributes like the confident joint, noise matrices, and per-example label quality scores to understand the noise structure in your data.
Key considerations:
- Use predict and predict_proba for inference, same as any sklearn estimator
- The confident_joint attribute reveals the estimated noise structure
- The label_issues_df attribute contains detailed per-example diagnostics
- Compare accuracy, F1, and other metrics against a baseline to quantify improvement