Workflow: Cleanlab Object Detection Label Quality
| Knowledge Sources | |
|---|---|
| Domains | Data_Centric_AI, Object_Detection, Label_Quality |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
End-to-end process for detecting label issues in object detection datasets using cleanlab's ObjectLab method.
Description
This workflow applies cleanlab's object detection module to find label errors in datasets with bounding box annotations. It detects three types of annotation errors: swapped labels (correct box, wrong class), overlooked objects (objects present in the image but missing from annotations), and bad localization (correct class but poorly drawn bounding box). The method computes per-image and per-box label quality scores using IoU-based similarity matching between ground-truth annotations and model predictions, then aggregates scores using a weighted combination of error types.
Usage
Execute this workflow when you have an object detection dataset with bounding box annotations and predictions from a trained object detection model. Each image has a set of ground-truth bounding boxes with class labels, and the model produces predicted bounding boxes with class probabilities. This workflow is appropriate for auditing annotation quality in datasets used for training YOLO, Faster R-CNN, or similar object detection architectures.
Execution Steps
Step 1: Prepare Labels and Predictions
Format your ground-truth annotations and model predictions into the expected list-based structures. Each image's labels should be a dictionary with bounding boxes and class labels. Each image's predictions should be an array with one entry per class, holding the bounding boxes and confidence scores the model predicted for that class. Ensure consistent class indexing between labels and predictions.
Key considerations:
- Labels format: list of dicts, each with "bboxes" (array of [x1,y1,x2,y2]) and "labels" (array of class indices)
- Predictions format: list of length-K object arrays (K = number of classes), where predictions[i][k] has shape (M, 5) and holds the boxes predicted for class k in image i, each row [x1, y1, x2, y2, confidence]
- Bounding box coordinates should be in the same coordinate system (pixel coordinates)
- Predictions should be from a model not overfit to the training data (use cross-validation or a held-out model)
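As a concrete sketch, a two-image dataset could be assembled as follows. The shapes follow cleanlab's documented input format; the boxes, class count, and helper function here are made-up illustrations:

```python
import numpy as np

NUM_CLASSES = 3  # hypothetical dataset with 3 classes

def per_class_boxes(boxes_by_class):
    """Pack per-class box lists into the length-K object array cleanlab expects."""
    arr = np.empty(len(boxes_by_class), dtype=object)
    for k, boxes in enumerate(boxes_by_class):
        arr[k] = np.asarray(boxes, dtype=float).reshape(-1, 5)
    return arr

# Ground-truth annotations: one dict per image, pixel coordinates [x1, y1, x2, y2].
labels = [
    {
        "bboxes": np.array([[10.0, 20.0, 60.0, 80.0], [100.0, 40.0, 160.0, 120.0]]),
        "labels": np.array([0, 2]),  # class index of each annotated box
    },
    {
        "bboxes": np.array([[30.0, 30.0, 90.0, 90.0]]),
        "labels": np.array([1]),
    },
]

# Predictions: predictions[i][k] holds the (M, 5) boxes the model assigned to
# class k in image i, each row [x1, y1, x2, y2, confidence].
predictions = [
    per_class_boxes([
        [[12.0, 22.0, 58.0, 78.0, 0.95]],    # class 0
        [],                                   # class 1: no detections
        [[98.0, 42.0, 162.0, 118.0, 0.88]],  # class 2
    ]),
    per_class_boxes([
        [],
        [[28.0, 31.0, 92.0, 88.0, 0.90]],
        [],
    ]),
]

assert len(labels) == len(predictions)  # one entry per image
```

Note that classes with no detections still need an empty (0, 5) entry so the per-class indexing stays aligned across images.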
Step 2: Compute Label Quality Scores
Call the get_label_quality_scores function from the object detection rank module with the formatted labels and predictions. This computes per-image quality scores by matching predicted and annotated boxes via IoU, then scoring each match for three error types (swap, overlooked, bad localization), and aggregating into a single per-image score.
Key considerations:
- Aggregation weights can be customized to emphasize specific error types
- Overlapping label checking can be enabled to catch duplicate annotations
- Lower scores indicate images more likely to contain annotation errors
- Per-box auxiliary scores are also computed for detailed diagnosis
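With the inputs formatted as in Step 1, this step is a single call: scores = get_label_quality_scores(labels, predictions), imported from cleanlab.object_detection.rank. To make the underlying idea concrete, here is a minimal pure-NumPy illustration of the IoU matching the scores are built on; this is an illustration of the concept with made-up boxes, not cleanlab's implementation:

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# An annotated box with high IoU against some prediction is likely well matched.
# Near-zero IoU against every prediction suggests bad localization or a spurious
# annotation, while an unmatched high-confidence prediction suggests an
# overlooked object.
annotated = np.array([[0.0, 0.0, 10.0, 10.0]])
predicted = np.array([[5.0, 0.0, 15.0, 10.0], [50.0, 50.0, 60.0, 60.0]])

best = [max(iou(a, p) for p in predicted) for a in annotated]
print(best)  # the first prediction overlaps by IoU = 1/3; the second not at all
```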
Step 3: Identify Images with Label Issues
Use the filter module's find_label_issues function to identify which specific images have annotation errors. This applies thresholding to the quality scores to determine which images are most likely to contain at least one mislabeled, missing, or poorly localized bounding box.
Key considerations:
- The returned list identifies images that warrant manual review
- Different error types (swap, overlooked, bad localization) can be examined separately
- The severity of issues varies; use quality scores to prioritize review
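The selection this step performs can be sketched as thresholding plus ranking. The scores below are made up; with cleanlab installed, find_label_issues(labels, predictions, return_indices_ranked_by_score=True) from cleanlab.object_detection.filter produces the ranked indices directly (keyword name per cleanlab's current documentation):

```python
import numpy as np

# Hypothetical per-image quality scores from Step 2 (lower = more suspect).
scores = np.array([0.92, 0.18, 0.75, 0.05, 0.61])

# Flag images below a score threshold and return them worst-first, mirroring
# the ranked-indices output described above.
threshold = 0.5
order = np.argsort(scores)                  # ascending: worst images first
flagged = order[scores[order] < threshold]
print(flagged)  # → [3 1]
```

The worst-first ordering matters in practice: review budgets are limited, so annotators start with the images most likely to contain errors.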
Step 4: Visualize and Summarize Issues
Use the summary module to generate visualizations of detected issues and aggregate statistics. This includes per-class summaries showing which classes have the most annotation errors and visualization functions that overlay detected issues on the original images for human review.
Key considerations:
- Visualization shows ground-truth boxes, predicted boxes, and highlights discrepancies
- Per-class statistics reveal systematic annotation patterns (e.g., consistent confusion between two classes)
- The summary helps prioritize which classes or annotators need the most attention
- Export issue lists for downstream annotation correction pipelines
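As one example of a per-class summary, the tally below counts annotations by class over the flagged images; it is a hand-rolled sketch with made-up data, while cleanlab's summary module (e.g. its visualize function) can additionally overlay a flagged image's annotated and predicted boxes for side-by-side review:

```python
import numpy as np
from collections import Counter

# Hypothetical inputs: annotations from Step 1 and flagged indices from Step 3.
labels = [
    {"bboxes": np.zeros((2, 4)), "labels": np.array([0, 2])},
    {"bboxes": np.zeros((1, 4)), "labels": np.array([1])},
    {"bboxes": np.zeros((3, 4)), "labels": np.array([2, 2, 0])},
]
flagged = [0, 2]

# Count how often each class appears among annotations on flagged images:
# classes that dominate this tally are where annotation effort should go first.
per_class = Counter()
for i in flagged:
    per_class.update(labels[i]["labels"].tolist())
print(per_class.most_common())  # → [(2, 3), (0, 2)]
```

A skew in this tally toward one or two classes often points at a systematic problem, such as an ambiguous labeling guideline or a consistent confusion between two classes.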