Workflow: Cleanlab Object Detection Label Quality
| Knowledge Sources | |
|---|---|
| Domains | Data_Centric_AI, Object_Detection, Label_Quality |
| Last Updated | 2026-02-09 19:00 GMT |
Overview
End-to-end process for detecting label issues in object detection datasets using cleanlab's ObjectLab method.
Description
This workflow applies cleanlab's object detection module to find label errors in datasets with bounding box annotations. It detects three types of annotation errors: swapped labels (correct box, wrong class), overlooked objects (objects present in the image but missing from annotations), and bad localization (correct class but poorly drawn bounding box). The method computes per-image and per-box label quality scores using IoU-based similarity matching between ground-truth annotations and model predictions, then aggregates scores using a weighted combination of error types.
Usage
Execute this workflow when you have an object detection dataset with bounding box annotations and predictions from a trained object detection model. Each image has a set of ground-truth bounding boxes with class labels, and the model produces predicted bounding boxes with class probabilities. This workflow is appropriate for auditing annotation quality in datasets used for training YOLO, Faster R-CNN, or similar object detection architectures.
Execution Steps
Step 1: Prepare Labels and Predictions
Format your ground-truth annotations and model predictions into the expected list-based structures. Each image's labels should be a dictionary with bounding boxes and class labels. Each image's predictions should be an array with one entry per class, holding the bounding boxes and confidence scores the model predicted for that class. Ensure consistent class indexing between labels and predictions.
Key considerations:
- Labels format: list of dicts, each with "bboxes" (array of [x1,y1,x2,y2]) and "labels" (array of class indices)
- Predictions format: list of length-K object arrays (K = number of classes), where predictions[i][k] has shape (M, 5) and holds the boxes predicted for class k in image i, each row [x1, y1, x2, y2, confidence]
- Bounding box coordinates should be in the same coordinate system (pixel coordinates)
- Predictions should be from a model not overfit to the training data (use cross-validation or a held-out model)
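As a concrete sketch, a two-image dataset could be assembled as follows. The shapes follow cleanlab's documented input format; the boxes, class count, and helper function here are made-up illustrations:

```python
import numpy as np

NUM_CLASSES = 3  # hypothetical dataset with 3 classes

def per_class_boxes(boxes_by_class):
    """Pack per-class box lists into the length-K object array cleanlab expects."""
    arr = np.empty(len(boxes_by_class), dtype=object)
    for k, boxes in enumerate(boxes_by_class):
        arr[k] = np.asarray(boxes, dtype=float).reshape(-1, 5)
    return arr

# Ground-truth annotations: one dict per image, pixel coordinates [x1, y1, x2, y2].
labels = [
    {
        "bboxes": np.array([[10.0, 20.0, 60.0, 80.0], [100.0, 40.0, 160.0, 120.0]]),
        "labels": np.array([0, 2]),  # class index of each annotated box
    },
    {
        "bboxes": np.array([[30.0, 30.0, 90.0, 90.0]]),
        "labels": np.array([1]),
    },
]

# Predictions: predictions[i][k] holds the (M, 5) boxes the model assigned to
# class k in image i, each row [x1, y1, x2, y2, confidence].
predictions = [
    per_class_boxes([
        [[12.0, 22.0, 58.0, 78.0, 0.95]],    # class 0
        [],                                   # class 1: no detections
        [[98.0, 42.0, 162.0, 118.0, 0.88]],  # class 2
    ]),
    per_class_boxes([
        [],
        [[28.0, 31.0, 92.0, 88.0, 0.90]],
        [],
    ]),
]

assert len(labels) == len(predictions)  # one entry per image
```

Note that classes with no detections still need an empty (0, 5) entry so the per-class indexing stays aligned across images.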
Step 2: Compute Label Quality Scores
Call the get_label_quality_scores function from the object detection rank module with the formatted labels and predictions. This computes per-image quality scores by matching predicted and annotated boxes via IoU, then scoring each match for three error types (swap, overlooked, bad localization), and aggregating into a single per-image score.
Key considerations:
- Aggregation weights can be customized to emphasize specific error types
- Overlapping label checking can be enabled to catch duplicate annotations
- Lower scores indicate images more likely to contain annotation errors
- Per-box auxiliary scores are also computed for detailed diagnosis
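With the inputs formatted as in Step 1, this step is a single call: scores = get_label_quality_scores(labels, predictions), imported from cleanlab.object_detection.rank. To make the underlying idea concrete, here is a minimal pure-NumPy illustration of the IoU matching the scores are built on; this is an illustration of the concept with made-up boxes, not cleanlab's implementation:

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# An annotated box with high IoU against some prediction is likely well matched.
# Near-zero IoU against every prediction suggests bad localization or a spurious
# annotation, while an unmatched high-confidence prediction suggests an
# overlooked object.
annotated = np.array([[0.0, 0.0, 10.0, 10.0]])
predicted = np.array([[5.0, 0.0, 15.0, 10.0], [50.0, 50.0, 60.0, 60.0]])

best = [max(iou(a, p) for p in predicted) for a in annotated]
print(best)  # the first prediction overlaps by IoU = 1/3; the second not at all
```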
Step 3: Identify Images with Label Issues
Use the filter module's find_label_issues function to identify which specific images have annotation errors. This applies thresholding to the quality scores to determine which images are most likely to contain at least one mislabeled, missing, or poorly localized bounding box.
Key considerations:
- The returned list identifies images that warrant manual review
- Different error types (swap, overlooked, bad localization) can be examined separately
- The severity of issues varies; use quality scores to prioritize review
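The selection this step performs can be sketched as thresholding plus ranking. The scores below are made up; with cleanlab installed, find_label_issues(labels, predictions, return_indices_ranked_by_score=True) from cleanlab.object_detection.filter produces the ranked indices directly (keyword name per cleanlab's current documentation):

```python
import numpy as np

# Hypothetical per-image quality scores from Step 2 (lower = more suspect).
scores = np.array([0.92, 0.18, 0.75, 0.05, 0.61])

# Flag images below a score threshold and return them worst-first, mirroring
# the ranked-indices output described above.
threshold = 0.5
order = np.argsort(scores)                  # ascending: worst images first
flagged = order[scores[order] < threshold]
print(flagged)  # → [3 1]
```

The worst-first ordering matters in practice: review budgets are limited, so annotators start with the images most likely to contain errors.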
Step 4: Visualize and Summarize Issues
Use the summary module to generate visualizations of detected issues and aggregate statistics. This includes per-class summaries showing which classes have the most annotation errors and visualization functions that overlay detected issues on the original images for human review.
Key considerations:
- Visualization shows ground-truth boxes, predicted boxes, and highlights discrepancies
- Per-class statistics reveal systematic annotation patterns (e.g., consistent confusion between two classes)
- The summary helps prioritize which classes or annotators need the most attention
- Export issue lists for downstream annotation correction pipelines
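As one example of a per-class summary, the tally below counts annotations by class over the flagged images; it is a hand-rolled sketch with made-up data, while cleanlab's summary module (e.g. its visualize function) can additionally overlay a flagged image's annotated and predicted boxes for side-by-side review:

```python
import numpy as np
from collections import Counter

# Hypothetical inputs: annotations from Step 1 and flagged indices from Step 3.
labels = [
    {"bboxes": np.zeros((2, 4)), "labels": np.array([0, 2])},
    {"bboxes": np.zeros((1, 4)), "labels": np.array([1])},
    {"bboxes": np.zeros((3, 4)), "labels": np.array([2, 2, 0])},
]
flagged = [0, 2]

# Count how often each class appears among annotations on flagged images:
# classes that dominate this tally are where annotation effort should go first.
per_class = Counter()
for i in flagged:
    per_class.update(labels[i]["labels"].tolist())
print(per_class.most_common())  # → [(2, 3), (0, 2)]
```

A skew in this tally toward one or two classes often points at a systematic problem, such as an ambiguous labeling guideline or a consistent confusion between two classes.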