Implementation:Cleanlab Cleanlab OD Find Label Issues
| API | object_detection.filter.find_label_issues |
|---|---|
| Source | cleanlab/object_detection/filter.py:L32-38 |
| Domains | Machine_Learning, Data_Quality, Object_Detection |
| Last Updated | 2026-02-09 |
Overview
Implementation of label issue detection for object detection datasets. Returns either a boolean mask identifying images with label issues or a ranked list of image indices sorted by issue severity.
Description
This function identifies which images in an object detection dataset contain annotation errors. It works by:
- Computing per-image label quality scores using get_label_quality_scores.
- Optionally checking for overlapping labels within ground truth annotations.
- Either applying a threshold to produce a boolean mask (default) or sorting indices by score to produce a ranked list.
The function supports two output modes controlled by the return_indices_ranked_by_score parameter:
- Boolean mask (default): An array of shape (N,) where True indicates the corresponding image has a detected label issue.
- Ranked indices: An array of image indices sorted in ascending order by quality score, placing the most problematic images first.
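The difference between the two modes can be sketched with plain NumPy. The scores and the 0.5 cutoff below are made-up illustration values, not cleanlab's internal scores or thresholds; the sketch only shows how a per-image mask and a worst-first ranking differ in shape and meaning.

```python
import numpy as np

# Hypothetical per-image quality scores in [0, 1]; lower means more suspect.
scores = np.array([0.92, 0.15, 0.48, 0.81])

# Illustrative cutoff only; not the threshold cleanlab applies internally.
cutoff = 0.5

# Mode 1: boolean mask with one entry per image.
issue_mask = scores < cutoff

# Mode 2: image indices ordered from lowest (worst) to highest score.
ranked_indices = np.argsort(scores)

print(issue_mask)
print(ranked_indices)
```

The mask keeps the dataset's original order and length, which makes it convenient for filtering, while the ranking is better suited to deciding what to inspect first.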
Usage
This function is the primary entry point for binary detection of label issues in object detection datasets. It is typically used after training an object detection model and obtaining predictions. Results can drive dataset cleaning pipelines or prioritized human review workflows.
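As a rough sketch of those two downstream uses, the helpers below split a dataset by a boolean mask and build a short review queue from a ranking. The names split_by_issue_mask and review_queue, the file names, and the hard-coded arrays are hypothetical stand-ins for illustration; they are not part of cleanlab.

```python
import numpy as np


def split_by_issue_mask(image_ids, issue_mask):
    """Return (flagged_ids, clean_ids) for a boolean mask aligned with image_ids."""
    issue_mask = np.asarray(issue_mask, dtype=bool)
    if len(image_ids) != issue_mask.shape[0]:
        raise ValueError("image_ids and issue_mask must have the same length")
    flagged = [img for img, bad in zip(image_ids, issue_mask) if bad]
    clean = [img for img, bad in zip(image_ids, issue_mask) if not bad]
    return flagged, clean


def review_queue(image_ids, ranked_indices, top_k):
    """Return the first top_k image ids, keeping the worst-first order."""
    return [image_ids[i] for i in ranked_indices[:top_k]]


image_ids = ["img_000.jpg", "img_001.jpg", "img_002.jpg", "img_003.jpg"]

# Stand-ins for the two kinds of output this function can return.
issue_mask = np.array([False, True, True, False])
ranked_indices = np.array([1, 2, 3, 0])

flagged, clean = split_by_issue_mask(image_ids, issue_mask)
queue = review_queue(image_ids, ranked_indices, top_k=2)
```

Sending flagged images to review rather than deleting them outright is usually the safer choice, since a flagged image may still contain some correct annotations.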
Code Reference
Source Location
cleanlab/object_detection/filter.py, lines 32-38.
Signature
def find_label_issues(
    labels: List[Dict[str, Any]],
    predictions: List[np.ndarray],
    *,
    return_indices_ranked_by_score: Optional[bool] = False,
    overlapping_label_check: Optional[bool] = True,
) -> np.ndarray
Import
from cleanlab.object_detection.filter import find_label_issues
I/O Contract
Inputs
| Parameter | Type | Description |
|---|---|---|
| labels | List[Dict[str, Any]] | List of N dictionaries, one per image, each containing "bboxes" (np.ndarray of shape (M, 4) in xyxy format) and "labels" (np.ndarray of shape (M,) with integer class labels). |
| predictions | List[np.ndarray] | List of N per-image predictions. For each class k in 0, ..., K-1, predictions[i][k] is an np.ndarray of shape (M, 5) with one row per box predicted for class k: [x1, y1, x2, y2, pred_prob]. |
| return_indices_ranked_by_score | Optional[bool] | If True, returns sorted image indices instead of a boolean mask. Defaults to False. |
| overlapping_label_check | Optional[bool] | If True, checks for overlapping bounding boxes in ground truth labels. Defaults to True. |
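Malformed annotations are easier to catch before they reach the function. The following is a minimal sketch of a shape check for one entry of labels, based only on the layout described in the table above; check_label_entry is a hypothetical helper, not a cleanlab function, and cleanlab performs its own input validation separately.

```python
import numpy as np


def check_label_entry(label):
    """Raise ValueError if a per-image label dict does not match the expected layout."""
    for key in ("bboxes", "labels"):
        if key not in label:
            raise ValueError(f"missing key: {key!r}")
    bboxes = np.asarray(label["bboxes"])
    classes = np.asarray(label["labels"])
    if bboxes.ndim != 2 or bboxes.shape[1] != 4:
        raise ValueError(f"bboxes must have shape (M, 4), got {bboxes.shape}")
    if classes.shape != (bboxes.shape[0],):
        raise ValueError("labels must have shape (M,), one class per box")
    # Skip the dtype check for images without boxes (empty arrays default to float).
    if classes.size > 0 and not np.issubdtype(classes.dtype, np.integer):
        raise ValueError("labels must be integer class ids")
    # xyxy format: each box is [x1, y1, x2, y2] with x1 <= x2 and y1 <= y2.
    if np.any(bboxes[:, 0] > bboxes[:, 2]) or np.any(bboxes[:, 1] > bboxes[:, 3]):
        raise ValueError("each box must satisfy x1 <= x2 and y1 <= y2")


good = {"bboxes": np.array([[10, 20, 50, 60]]), "labels": np.array([0])}
check_label_entry(good)  # passes silently

# x1 > x2 here, so this entry is not valid xyxy.
bad = {"bboxes": np.array([[50, 20, 10, 60]]), "labels": np.array([0])}
try:
    check_label_entry(bad)
    bad_rejected = False
except ValueError:
    bad_rejected = True
```

Running such a check over every entry of labels before calling find_label_issues turns a confusing downstream error into a message that names the offending field.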
Outputs
| Type | Description |
|---|---|
| np.ndarray | When return_indices_ranked_by_score=False: boolean array of shape (N,) where True indicates a label issue. When return_indices_ranked_by_score=True: integer array of image indices sorted by quality score ascending (worst first). |
Usage Examples
import numpy as np
from cleanlab.object_detection.filter import find_label_issues

# Ground truth labels for 3 images
labels = [
    {
        "bboxes": np.array([[10, 20, 50, 60]]),
        "labels": np.array([0]),
    },
    {
        "bboxes": np.array([[30, 40, 70, 80], [100, 100, 200, 200]]),
        "labels": np.array([1, 2]),
    },
    {
        "bboxes": np.array([[5, 5, 25, 25]]),
        "labels": np.array([0]),
    },
]

# Model predictions (K=3 classes). predictions[i][k] holds the boxes predicted
# for class k in image i, one row per box: [x1, y1, x2, y2, pred_prob].
no_boxes = np.zeros((0, 5))  # placeholder for a class with no predicted boxes
predictions = [
    [np.array([[10, 20, 50, 60, 0.9]]), no_boxes, no_boxes],
    [
        np.array([[100, 100, 200, 200, 0.7]]),  # class 0 (annotated as class 2)
        np.array([[30, 40, 70, 80, 0.8]]),  # class 1
        no_boxes,  # class 2
    ],
    [np.array([[5, 5, 25, 25, 0.95]]), no_boxes, no_boxes],
]

# Get boolean mask of images with label issues
issue_mask = find_label_issues(labels, predictions)
# issue_mask is np.ndarray of shape (3,) with boolean values

# Get ranked indices (worst images first)
ranked_indices = find_label_issues(
    labels, predictions, return_indices_ranked_by_score=True
)
# ranked_indices is np.ndarray of sorted image indices