Implementation:Cleanlab Cleanlab OD Find Label Issues

API	`object_detection.filter.find_label_issues`
Source	`cleanlab/object_detection/filter.py:L32-38`
Domains	Machine_Learning, Data_Quality, Object_Detection
Last Updated	2026-02-09

Overview

Implementation of label issue detection for object detection datasets. Returns either a boolean mask identifying images with label issues or a ranked list of image indices sorted by issue severity.

Description

This function identifies which images in an object detection dataset contain annotation errors. It works by:

1. Computing per-image label quality scores using `get_label_quality_scores`.
2. Optionally checking for overlapping labels within the ground truth annotations.
3. Either applying a threshold to produce a boolean mask (default) or sorting indices by score to produce a ranked list.
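
The optional overlapping-label check listed above is not spelled out on this page. As a rough illustration only, one way to find ground truth boxes that nearly coincide but carry different class labels is to compare pairwise IoU against a cutoff. The helper names and the 0.95 cutoff are hypothetical and are not taken from cleanlab.

```python
import numpy as np

def iou_xyxy(a, b):
    """Intersection over union of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def overlapping_label_pairs(label, iou_cutoff=0.95):
    """Return (i, j) pairs of boxes that nearly coincide but have different classes."""
    bboxes, classes = label["bboxes"], label["labels"]
    pairs = []
    for i in range(len(bboxes)):
        for j in range(i + 1, len(bboxes)):
            if classes[i] != classes[j] and iou_xyxy(bboxes[i], bboxes[j]) >= iou_cutoff:
                pairs.append((i, j))
    return pairs

# One image whose first two boxes almost coincide but disagree on the class.
example = {
    "bboxes": np.array([[10, 10, 50, 50], [10, 10, 50, 51], [100, 100, 150, 150]]),
    "labels": np.array([0, 1, 1]),
}
overlap_pairs = overlapping_label_pairs(example)
```

A pair returned by this sketch suggests the same object was annotated twice with conflicting classes, which is the kind of annotation error such a check is meant to surface.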

The function supports two output modes controlled by the return_indices_ranked_by_score parameter:

- Boolean mask (default): an array of shape (N,) where True indicates that the corresponding image has a detected label issue.
- Ranked indices: an array containing the indices of the flagged images only, sorted by ascending quality score so that the most problematic images come first. It is therefore usually shorter than N.
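
The relationship between the two output modes can be sketched with plain NumPy. The scores and the 0.5 threshold below are made-up values for illustration; the scores and cutoff cleanlab uses internally are not shown on this page.

```python
import numpy as np

# Hypothetical per-image quality scores in [0, 1]; lower means more suspect.
scores = np.array([0.92, 0.15, 0.70, 0.40])
threshold = 0.5  # assumed cutoff, for illustration only

# Mode 1: boolean mask of shape (N,), True where an image is flagged.
issue_mask = scores < threshold

# Mode 2: rank all images by ascending score, then keep only the flagged ones
# so that both modes describe the same set of images.
order = np.argsort(scores)
ranked_flagged = order[issue_mask[order]]
```

Here `issue_mask` answers "which images are suspect" in dataset order, while `ranked_flagged` answers "which suspect image should be looked at first".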

Usage

This function is the primary entry point for binary detection of label issues in object detection datasets. It is typically used after training an object detection model and obtaining predictions. Results can drive dataset cleaning pipelines or prioritized human review workflows.
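
As a sketch of the two downstream uses mentioned above, a boolean mask can be used to drop flagged images, while a ranked index array can be cut to a review budget. The mask, ranking, and file names below are placeholder values, not real output from `find_label_issues`.

```python
import numpy as np

image_paths = ["img_0.jpg", "img_1.jpg", "img_2.jpg", "img_3.jpg"]  # hypothetical files

# Placeholder results standing in for the two return types.
issue_mask = np.array([False, True, False, True])
ranked_indices = np.array([3, 1])  # worst image first

# Dataset cleaning: keep only images that were not flagged.
clean_paths = [path for path, flagged in zip(image_paths, issue_mask) if not flagged]

# Prioritized review: send the top-k most suspect images to annotators.
review_budget = 1
review_queue = [image_paths[i] for i in ranked_indices[:review_budget]]
```

Dropping flagged images is the blunter option; routing the ranked list to human reviewers keeps images whose labels turn out to be correct.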

Code Reference

Source Location

cleanlab/object_detection/filter.py, lines 32-38.

Signature

def find_label_issues(
    labels: List[Dict[str, Any]],
    predictions: List[np.ndarray],
    *,
    return_indices_ranked_by_score: Optional[bool] = False,
    overlapping_label_check: Optional[bool] = True,
) -> np.ndarray

Import

from cleanlab.object_detection.filter import find_label_issues

I/O Contract

Inputs

Parameter	Type	Description
`labels`	`List[Dict[str, Any]]`	List of N dictionaries, one per image, each containing `"bboxes"` (np.ndarray of shape (M, 4) in xyxy format) and `"labels"` (np.ndarray of shape (M,) with integer class labels).
`predictions`	`List[np.ndarray]`	List of N per-image predictions. For each class k in 0, ..., K-1, `predictions[i][k]` is an array of shape (M, 5) holding the M boxes predicted for class k in image i, with columns `[x1, y1, x2, y2, pred_prob]`.
`return_indices_ranked_by_score`	`Optional[bool]`	If True, returns sorted image indices instead of a boolean mask. Defaults to False.
`overlapping_label_check`	`Optional[bool]`	If True, checks for overlapping bounding boxes in ground truth labels. Defaults to True.
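
Malformed annotations are easier to diagnose before they reach the detector. The following is a small pre-flight check for the `labels` format described in the table; `check_label_entry` is a hypothetical helper written for this page and is not part of cleanlab.

```python
import numpy as np

def check_label_entry(entry):
    """Return a list of problems found in one image's label dictionary."""
    problems = [f"missing key {key!r}" for key in ("bboxes", "labels") if key not in entry]
    if problems:
        return problems
    bboxes = np.asarray(entry["bboxes"])
    classes = np.asarray(entry["labels"])
    if bboxes.ndim != 2 or bboxes.shape[1] != 4:
        problems.append(f"bboxes has shape {bboxes.shape}, expected (M, 4)")
    elif classes.shape != (bboxes.shape[0],):
        problems.append(f"labels has shape {classes.shape}, expected ({bboxes.shape[0]},)")
    elif np.any(bboxes[:, 0] > bboxes[:, 2]) or np.any(bboxes[:, 1] > bboxes[:, 3]):
        problems.append("some boxes violate x1 <= x2 or y1 <= y2 (xyxy order)")
    return problems

good = {"bboxes": np.array([[10, 20, 50, 60]]), "labels": np.array([0])}
bad = {"bboxes": np.array([[50, 20, 10, 60]]), "labels": np.array([0])}  # x1 > x2

good_problems = check_label_entry(good)
bad_problems = check_label_entry(bad)
```

Running such a check over every entry of `labels` before calling `find_label_issues` separates data-format mistakes (for example, boxes stored as xywh) from genuine annotation errors.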

Outputs

Type	Description
`np.ndarray`	When `return_indices_ranked_by_score=False`: boolean array of shape (N,) where True indicates a label issue. When `return_indices_ranked_by_score=True`: integer array containing the indices of the flagged images only, sorted by ascending quality score (worst first).

Usage Examples

import numpy as np
from cleanlab.object_detection.filter import find_label_issues

# Ground truth labels for 3 images
labels = [
    {
        "bboxes": np.array([[10, 20, 50, 60]]),
        "labels": np.array([0]),
    },
    {
        "bboxes": np.array([[30, 40, 70, 80], [100, 100, 200, 200]]),
        "labels": np.array([1, 2]),
    },
    {
        "bboxes": np.array([[5, 5, 25, 25]]),
        "labels": np.array([0]),
    },
]

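# Model predictions for K=3 classes. predictions[i][k] has shape (M, 5):
# the M boxes predicted for class k in image i, as [x1, y1, x2, y2, pred_prob].
no_boxes = np.empty((0, 5))  # a class with no predicted boxes in an image
predictions = [
    [np.array([[10, 20, 50, 60, 0.85]]), no_boxes, no_boxes],
    [
        # The second box is predicted as class 0 although it is labeled class 2.
        np.array([[100, 100, 200, 200, 0.80]]),
        np.array([[30, 40, 70, 80, 0.90]]),
        no_boxes,
    ],
    [np.array([[5, 5, 25, 25, 0.90]]), no_boxes, no_boxes],
]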

# Get boolean mask of images with label issues
issue_mask = find_label_issues(labels, predictions)
# issue_mask is np.ndarray of shape (3,) with boolean values

# Get ranked indices (worst images first)
ranked_indices = find_label_issues(
    labels, predictions, return_indices_ranked_by_score=True
)
# ranked_indices is an np.ndarray holding the indices of flagged images, worst first
