Heuristic:Cleanlab Cleanlab Object Detection Scoring Constants
| Knowledge Sources | |
|---|---|
| Domains | Object_Detection, Label_Quality |
| Last Updated | 2026-02-09 19:30 GMT |
Overview
Empirically tuned constants for ObjectLab's per-box and per-image label quality scoring. They control IoU thresholds, similarity weighting, softmin temperature, and issue-type detection sensitivity.
Description
ObjectLab (cleanlab's object detection module) uses a set of carefully tuned constants to compute label quality scores at the bounding box and image level. These constants control the similarity computation between predicted and annotated boxes, the pooling of per-box scores into per-image scores, and the thresholds for identifying three types of issues: overlooked objects, badly located boxes, and swapped class labels. Understanding these constants is essential for tuning detection sensitivity.
Usage
This heuristic is applied automatically in get_label_quality_scores and find_label_issues for object detection data. Understanding these constants helps when:
- Too many or too few boxes are flagged as issues (adjust threshold factors)
- The scoring seems biased toward IoU or distance (adjust ALPHA)
- Per-image scores are too harsh or lenient (adjust TEMPERATURE)
The Insight (Rule of Thumb)
- IoU and Similarity:
  - `IOU_THRESHOLD = 0.5`: A predicted box and an annotated box must have an IoU of at least 0.5 to be considered matching
  - `ALPHA = 0.9`: The similarity matrix weights IoU at 90% and spatial distance at 10%
  - `LABEL_OVERLAP_THRESHOLD = 0.95`: Two annotated boxes must overlap by at least 95% to be flagged as conflicting duplicates
  - `EUC_FACTOR = 0.1`: Controls how rapidly the spatial distance term falls off in the similarity calculation
- Probability Thresholds:
  - `LOW_PROBABILITY_THRESHOLD = 0.5`: Minimum predicted class probability for a predicted box to be considered when detecting badly located boxes
  - `HIGH_PROBABILITY_THRESHOLD = 0.95`: High-confidence threshold for a predicted box to be considered when detecting overlooked and swapped labels
- Issue Detection Sensitivity:
  - `OVERLOOKED_THRESHOLD_FACTOR = 0.8`: Boxes with a quality score below `0.8 * threshold` are flagged as overlooked
  - `BADLOC_THRESHOLD_FACTOR = 0.8`: Boxes with a quality score below `0.8 * threshold` are flagged as badly located
  - `SWAP_THRESHOLD_FACTOR = 0.8`: Boxes with a quality score below `0.8 * threshold` are flagged as class swaps
  - `AP_SCALE_FACTOR = 0.25`: Scale factor applied to per-class average precision (AP) in global issue determination
- Score Aggregation:
  - `TEMPERATURE = 0.1`: A low temperature makes softmin pooling act more like minimum pooling, so the worst box dominates the image score
  - Equal 1/3 weights for the overlooked, badloc, and swap subtypes in the overall image quality score
- Safety:
  - `MAX_ALLOWED_BOX_PRUNE = 0.97`: Warns if more than 97% of boxes would be pruned (a sign that the threshold is too aggressive)
Reasoning
`IOU_THRESHOLD = 0.5` matches the widely used AP@0.5 convention (the PASCAL VOC matching criterion, also reported as AP50 in COCO evaluation), which keeps cleanlab's notion of a "matching box" consistent with standard benchmarks.
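To make the matching rule concrete, here is a minimal sketch of IoU for axis-aligned boxes given as `[x1, y1, x2, y2]`. The helper names `iou` and `is_match` are illustrative only and are not cleanlab functions.

```python
IOU_THRESHOLD = 0.5  # value quoted from cleanlab/internal/constants.py


def iou(box_a, box_b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(box_a, box_b, threshold=IOU_THRESHOLD):
    """Hypothetical helper: True when IoU reaches the threshold."""
    return iou(box_a, box_b) >= threshold
```

Under this definition, a box shifted sideways by half its width has an IoU of 1/3 with the original and therefore does not count as a match at the 0.5 threshold.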
`ALPHA = 0.9` strongly favors IoU over spatial distance, reflecting that overlap is a much stronger signal than centroid distance for bounding box matching. Spatial distance only matters when boxes are near-misses.
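The weighting can be sketched as a convex combination of IoU and a distance-based closeness term. This is an illustration under stated assumptions, not the library's implementation: the closeness term is assumed to decay as `exp(-EUC_FACTOR * d)` with the centroid distance `d`, and the function names are hypothetical.

```python
import math

ALPHA = 0.9       # value quoted from cleanlab/internal/constants.py
EUC_FACTOR = 0.1  # value quoted from cleanlab/internal/constants.py


def centroid_distance(box_a, box_b):
    """Euclidean distance between the centers of two [x1, y1, x2, y2] boxes."""
    ax, ay = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    bx, by = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return math.hypot(ax - bx, ay - by)


def similarity(iou_value, distance, alpha=ALPHA, euc_factor=EUC_FACTOR):
    """Hypothetical similarity: alpha * IoU + (1 - alpha) * closeness."""
    # Assumed form: closeness is 1 at zero distance and decays exponentially.
    closeness = math.exp(-euc_factor * distance)
    return alpha * iou_value + (1 - alpha) * closeness
```

With `ALPHA = 0.9`, two boxes with zero overlap can score at most 0.1 in this sketch, which is why the distance term only separates near-misses from far-off boxes.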
`TEMPERATURE = 0.1` is deliberately low to make per-image scores dominated by the worst per-box score (softmin approaches min as temperature approaches 0). This is conservative: a single badly annotated box should significantly reduce the image-level quality score.
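The effect of the temperature can be seen in a small softmin pooling sketch. It assumes the pooled score is a softmin-weighted average of the per-box scores; `softmin_pool` is a hypothetical name, not a cleanlab function.

```python
import math

TEMPERATURE = 0.1  # value quoted from cleanlab/internal/constants.py


def softmin_pool(scores, temperature=TEMPERATURE):
    """Softmin-weighted average of scores; approaches min(scores) as temperature -> 0."""
    # Subtract the minimum before exponentiating for numerical stability.
    shift = min(scores)
    weights = [math.exp(-(s - shift) / temperature) for s in scores]
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, scores)) / total
```

For per-box scores `[0.9, 0.9, 0.1]`, this sketch returns a value close to 0.1 at temperature 0.1, whereas the plain mean is about 0.63. Raising the temperature moves the pooled score toward the mean.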
The equal 1/3 weighting across issue subtypes treats overlooked, bad location, and swap errors as equally important by default.
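As an illustration of what equal weighting means, the sketch below combines three per-image subtype scores with a plain weighted average. This form is an assumption made for clarity; the library may combine subtype scores differently, and `combine_subtype_scores` is a hypothetical helper.

```python
# Values quoted from cleanlab/internal/constants.py
CUSTOM_SCORE_WEIGHT_OVERLOOKED = 1 / 3
CUSTOM_SCORE_WEIGHT_BADLOC = 1 / 3
CUSTOM_SCORE_WEIGHT_SWAP = 1 / 3


def combine_subtype_scores(overlooked, badloc, swap, weights=None):
    """Hypothetical weighted average of the three subtype scores."""
    if weights is None:
        weights = {
            "overlooked": CUSTOM_SCORE_WEIGHT_OVERLOOKED,
            "badloc": CUSTOM_SCORE_WEIGHT_BADLOC,
            "swap": CUSTOM_SCORE_WEIGHT_SWAP,
        }
    total = sum(weights.values())
    weighted = (
        weights["overlooked"] * overlooked
        + weights["badloc"] * badloc
        + weights["swap"] * swap
    )
    # Normalize so that weights which do not sum to 1 still give a score in [0, 1].
    return weighted / total
```

Shifting weight toward one subtype, for example `{"overlooked": 1.0, "badloc": 0.0, "swap": 0.0}`, makes the combined score track that subtype alone.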
`MAX_ALLOWED_BOX_PRUNE = 0.97` provides a safety check: if a threshold would remove almost all boxes, it is likely too aggressive, and a warning is issued.
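A safety check of this kind might look like the following sketch. It assumes boxes are pruned when their predicted probability falls below a threshold; `prune_fraction` and `check_prune` are hypothetical names.

```python
import warnings

MAX_ALLOWED_BOX_PRUNE = 0.97  # value quoted from cleanlab/internal/constants.py


def prune_fraction(probabilities, threshold):
    """Fraction of boxes whose probability is below the threshold."""
    if not probabilities:
        return 0.0
    pruned = sum(1 for p in probabilities if p < threshold)
    return pruned / len(probabilities)


def check_prune(probabilities, threshold, max_allowed=MAX_ALLOWED_BOX_PRUNE):
    """Warn and return True when the threshold would prune too many boxes."""
    fraction = prune_fraction(probabilities, threshold)
    if fraction > max_allowed:
        warnings.warn(
            f"Threshold {threshold} would prune {fraction:.0%} of boxes; "
            "consider lowering it."
        )
        return True
    return False
```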
Code Evidence:
All constants from `cleanlab/internal/constants.py:9-38`:

```python
EUC_FACTOR = 0.1
MAX_ALLOWED_BOX_PRUNE = 0.97
IOU_THRESHOLD = 0.5
EPSILON = 1e-6
ALPHA = 0.9
LOW_PROBABILITY_THRESHOLD = 0.5
HIGH_PROBABILITY_THRESHOLD = 0.95
TEMPERATURE = 0.1
LABEL_OVERLAP_THRESHOLD = 0.95
OVERLOOKED_THRESHOLD_FACTOR = 0.8
BADLOC_THRESHOLD_FACTOR = 0.8
SWAP_THRESHOLD_FACTOR = 0.8
AP_SCALE_FACTOR = 0.25
CUSTOM_SCORE_WEIGHT_OVERLOOKED = 1 / 3
CUSTOM_SCORE_WEIGHT_BADLOC = 1 / 3
CUSTOM_SCORE_WEIGHT_SWAP = 1 / 3
MAX_CLASS_TO_SHOW = 10
```
Default issue threshold from `cleanlab/object_detection/rank.py:126`:
```python
def issues_from_scores(label_quality_scores: np.ndarray, *, threshold: float = 0.1):
    ...  # body elided; only the signature and default threshold are quoted here
```
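To show how the default `threshold = 0.1` acts on per-image scores, here is a pure-Python stand-in. It is a sketch, not the library function: it assumes images are flagged when their score is at or below the threshold and that flagged indices are returned worst-first. The name `issues_from_scores_sketch` is hypothetical.

```python
def issues_from_scores_sketch(label_quality_scores, *, threshold=0.1):
    """Indices of scores at or below the threshold, lowest score first."""
    flagged = [i for i, s in enumerate(label_quality_scores) if s <= threshold]
    # Order the flagged images so the lowest-quality ones come first.
    return sorted(flagged, key=lambda i: label_quality_scores[i])


scores = [0.9, 0.05, 0.3, 0.01, 0.1]
issue_indices = issues_from_scores_sketch(scores)
```

Raising `threshold` flags more images and lowering it flags fewer, which is the main knob when too many or too few images are reported.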