Principle:Cleanlab Cleanlab Spurious Correlation Analysis

Knowledge Sources	Cleanlab
Domains	Data Quality, Computer Vision, Statistical Analysis
Last Updated	2026-02-09 00:00 GMT

Overview

Spurious correlation analysis quantifies the degree to which individual dataset properties (such as image quality metrics) are predictive of class labels, detecting potential dataset biases and shortcut features that models might exploit.

Description

A spurious correlation exists when a dataset property that is not causally related to the target concept is nonetheless statistically associated with the class labels. For example, in an image classification dataset:

All photos of cats might be taken indoors (low brightness), while all dog photos are outdoors (high brightness). A model could learn to classify based on brightness rather than the actual animal.
Blurry images might disproportionately appear in one class due to data collection artifacts.
Image resolution might correlate with label if different classes were sourced from different databases.

These correlations are "spurious" because they reflect biases in the data collection process rather than genuine features of the target concept. Models trained on such data may:

Achieve misleadingly high validation accuracy by exploiting the shortcut.
Fail catastrophically on new data where the spurious correlation does not hold.
Encode and amplify societal biases present in the training data.

Spurious correlation analysis systematically tests each property to determine if it alone can predict the class label better than chance, providing an early warning system for these issues.
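
The first two failure modes above can be illustrated with a small synthetic experiment. This is a minimal sketch, not taken from Cleanlab: the "brightness" values, class means, and the sample_brightness helper are all hypothetical and exist only to show how a shortcut feature can look reliable on biased validation data and then fail once the bias disappears.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

def sample_brightness(labels, biased):
    """Draw one hypothetical brightness value per image (illustrative only)."""
    if biased:
        # Class 0 ("cat", indoors) is dark; class 1 ("dog", outdoors) is bright.
        means = np.where(labels == 0, 0.3, 0.7)
        return rng.normal(means, 0.05)
    # Brightness is unrelated to the class.
    return rng.normal(0.5, 0.15, size=labels.shape)

labels = np.repeat([0, 1], 100)

# Training and validation data share the same collection bias.
X_train = sample_brightness(labels, biased=True).reshape(-1, 1)
X_val = sample_brightness(labels, biased=True).reshape(-1, 1)
# New data where the spurious correlation no longer holds.
X_new = sample_brightness(labels, biased=False).reshape(-1, 1)

model = GaussianNB().fit(X_train, labels)
val_accuracy = model.score(X_val, labels)
new_accuracy = model.score(X_new, labels)

print(f"validation accuracy (bias present): {val_accuracy:.2f}")
print(f"accuracy on new data (bias absent): {new_accuracy:.2f}")
```

Under these assumptions the model sees only brightness, so any accuracy it reaches on the validation split comes entirely from the collection bias rather than from the animal in the image.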

Usage

Apply spurious correlation analysis when:

Auditing image classification datasets where image quality properties are available (e.g., from CleanVision).
Investigating why a model achieves unexpectedly high accuracy on seemingly difficult tasks.
Checking for dataset biases before deploying a model to production.
Building data quality pipelines that need to flag potential shortcut learning opportunities.

Theoretical Basis

Baseline Accuracy

The baseline accuracy represents the accuracy achievable without using any features, by always predicting the most frequent class:

baseline_accuracy = max_k(count(label == k)) / n

where k ranges over all classes and n is the total number of examples. For a balanced binary dataset, this is 0.5; for a highly imbalanced dataset, it can be much higher.
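
The formula above can be sketched in a few lines of Python. The function name and the toy label lists are illustrative assumptions, not part of any library API.

```python
from collections import Counter

def baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent class."""
    if len(labels) == 0:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    # Count of the majority class, i.e. max_k(count(label == k)).
    majority_count = counts.most_common(1)[0][1]
    return majority_count / len(labels)

balanced = ["cat", "dog", "cat", "dog"]
imbalanced = ["cat"] * 9 + ["dog"]

print(baseline_accuracy(balanced))    # 0.5
print(baseline_accuracy(imbalanced))  # 0.9
```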

Per-Property Predictiveness

For each property of interest, a GaussianNB (Gaussian Naive Bayes) classifier is trained and evaluated with 5-fold cross-validation, using that property as its only input feature to predict the class labels. The mean cross-validated accuracy A_p measures how predictive property p is of the label.

Gaussian Naive Bayes is chosen because:

It is fast to train, enabling efficient evaluation of many properties.
It models continuous features naturally via Gaussian distributions.
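It fits only a mean and a variance per class, so it gives a reasonable measure of univariate predictiveness with little capacity to overfit.
Combined with cross-validation, its accuracy is measured on held-out folds, which prevents inflated estimates from memorization.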
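
The procedure can be sketched with scikit-learn's GaussianNB and cross_val_score. This is a minimal sketch under stated assumptions: the property_accuracy helper and the synthetic "brightness" and "blurriness" values are hypothetical, and the code is not claimed to match Cleanlab's internal implementation.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def property_accuracy(values, labels, cv=5):
    """Mean cross-validated accuracy of GaussianNB trained on one property."""
    X = np.asarray(values, dtype=float).reshape(-1, 1)  # single feature column
    y = np.asarray(labels)
    scores = cross_val_score(GaussianNB(), X, y, cv=cv, scoring="accuracy")
    return float(scores.mean())

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)

# Hypothetical property that differs strongly between the two classes.
brightness = np.concatenate(
    [rng.normal(0.2, 0.05, 100), rng.normal(0.8, 0.05, 100)]
)
# Hypothetical property drawn independently of the class.
blurriness = rng.normal(0.5, 0.1, 200)

acc_bright = property_accuracy(brightness, labels)
acc_blur = property_accuracy(blurriness, labels)
print(f"A_p for brightness: {acc_bright:.2f}")
print(f"A_p for blurriness: {acc_blur:.2f}")
```

With an integer cv and a classifier, cross_val_score uses stratified folds, so each fold keeps roughly the same class balance as the full dataset.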

Relative Room for Improvement Score

The core metric is the relative room for improvement:

score = (1 - A_p) / (1 - baseline_accuracy)

Interpretation:

The numerator (1 - A_p) is the error rate of the property-based classifier.
The denominator (1 - baseline_accuracy) is the error rate of the baseline (majority class) classifier.
The ratio measures what fraction of the baseline's errors remain when using the property as a predictor.

Score ranges:

score = 1.0: The property is no more predictive than the majority class baseline (or, because of the cap described below, is less predictive than it). No spurious correlation detected.
score close to 0.0: The property alone can nearly perfectly predict the label, indicating a strong spurious correlation.
0 < score < 1: Intermediate predictiveness; lower values indicate stronger correlation.

The score is capped at 1.0 via min(1, ...) to handle cases where the property-based classifier performs worse than baseline (which can happen with cross-validation on small datasets). When the baseline accuracy is perfect (1.0), a small epsilon is added to the denominator to prevent division by zero.
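
The capped score can be sketched as a small function. The function name and the eps value of 1e-8 are assumptions made for illustration; the text above only specifies "a small epsilon". This sketch adds eps unconditionally, which also covers the perfect-baseline case and changes other scores only negligibly.

```python
def relative_room_for_improvement(baseline_accuracy, mean_accuracy, eps=1e-8):
    """Fraction of the baseline's errors that remain, capped at 1.0."""
    numerator = 1.0 - mean_accuracy            # error rate of the property-based classifier
    denominator = 1.0 - baseline_accuracy + eps  # baseline error rate; eps avoids division by zero
    return min(1.0, numerator / denominator)

# Property removes most of the baseline's errors: strong spurious correlation.
strong = relative_room_for_improvement(baseline_accuracy=0.5, mean_accuracy=0.95)
# Property does worse than the baseline: the cap keeps the score at 1.0.
worse = relative_room_for_improvement(baseline_accuracy=0.5, mean_accuracy=0.45)
# Perfect baseline: eps keeps the division well defined.
degenerate = relative_room_for_improvement(baseline_accuracy=1.0, mean_accuracy=0.9)

print(round(strong, 3), worse, degenerate)
```

In the first call, the property-based classifier has a 5% error rate against the baseline's 50%, so about one tenth of the baseline's errors remain.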

Overall Dataset Assessment

The per-property scores are collected into a summary DataFrame, allowing practitioners to quickly identify which properties exhibit the strongest spurious correlations with the labels. Properties with notably low scores should be investigated to determine whether the correlation reflects a genuine dataset bias that could compromise model reliability.
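
Putting the pieces together, a summary table might be built as follows. This is a sketch under stated assumptions: the spurious_correlation_summary function, the "property" and "score" column names, the eps value, and the synthetic data are all hypothetical and are not claimed to match the output format of any Cleanlab function.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def spurious_correlation_summary(properties, labels, cv=5, eps=1e-8):
    """Score every column of `properties`; lower scores mean stronger correlation."""
    y = np.asarray(labels)
    # Majority-class baseline accuracy.
    _, counts = np.unique(y, return_counts=True)
    baseline = counts.max() / len(y)

    rows = []
    for name in properties.columns:
        X = properties[[name]].to_numpy(dtype=float)  # one feature at a time
        acc = cross_val_score(GaussianNB(), X, y, cv=cv, scoring="accuracy").mean()
        score = min(1.0, (1.0 - acc) / (1.0 - baseline + eps))
        rows.append({"property": name, "score": float(score)})

    # Most suspicious properties (lowest scores) first.
    return pd.DataFrame(rows).sort_values("score").reset_index(drop=True)

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)
properties = pd.DataFrame({
    # Hypothetical property independent of the class.
    "blurriness": rng.normal(0.5, 0.1, 200),
    # Hypothetical property strongly tied to the class.
    "brightness": np.concatenate(
        [rng.normal(0.2, 0.05, 100), rng.normal(0.8, 0.05, 100)]
    ),
})

summary = spurious_correlation_summary(properties, labels)
print(summary)
```

Sorting by ascending score places the properties that most deserve investigation at the top of the table.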

Related Pages

Implementation:Cleanlab_Cleanlab_Spurious_Correlation_Detection
