Heuristic:Evidentlyai Evidently Drift Detection Thresholds
| Knowledge Sources | |
|---|---|
| Domains | Data_Drift, ML_Monitoring |
| Last Updated | 2026-02-14 10:00 GMT |
Overview
Default threshold values and drift share parameter for determining when individual features and entire datasets are considered "drifted".
Description
Evidently uses two levels of drift determination: per-feature drift (controlled by the statistical test threshold) and dataset-level drift (controlled by `drift_share`). The default statistical test threshold is 0.05, and the default dataset drift share is 0.5 (50% of features must drift for the dataset to be considered drifted). The `DataDriftOptions` class provides fine-grained control to set thresholds per feature type, per individual feature, or globally. Understanding these defaults is essential for correctly interpreting drift results and tuning sensitivity.
Usage
This heuristic applies when configuring data drift detection sensitivity. Use it to understand why Evidently flags (or doesn't flag) drift in your monitoring pipeline, and how to tune the thresholds for your specific use case.
The Insight (Rule of Thumb)
- Default per-feature threshold: 0.05 (the p-value threshold for hypothesis tests, or distance threshold for distance-based tests).
- Default drift share: 0.5 (50% of features must individually drift for the dataset to be "drifted").
- Default histogram bins: 10 (the `DEFAULT_NBINSX` for drift visualization).
- Threshold hierarchy: `per_feature_threshold` > `{type}_features_threshold` > `all_features_threshold` > stat test default. Most specific wins.
- Stat test hierarchy: `per_feature_stattest` > `{type}_features_stattest` > `all_features_stattest`. Most specific wins.
- Deprecated `confidence` parameter: If used, threshold = 1.0 - confidence. Cannot be set simultaneously with `threshold`.
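- Trade-off: The direction depends on what the threshold is compared against. For p-value tests, drift is flagged when the p-value falls below the threshold, so a higher threshold is more sensitive (more false positives, fewer missed drifts) and a lower threshold is stricter. For distance-based tests, drift is flagged when the distance exceeds the threshold, so a lower threshold is more sensitive. At the dataset level, a lower `drift_share` is more sensitive.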
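How a threshold affects sensitivity depends on the kind of score it is compared with. The following is a minimal sketch, not Evidently code: `feature_drifted` is a hypothetical helper, and the comparison directions (p-value below the threshold, distance at or above it) are assumptions about how a typical test applies its threshold.

```python
def feature_drifted(score: float, threshold: float, is_p_value: bool) -> bool:
    """Hypothetical helper: apply a per-feature threshold to a test score.

    Assumption: p-value tests flag drift when the p-value is below the
    threshold; distance-based tests flag drift when the distance reaches it.
    """
    if is_p_value:
        return score < threshold
    return score >= threshold


# Same p-value, two thresholds: the higher threshold is the more sensitive one.
p_value = 0.03
p_flag_default = feature_drifted(p_value, threshold=0.05, is_p_value=True)
p_flag_strict = feature_drifted(p_value, threshold=0.01, is_p_value=True)

# Same distance, two thresholds: the lower threshold is the more sensitive one.
distance = 0.08
d_flag_loose = feature_drifted(distance, threshold=0.1, is_p_value=False)
d_flag_tight = feature_drifted(distance, threshold=0.05, is_p_value=False)

print(p_flag_default, p_flag_strict, d_flag_loose, d_flag_tight)
```

When tuning, it helps to check first whether the configured stat test reports a p-value or a distance, since the same numeric change moves sensitivity in opposite directions.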
Reasoning
The 0.05 threshold follows the standard convention in statistics for significance testing. For distance-based measures (Wasserstein, Jensen-Shannon), the threshold represents a normalized distance rather than a p-value, but 0.05-0.1 is still commonly used as a practical cutoff.
The 50% drift share is a balanced default: if at least half of all features show drift, the dataset has likely changed meaningfully. This prevents over-alerting when only one or two noisy features fluctuate, while still catching systematic distribution shifts. In practice, users may want to lower the share for critical monitoring (for example, 0.3) or raise it for noisy environments (for example, 0.7).
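The effect of `drift_share` can be illustrated with a small sketch. `dataset_drifted` is a hypothetical helper rather than Evidently's implementation, and the `>=` comparison is an assumption about how the share is applied.

```python
from typing import Sequence


def dataset_drifted(feature_flags: Sequence[bool], drift_share: float = 0.5) -> bool:
    """Hypothetical helper: dataset-level decision from per-feature drift flags.

    Assumption: the dataset is flagged when the share of drifted features
    is greater than or equal to drift_share.
    """
    if not feature_flags:
        return False
    share = sum(feature_flags) / len(feature_flags)
    return share >= drift_share


# 4 of 10 features drifted, so the drifted share is 0.4.
flags = [True] * 4 + [False] * 6
results = {share: dataset_drifted(flags, drift_share=share) for share in (0.3, 0.5, 0.7)}
print(results)
```

With the same per-feature results, only the 0.3 setting flags the dataset; the default 0.5 and the more tolerant 0.7 do not.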
The text drift threshold of 0.55 (ROC AUC of a domain classifier) is slightly above random (0.5), meaning even a small ability to distinguish reference from current text indicates drift. The bootstrap variant compares against the random classifier percentile for more robust detection.
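To see why 0.55 sits just above chance, ROC AUC can be read as the probability that a classifier scores a randomly chosen current sample above a randomly chosen reference sample. The sketch below computes that probability directly in plain Python. `pairwise_roc_auc`, `text_drifted`, and the score lists are illustrative only; Evidently trains its own domain classifier, and this sketch covers only the non-bootstrap comparison.

```python
from typing import Sequence


def pairwise_roc_auc(reference_scores: Sequence[float],
                     current_scores: Sequence[float]) -> float:
    """Share of (current, reference) pairs where the current sample scores
    higher; ties count as half a win. This equals the ROC AUC of the scores."""
    wins = 0.0
    for cur in current_scores:
        for ref in reference_scores:
            if cur > ref:
                wins += 1.0
            elif cur == ref:
                wins += 0.5
    return wins / (len(current_scores) * len(reference_scores))


def text_drifted(roc_auc: float, threshold: float = 0.55) -> bool:
    """Non-bootstrap rule: drift when the ROC AUC exceeds the threshold."""
    return roc_auc > threshold


# Made-up classifier scores (estimated probability of "current").
separable_auc = pairwise_roc_auc([0.2, 0.4, 0.5, 0.6], [0.3, 0.5, 0.7, 0.8])
chance_auc = pairwise_roc_auc([0.5, 0.5], [0.5, 0.5])

print(separable_auc, text_drifted(separable_auc))
print(chance_auc, text_drifted(chance_auc))
```

A classifier that cannot tell the two samples apart lands at 0.5 and stays below the cutoff, while even a moderately separable pair of samples clears it.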
Code Evidence
Default drift share from `src/evidently/legacy/options/data_drift.py:47`:
class DataDriftOptions(BaseModel):
    DEFAULT_NBINSX: ClassVar = 10
    drift_share: float = 0.5
Threshold resolution hierarchy from `src/evidently/legacy/utils/data_drift_utils.py:79-102`:
def _calculate_threshold(feature_name, feature_type,
                         stattest_threshold, cat_stattest_threshold,
                         num_stattest_threshold, text_stattest_threshold,
                         per_column_stattest_threshold):
    if per_column_stattest_threshold is not None and feature_name in per_column_stattest_threshold:
        return per_column_stattest_threshold.get(feature_name)
    if cat_stattest_threshold is not None and feature_type == "cat":
        return cat_stattest_threshold
    if num_stattest_threshold is not None and feature_type == "num":
        return num_stattest_threshold
    if text_stattest_threshold is not None and feature_type == "text":
        return text_stattest_threshold
    if stattest_threshold is not None:
        return stattest_threshold
    return None
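The resolution order above can be exercised with a compact stand-in. This is a sketch under assumptions: `resolve_threshold` and its parameter names are hypothetical and only mirror the order of the checks (per column, then per type, then global, then `None` so that the stat test's own default applies); the feature names and values are made up.

```python
from typing import Dict, Optional


def resolve_threshold(feature_name: str,
                      feature_type: str,
                      global_threshold: Optional[float] = None,
                      type_thresholds: Optional[Dict[str, float]] = None,
                      per_column_thresholds: Optional[Dict[str, float]] = None) -> Optional[float]:
    """Hypothetical stand-in: the most specific configured threshold wins."""
    if per_column_thresholds is not None and feature_name in per_column_thresholds:
        return per_column_thresholds[feature_name]
    if type_thresholds is not None and feature_type in type_thresholds:
        return type_thresholds[feature_type]
    # None means "nothing configured"; the stat test then uses its own default.
    return global_threshold


config = dict(
    global_threshold=0.05,
    type_thresholds={"num": 0.1},
    per_column_thresholds={"age": 0.01},
)

age = resolve_threshold("age", "num", **config)        # per-column override
income = resolve_threshold("income", "num", **config)  # falls back to the "num" setting
city = resolve_threshold("city", "cat", **config)      # falls back to the global setting
unset = resolve_threshold("city", "cat")               # nothing configured
print(age, income, city, unset)
```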
Deprecated confidence conversion from `src/evidently/legacy/options/data_drift.py:108-121`:
if self.confidence is not None and threshold is not None:
    raise ValueError("Only DataDriftOptions.confidence or DataDriftOptions.threshold can be set")
if self.confidence is not None:
    warnings.warn("DataDriftOptions.confidence is deprecated, use DataDriftOptions.threshold instead.")
    if isinstance(self.confidence, float):
        return 1.0 - self.confidence
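As a standalone illustration of the conversion, the sketch below reproduces the same checks in a free function. `threshold_from_confidence` is a hypothetical name, it handles only a single float `confidence`, and the use of `DeprecationWarning` as the warning category is an assumption.

```python
import warnings
from typing import Optional


def threshold_from_confidence(confidence: Optional[float] = None,
                              threshold: Optional[float] = None) -> Optional[float]:
    """Hypothetical helper: map the deprecated confidence onto a threshold."""
    if confidence is not None and threshold is not None:
        raise ValueError("Only confidence or threshold can be set")
    if confidence is not None:
        warnings.warn("confidence is deprecated, use threshold instead.",
                      DeprecationWarning)
        return 1.0 - confidence
    return threshold


# Silence the deprecation warning for the demonstration only.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    converted = threshold_from_confidence(confidence=0.95)

# Floating-point subtraction gives a value very close to 0.05, not exactly 0.05.
print(round(converted, 10))
```

Passing both `confidence` and `threshold` raises `ValueError`, matching the guard in the excerpt above.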
Text drift default threshold from `src/evidently/legacy/utils/data_drift_utils.py:135-141`:
def calculate_text_drift_score(reference_data, current_data,
                               bootstrap, p_value=0.05, threshold=0.55):
    # ...
    if not bootstrap:
        return domain_classifier_roc_auc, domain_classifier_roc_auc > threshold