Implementation:Evidentlyai Evidently Legacy Data Drift Preset
| Knowledge Sources | |
|---|---|
| Domains | Data Quality, Drift Detection, Model Monitoring |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
The DataDriftPreset class is a metric preset that bundles dataset-level drift detection, per-column drift analysis, and embeddings drift metrics into a single preset for Evidently reports.
Description
DataDriftPreset extends MetricPreset and generates a list of drift-related metrics via its generate_metrics method. The preset always includes:
- DatasetDriftMetric -- detects whether the overall dataset has drifted, using a configurable drift_share threshold (default: 0.5, meaning drift is flagged if at least 50% of columns show drift)
- DataDriftTable -- produces a per-column drift analysis table
Both metrics receive the full suite of statistical test configuration parameters, allowing fine-grained control over which statistical tests are used for different column types (categorical, numerical, text) and individual columns.
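The dataset-level decision can be pictured with a minimal, library-independent sketch. The function name `dataset_drift_detected` and the column flags are hypothetical, and the inclusive comparison is an assumption rather than a quote from the Evidently source:

```python
from typing import Dict


def dataset_drift_detected(column_drift: Dict[str, bool], drift_share: float = 0.5) -> bool:
    """Flag dataset drift when enough individual columns have drifted."""
    if not column_drift:
        raise ValueError("at least one column is required")
    # Share of columns whose per-column test reported drift.
    share_drifted = sum(column_drift.values()) / len(column_drift)
    # Assumption: the boundary is treated as inclusive.
    return share_drifted >= drift_share


# Hypothetical per-column results: 3 of 4 columns drifted (share = 0.75).
flags = {"age": True, "income": True, "category": True, "city": False}
print(dataset_drift_detected(flags))                   # True: 0.75 reaches 0.5
print(dataset_drift_detected(flags, drift_share=0.9))  # False: 0.75 is below 0.9
```

Lowering drift_share makes the dataset-level flag more sensitive, since fewer drifting columns are needed to trigger it.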
When embeddings data is present in the data definition, the preset additionally generates EmbeddingsDriftMetric instances for each embedding, using the helper function add_emb_drift_to_reports. Custom drift methods can be specified per embedding via the embeddings_drift_method parameter.
The preset supports extensive configuration of statistical tests:
- Global defaults via stattest and stattest_threshold
- Type-specific overrides via cat_stattest, num_stattest, text_stattest and their corresponding threshold parameters
- Per-column overrides via per_column_stattest and per_column_stattest_threshold dictionaries
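One way to read the three configuration levels is as a precedence chain: a per-column entry wins over a type-specific setting, which wins over the global default. The sketch below illustrates that reading only. `resolve_stattest` is a hypothetical helper, not part of Evidently, and the exact resolution order inside the library is assumed from the "default" and "override" wording above:

```python
from typing import Dict, Optional


def resolve_stattest(
    column: str,
    column_type: str,
    stattest: Optional[str] = None,
    type_stattest: Optional[Dict[str, str]] = None,
    per_column_stattest: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Pick the most specific configured test for a column.

    Returns None when nothing is configured, meaning the library
    would fall back to its own default choice.
    """
    if per_column_stattest and column in per_column_stattest:
        return per_column_stattest[column]      # most specific: per-column
    if type_stattest and column_type in type_stattest:
        return type_stattest[column_type]       # next: per column type
    return stattest                             # least specific: global


config = dict(
    stattest="psi",
    type_stattest={"num": "ks"},
    per_column_stattest={"age": "wasserstein"},
)
print(resolve_stattest("age", "num", **config))     # wasserstein
print(resolve_stattest("income", "num", **config))  # ks
print(resolve_stattest("city", "cat", **config))    # psi
```

The threshold parameters can be read the same way, with per_column_stattest_threshold as the most specific level and stattest_threshold as the fallback.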
Usage
Use this preset when you want comprehensive data drift analysis across your dataset. It is the standard entry point for detecting feature distribution changes between reference and current data, suitable for production monitoring pipelines.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/legacy/metric_preset/data_drift.py
Signature
class DataDriftPreset(MetricPreset):
    class Config:
        type_alias = "evidently:metric_preset:DataDriftPreset"

    columns: Optional[List[str]]
    embeddings: Optional[List[str]]
    embeddings_drift_method: Optional[Dict[str, DriftMethod]]
    drift_share: float
    stattest: Optional[PossibleStatTestType]
    cat_stattest: Optional[PossibleStatTestType]
    num_stattest: Optional[PossibleStatTestType]
    text_stattest: Optional[PossibleStatTestType]
    per_column_stattest: Optional[Dict[str, PossibleStatTestType]]
    stattest_threshold: Optional[float]
    cat_stattest_threshold: Optional[float]
    num_stattest_threshold: Optional[float]
    text_stattest_threshold: Optional[float]
    per_column_stattest_threshold: Optional[Dict[str, float]]

    def __init__(
        self,
        columns: Optional[List[str]] = None,
        embeddings: Optional[List[str]] = None,
        embeddings_drift_method: Optional[Dict[str, DriftMethod]] = None,
        drift_share: float = 0.5,
        stattest: Optional[PossibleStatTestType] = None,
        cat_stattest: Optional[PossibleStatTestType] = None,
        num_stattest: Optional[PossibleStatTestType] = None,
        text_stattest: Optional[PossibleStatTestType] = None,
        per_column_stattest: Optional[Dict[str, PossibleStatTestType]] = None,
        stattest_threshold: Optional[float] = None,
        cat_stattest_threshold: Optional[float] = None,
        num_stattest_threshold: Optional[float] = None,
        text_stattest_threshold: Optional[float] = None,
        per_column_stattest_threshold: Optional[Dict[str, float]] = None,
    ): ...

    def generate_metrics(
        self,
        data_definition: DataDefinition,
        additional_data: Optional[Dict[str, Any]],
    ) -> List[AnyMetric]: ...
Import
from evidently.legacy.metric_preset.data_drift import DataDriftPreset
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| columns | Optional[List[str]] | No | Specific columns to analyze for drift. If None, all columns are analyzed. |
| embeddings | Optional[List[str]] | No | List of embedding names to analyze for drift |
| embeddings_drift_method | Optional[Dict[str, DriftMethod]] | No | Custom drift detection methods per embedding name |
| drift_share | float | No | Fraction of drifting columns required to flag dataset-level drift (default: 0.5) |
| stattest | Optional[PossibleStatTestType] | No | Default statistical test for all column types |
| cat_stattest | Optional[PossibleStatTestType] | No | Statistical test override for categorical columns |
| num_stattest | Optional[PossibleStatTestType] | No | Statistical test override for numerical columns |
| text_stattest | Optional[PossibleStatTestType] | No | Statistical test override for text columns |
| per_column_stattest | Optional[Dict[str, PossibleStatTestType]] | No | Per-column statistical test overrides |
| stattest_threshold | Optional[float] | No | Default threshold for drift detection (a p-value or a distance value, depending on the selected test) |
| cat_stattest_threshold | Optional[float] | No | Threshold override for categorical columns |
| num_stattest_threshold | Optional[float] | No | Threshold override for numerical columns |
| text_stattest_threshold | Optional[float] | No | Threshold override for text columns |
| per_column_stattest_threshold | Optional[Dict[str, float]] | No | Per-column threshold overrides |
Outputs
| Name | Type | Description |
|---|---|---|
| metrics | List[AnyMetric] | A list containing DatasetDriftMetric, DataDriftTable, and optionally EmbeddingsDriftMetric instances |
Usage Examples
from evidently.legacy.metric_preset.data_drift import DataDriftPreset
# Assumption: the embedding drift method factories (such as model) live in this legacy module
from evidently.legacy.metrics.data_drift.embedding_drift_methods import model

# Basic usage with defaults
preset = DataDriftPreset()

# With custom drift share threshold and specific columns
preset = DataDriftPreset(
    columns=["age", "income", "category"],
    drift_share=0.3,
)

# With per-type statistical tests and a global default threshold
preset = DataDriftPreset(
    num_stattest="ks",
    cat_stattest="chi2",
    stattest_threshold=0.05,
)

# With embeddings drift detection, using the classifier-based method for one embedding
preset = DataDriftPreset(
    embeddings=["text_embedding"],
    embeddings_drift_method={"text_embedding": model()},
)