Implementation:Evidentlyai Evidently Legacy Data Quality Preset
| Knowledge Sources | |
|---|---|
| Domains | Data Quality, Model Monitoring, Data Analysis |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
The DataQualityPreset class is a metric preset that bundles dataset summary, per-column summary, and missing values analysis metrics for data quality monitoring.
Description
DataQualityPreset extends MetricPreset and generates a concise list of data quality metrics via its generate_metrics method. The preset always includes:
- DatasetSummaryMetric -- provides an overall summary of the dataset including row counts, column counts, and data types
- ColumnSummaryMetric (generated for each column) -- provides per-column statistics including distributions, missing values, and summary statistics. The preset uses the generate_column_metrics helper to create one metric instance per column and passes skip_id_column=True so that ID columns are excluded from per-column analysis.
- DatasetMissingValuesMetric -- analyzes missing value patterns across the entire dataset
An optional columns parameter restricts the per-column summary to a specific subset of columns. The two dataset-level metrics are always included.
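The composition described above can be modeled with a small, self-contained sketch. It does not import Evidently: `ColumnSummary`, `expand_column_metrics`, and `preset_metric_names` are hypothetical stand-ins written for illustration, not the library's own classes or helpers. The sketch also assumes that the ID column is skipped even when it is named explicitly in `columns`; the source does not confirm that detail.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ColumnSummary:
    """Stand-in for one per-column metric (not Evidently's ColumnSummaryMetric)."""

    column_name: str


def expand_column_metrics(
    all_columns: List[str],
    id_column: Optional[str],
    columns: Optional[List[str]] = None,
    skip_id_column: bool = True,
) -> List[ColumnSummary]:
    """Hypothetical model of expanding one metric per selected column."""
    # Fall back to every known column when no subset is requested.
    selected = all_columns if columns is None else columns
    # Assumption: the ID column is dropped regardless of how it was selected.
    if skip_id_column and id_column is not None:
        selected = [name for name in selected if name != id_column]
    return [ColumnSummary(name) for name in selected]


def preset_metric_names(
    all_columns: List[str],
    id_column: Optional[str],
    columns: Optional[List[str]] = None,
) -> List[str]:
    """Return metric labels in the order the description lists them."""
    per_column = expand_column_metrics(all_columns, id_column, columns)
    return (
        ["DatasetSummaryMetric"]
        + [f"ColumnSummaryMetric({m.column_name})" for m in per_column]
        + ["DatasetMissingValuesMetric"]
    )
```

In this model the dataset-level entries appear whether or not `columns` is set, and only the per-column entries change with the selection.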
Usage
Use this preset when you need a quick overview of data quality, including dataset-level summaries, per-column statistics, and missing value analysis. It is suitable for initial data exploration, ongoing data quality monitoring, and pre-training data validation.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/legacy/metric_preset/data_quality.py
Signature
class DataQualityPreset(MetricPreset):
    class Config:
        type_alias = "evidently:metric_preset:DataQualityPreset"

    columns: Optional[List[str]]

    def __init__(self, columns: Optional[List[str]] = None): ...

    def generate_metrics(self, data_definition: DataDefinition,
                         additional_data: Optional[Dict[str, Any]]) -> List[AnyMetric]: ...
Import
from evidently.legacy.metric_preset.data_quality import DataQualityPreset
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| columns | Optional[List[str]] | No | List of columns to include in per-column summary analysis. If None, all columns (except ID columns) are analyzed. |
Outputs
| Name | Type | Description |
|---|---|---|
| metrics | List[AnyMetric] | A list containing DatasetSummaryMetric, a column metric generator for ColumnSummaryMetric, and DatasetMissingValuesMetric |
Usage Examples
from evidently.legacy.metric_preset.data_quality import DataQualityPreset

# Basic usage -- analyze all columns
preset = DataQualityPreset()

# Analyze specific columns only
preset = DataQualityPreset(columns=["age", "income", "category", "description"])
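A preset is normally evaluated by passing it to a report. The sketch below shows one way this might look, assuming the legacy `Report` class is importable from `evidently.legacy.report` and exposes `run` and `as_dict` as in earlier Evidently releases. The wrapper function `run_data_quality_report` is a hypothetical helper, and the Evidently imports are deferred into the function body so the module loads even where Evidently is not installed.

```python
from typing import Any, Dict, List, Optional


def run_data_quality_report(
    reference_data: Any,
    current_data: Any,
    columns: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Evaluate DataQualityPreset in a legacy Report and return the results as a dict."""
    # Deferred imports: Evidently is only required when the function is called.
    from evidently.legacy.metric_preset.data_quality import DataQualityPreset
    from evidently.legacy.report import Report  # assumed import path

    # The preset expands into its dataset-level and per-column metrics at run time.
    report = Report(metrics=[DataQualityPreset(columns=columns)])
    report.run(reference_data=reference_data, current_data=current_data)
    return report.as_dict()
```

With two pandas DataFrames `reference_df` and `current_df` (placeholder names), a call such as `run_data_quality_report(reference_df, current_df, columns=["age", "income"])` would limit the per-column summaries to those two columns.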