Implementation:Evidentlyai Evidently Legacy Data Quality Preset

From Leeroopedia
Knowledge Sources
Domains: Data Quality, Model Monitoring, Data Analysis
Last Updated: 2026-02-14 12:00 GMT

Overview

The DataQualityPreset class is a metric preset that bundles dataset summary, per-column summary, and missing values analysis metrics for data quality monitoring.

Description

DataQualityPreset extends MetricPreset and generates a concise list of data quality metrics via its generate_metrics method. The preset always includes:

  • DatasetSummaryMetric -- provides an overall summary of the dataset including row counts, column counts, and data types
  • ColumnSummaryMetric (one per column) -- provides per-column statistics, including distributions, missing values, and summary statistics. The generate_column_metrics helper creates one metric instance per column, and skip_id_column=True excludes ID columns from per-column analysis.
  • DatasetMissingValuesMetric -- analyzes missing value patterns across the entire dataset

An optional columns parameter restricts the per-column summary to a specific subset of columns. If columns is None, all columns except ID columns are analyzed.
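The assembly described above can be sketched with plain Python stand-ins. This is an illustration of the structure only, not Evidently's code: DataQualityPresetSketch, ColumnMetricGenerator, and its expand method are hypothetical names, and the metric classes here are empty placeholders that merely share names with the real ones. The sketch also assumes the ID-column filter applies whether or not an explicit column list is given.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetSummaryMetric:
    """Stand-in for the dataset-level summary metric."""


@dataclass
class DatasetMissingValuesMetric:
    """Stand-in for the dataset-level missing-values metric."""


@dataclass
class ColumnSummaryMetric:
    """Stand-in for the per-column summary metric."""
    column_name: str


@dataclass
class ColumnMetricGenerator:
    """Hypothetical placeholder that is expanded into one metric per column later."""
    columns: Optional[List[str]]
    skip_id_column: bool

    def expand(self, all_columns: List[str],
               id_column: Optional[str]) -> List[ColumnSummaryMetric]:
        # Use the explicit subset when given, otherwise every known column.
        selected = self.columns if self.columns is not None else all_columns
        # Assumption: the ID column is dropped in both cases when the flag is set.
        if self.skip_id_column and id_column is not None:
            selected = [c for c in selected if c != id_column]
        return [ColumnSummaryMetric(c) for c in selected]


class DataQualityPresetSketch:
    """Hypothetical mirror of the preset's three-part metric list."""

    def __init__(self, columns: Optional[List[str]] = None):
        self.columns = columns

    def generate_metrics(self) -> list:
        return [
            DatasetSummaryMetric(),
            ColumnMetricGenerator(columns=self.columns, skip_id_column=True),
            DatasetMissingValuesMetric(),
        ]


metrics = DataQualityPresetSketch().generate_metrics()
per_column = metrics[1].expand(["id", "age", "income"], id_column="id")
print([m.column_name for m in per_column])
```

Note that the list returned by the preset holds three entries. The per-column entry is a single generator object that is expanded into concrete metrics only once the dataset's columns are known.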

Usage

Use this preset when you need a quick overview of data quality, including dataset-level summaries, per-column statistics, and missing value analysis. It is suitable for initial data exploration, ongoing data quality monitoring, and pre-training data validation.
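To give a feel for the kind of information such an overview contains, the following is a minimal plain-Python sketch that computes a row count, a column count, and missing-value counts for a small in-memory dataset. It is not how Evidently computes its metrics; the summarize function and the sample rows are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def summarize(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return dataset-level counts and per-column missing-value counts."""
    # Collect every column name that appears in at least one row.
    columns = sorted({key for row in rows for key in row})
    # A value counts as missing when it is None or the key is absent.
    missing = {c: sum(1 for row in rows if row.get(c) is None) for c in columns}
    return {
        "n_rows": len(rows),
        "n_columns": len(columns),
        "missing_per_column": missing,
        "total_missing": sum(missing.values()),
    }


rows = [
    {"age": 34, "income": 52000, "category": "a"},
    {"age": None, "income": 48000, "category": "b"},
    {"age": 29, "income": None, "category": None},
]
summary = summarize(rows)
print(summary)
```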

Code Reference

Source Location

Signature

class DataQualityPreset(MetricPreset):
    class Config:
        type_alias = "evidently:metric_preset:DataQualityPreset"

    columns: Optional[List[str]]

    def __init__(self, columns: Optional[List[str]] = None): ...

    def generate_metrics(self, data_definition: DataDefinition,
                         additional_data: Optional[Dict[str, Any]]) -> List[AnyMetric]: ...

Import

from evidently.legacy.metric_preset.data_quality import DataQualityPreset

I/O Contract

Inputs

Name | Type | Required | Description
columns | Optional[List[str]] | No | List of columns to include in per-column summary analysis. If None, all columns (except ID columns) are analyzed.

Outputs

Name | Type | Description
metrics | List[AnyMetric] | A list containing DatasetSummaryMetric, a column metric generator for ColumnSummaryMetric, and DatasetMissingValuesMetric

Usage Examples

from evidently.legacy.metric_preset.data_quality import DataQualityPreset

# Basic usage -- analyze all columns
preset = DataQualityPreset()

# Analyze specific columns only
preset = DataQualityPreset(columns=["age", "income", "category", "description"])

Related Pages
