Implementation:Evidentlyai Evidently Compare

From Leeroopedia
Knowledge Sources
Domains Comparison, Reports, Analytics
Last Updated 2026-02-14 12:00 GMT

Overview

Compares multiple Evidently Report snapshots side-by-side, producing a pandas DataFrame with metrics as rows and runs as columns.

Description

The compare module provides a function to combine multiple Snapshot objects into a single comparison DataFrame. This is useful when you have computed reports at different time periods, for different model versions, or on different datasets, and want to see how metric values differ across runs.

The compare function accepts a variable number of Snapshot objects and an index parameter of type CompareIndex that determines how each run (column) is labeled:

  • "timestamp" (default): Uses the snapshot's timestamp as the column header
  • List of strings: Custom labels for each run
  • Callable: A function that takes a Snapshot and returns a custom index value
  • "metadata.<key>": Extracts a value from the snapshot's metadata dictionary

The _get_index helper function resolves the index value for each run based on the chosen index type.
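The four index modes can be illustrated with a small, self-contained sketch. This is not the library's _get_index implementation: FakeSnapshot, IndexSpec, and resolve_index are hypothetical names introduced here, and the branch order and error handling are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Callable, Dict, List, Union


@dataclass
class FakeSnapshot:
    """Hypothetical stand-in for a Snapshot; not the Evidently class."""
    timestamp: datetime
    metadata: Dict[str, Any] = field(default_factory=dict)


IndexSpec = Union[str, List[str], Callable[[FakeSnapshot], Any]]


def resolve_index(index: IndexSpec, run: FakeSnapshot, i: int) -> Any:
    """Return the column label for the i-th run (illustrative logic only)."""
    if callable(index):
        # Callable: the caller derives the label from the snapshot itself.
        return index(run)
    if isinstance(index, list):
        # List of strings: one label per run, matched by position.
        return index[i]
    if index == "timestamp":
        return run.timestamp
    if index.startswith("metadata."):
        # "metadata.<key>": look the key up in the metadata dictionary.
        key = index[len("metadata."):]
        return run.metadata[key]
    raise ValueError(f"Unsupported index: {index!r}")


run = FakeSnapshot(datetime(2026, 1, 1), {"model_version": "v1.1"})
labels = [
    resolve_index("timestamp", run, 0),
    resolve_index(["a", "b"], run, 1),
    resolve_index(lambda s: s.timestamp.year, run, 0),
    resolve_index("metadata.model_version", run, 0),
]
```

In this sketch a list index is positional, so it needs one label per run passed to the comparison.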

By default, only metrics present in all runs are included. Setting all_metrics=True includes metrics from any run (with None for missing values). Setting use_tests=True outputs test result statuses instead of metric values.
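The difference between the default behavior and all_metrics=True amounts to an intersection versus a union of metric names. The sketch below mimics that with plain pandas; run_values and build_table are hypothetical, the metric names and numbers are made up, and missing entries show up as NaN here rather than None.

```python
import pandas as pd

# Hypothetical metric values per run; not produced by Evidently.
run_values = {
    "run_1": {"accuracy": 0.91, "f1": 0.88},
    "run_2": {"accuracy": 0.93, "drift_share": 0.10},
}


def build_table(values, all_metrics=False):
    """Build a metrics-by-runs table from a dict of {run: {metric: value}}."""
    # A dict of dicts gives one column per run and one row per metric name,
    # covering the union of all metric names; gaps are filled with NaN.
    table = pd.DataFrame(values)
    if all_metrics:
        return table
    # Default: keep only the metrics that every run reports.
    common = set.intersection(*(set(v) for v in values.values()))
    return table.loc[[m for m in table.index if m in common]]


shared_only = build_table(run_values)
everything = build_table(run_values, all_metrics=True)
```

Here shared_only contains just the accuracy row, while everything also has f1 and drift_share rows with a gap in the run that did not report them.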

Usage

Use this function after running multiple Evidently reports to compare results across time periods, model versions, or datasets. The output DataFrame is suitable for visual inspection, export, or further analysis.
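As a rough illustration of the "export or further analysis" step, the sketch below works on a hand-built DataFrame in the shape described above (metrics as rows, runs as columns). The frame stands in for the output of compare; its labels and values are invented for the example.

```python
import io

import pandas as pd

# Hand-built stand-in for a comparison result; values are made up.
df = pd.DataFrame(
    {"v1.0": [0.90, 0.20], "v1.1": [0.92, 0.25]},
    index=["accuracy", "drift_share"],
)

# Absolute change per metric from the first run to the last run.
delta = df.iloc[:, -1] - df.iloc[:, 0]

# Export as CSV text; pass a file path to to_csv to write to disk instead.
buffer = io.StringIO()
df.to_csv(buffer)
csv_text = buffer.getvalue()
```

Because the columns follow the order in which the runs were passed, selecting the first and last column by position is a simple way to compare the oldest and newest run.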

Code Reference

Source Location

Signature

CompareIndex = Union[str, List[str], Callable[[Snapshot], Any]]

def _get_index(index: CompareIndex, run: Snapshot, i: int) -> Any:

def compare(
    *runs: Snapshot,
    index: CompareIndex = "timestamp",
    all_metrics: bool = False,
    use_tests: bool = False,
) -> pd.DataFrame:

Import

from evidently.core.compare import compare

I/O Contract

Inputs

Name | Type | Required | Description
*runs | Snapshot (variadic) | Yes | One or more Snapshot objects to compare
index | CompareIndex | No (default: "timestamp") | How to index columns: "timestamp", list of strings, callable, or "metadata.<key>"
all_metrics | bool | No (default: False) | If True, include metrics from any run; if False, only include metrics common to all runs
use_tests | bool | No (default: False) | If True, output test status values instead of metric values

Outputs

Name | Type | Description
return | pd.DataFrame | Transposed DataFrame with metric names as rows and run indices as columns

Usage Examples

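from evidently.core.compare import compare

# The snapshot variables below (snapshot_1, run_a, snap1, ...) are Snapshot
# objects created beforehand; they are not defined in this listing.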

# Compare two snapshots using default timestamp index
df = compare(snapshot_1, snapshot_2)

# Compare three snapshots with custom labels
df = compare(run_a, run_b, run_c, index=["v1.0", "v1.1", "v2.0"])

# Compare using metadata field as index
df = compare(snap1, snap2, index="metadata.model_version")

# Include all metrics even if not shared across all runs
df = compare(snap1, snap2, all_metrics=True)

# Compare test results instead of metric values
df = compare(snap1, snap2, use_tests=True)
