Implementation:Evidentlyai Evidently Metric Types
| Knowledge Sources | |
|---|---|
| Domains | ML Monitoring, Metrics, Data Quality |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Defines the core metric type system for Evidently's v2 metric framework, including base classes for metric configurations, metric calculations, metric results, and metric tests.
Description
The metric_types module is the type system underlying Evidently's v2 metrics engine. It provides a hierarchy of metric result types, metric configuration classes, and metric calculation base classes, plus a test binding system that validates metric outputs automatically.
Result Types:
- MetricResult -- Abstract base class for all metric results. Contains display metadata, optional visualization widgets, and test results.
- SingleValue -- A metric result holding a single numeric value (float or int). Used for scalar metrics like accuracy, mean, MAE.
- ByLabelValue -- A metric result containing a dictionary mapping labels to individual SingleValue results. Used for per-class metrics like precision per label.
- CountValue -- A metric result combining an absolute count and a share (proportion). Used for counting occurrences such as missing values or duplicates.
- MeanStdValue -- A metric result containing mean and standard deviation. Used for statistical distribution summaries.
- ByLabelCountValue -- A metric result combining per-label counts and per-label shares.
- DataframeValue -- A metric result containing a pandas DataFrame. Used for tabular metric outputs such as correlation matrices.
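The composition of these result types can be pictured with plain dataclasses. This is a simplified stand-in rather than Evidently's own classes: the `Sketch*` names and the `count_missing` helper are hypothetical, and the real types are Pydantic models that also carry display metadata, widgets, and test results.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Union

Label = Union[str, int, bool]


@dataclass
class SketchSingleValue:
    """Hypothetical stand-in for SingleValue: one scalar."""
    value: float


@dataclass
class SketchCountValue:
    """Hypothetical stand-in for CountValue: an absolute count plus a share."""
    count: SketchSingleValue
    share: SketchSingleValue


@dataclass
class SketchByLabelValue:
    """Hypothetical stand-in for ByLabelValue: one scalar result per label."""
    values: Dict[Label, SketchSingleValue]


def count_missing(column: Sequence[Optional[float]]) -> SketchCountValue:
    """Report None entries both as a count and as a share of all rows."""
    missing = sum(1 for item in column if item is None)
    share = missing / len(column) if column else 0.0
    return SketchCountValue(SketchSingleValue(missing), SketchSingleValue(share))


missing_result = count_missing([1.0, None, 3.0, None])
per_label = SketchByLabelValue({"cat": SketchSingleValue(0.9), "dog": SketchSingleValue(0.7)})

print(missing_result.count.value, missing_result.share.value)
print(sorted(per_label.values))
```

Nesting scalar results inside the composite ones means that code written against a single scalar, such as a threshold check, can be reused on any field of a count or by-label result.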
Configuration Classes:
- Metric -- Abstract base class for metric configurations. Each Metric subclass defines parameters and is associated with a MetricCalculation class that performs the computation.
- SingleValueMetric, ByLabelMetric, CountMetric, MeanStdMetric, DataframeMetric, ByLabelCountMetric -- Specialized Metric base classes for each result type, each with appropriate test binding logic.
- ColumnMetric -- A Metric base that includes a column field for column-specific metrics.
Calculation Classes:
- MetricCalculationBase -- Abstract base providing the call() and calculate() interface. Contains result caching, widget rendering, and resolved parameter support.
- MetricCalculation -- Binds a Metric config to a calculation, auto-registering the metric-to-calculation mapping via __init_subclass__.
- SingleValueCalculation, ByLabelCalculation, CountCalculation, MeanStdCalculation, DataframeCalculation, ByLabelCountCalculation -- Typed calculation base classes with convenience result() methods.
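The auto-registration and caching described for the calculation classes can be sketched with the standard library alone. Everything here is hypothetical: the `Sketch*` classes, the `registry` attribute, and the `metric_type` class keyword are illustrative choices. The source states only that the mapping is registered through `__init_subclass__`, not how the metric type is discovered.

```python
from typing import Any, ClassVar, Dict, List, Optional, Type


class SketchMetric:
    """Hypothetical metric config: holds parameters, performs no computation."""

    def to_calculation(self) -> "SketchCalculation":
        # Look up the calculation class registered for this config type.
        calculation_type = SketchCalculation.registry[type(self)]
        return calculation_type(self)


class SketchCalculation:
    """Hypothetical calculation base with auto-registration and result caching."""

    registry: ClassVar[Dict[Type[SketchMetric], Type["SketchCalculation"]]] = {}

    def __init_subclass__(cls, metric_type: Optional[Type[SketchMetric]] = None, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        if metric_type is not None:
            # Record the config-to-calculation mapping when the subclass is defined.
            SketchCalculation.registry[metric_type] = cls

    def __init__(self, metric: SketchMetric) -> None:
        self.metric = metric
        self.calls = 0
        self._cached: Optional[float] = None

    def call(self, data: List[float]) -> float:
        # Compute once, then serve the cached result on later calls.
        if self._cached is None:
            self.calls += 1
            self._cached = self.calculate(data)
        return self._cached

    def calculate(self, data: List[float]) -> float:
        raise NotImplementedError


class MeanMetric(SketchMetric):
    pass


class MeanCalculation(SketchCalculation, metric_type=MeanMetric):
    def calculate(self, data: List[float]) -> float:
        return sum(data) / len(data)


calculation = MeanMetric().to_calculation()
first = calculation.call([1.0, 2.0, 3.0])
second = calculation.call([1.0, 2.0, 3.0])
print(type(calculation).__name__, first, second, calculation.calls)
```

Registering at class-definition time keeps the config class free of any direct reference to its calculation, so defining the pair is enough to make `to_calculation()` work.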
Test Binding System:
- MetricTest -- Base class for defining test conditions on metric results. Tests can be bound to specific metric fingerprints.
- BoundTest -- A test associated with a specific metric instance and value location.
- SingleValueBoundTest, ByLabelBoundTest, CountBoundTest, MeanStdBoundTest, DataframeBoundTest, ByLabelCountBoundTest -- Specialized bound tests for each result type.
- MetricTestResult -- The result of running a test, including status (SUCCESS/FAIL/WARNING/ERROR), description, and configuration.
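The idea of binding a test to a metric fingerprint can be illustrated with a small standalone model. The `Sketch*` classes are hypothetical and much simpler than the real `MetricTest` and `BoundTest`. In particular, reporting a failed non-critical test as a warning is an assumption about how `is_critical` is used, not something the source states.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict


class SketchStatus(Enum):
    """Hypothetical subset of test statuses."""
    SUCCESS = "SUCCESS"
    FAIL = "FAIL"
    WARNING = "WARNING"


@dataclass
class SketchTest:
    """Hypothetical test condition on a scalar, e.g. 'value >= 0.8'."""
    description: str
    condition: Callable[[float], bool]
    is_critical: bool = True

    def run(self, value: float) -> SketchStatus:
        if self.condition(value):
            return SketchStatus.SUCCESS
        # Assumption: a failed non-critical test is downgraded to a warning.
        return SketchStatus.FAIL if self.is_critical else SketchStatus.WARNING


@dataclass
class SketchBoundTest:
    """Hypothetical bound test: a condition tied to one metric by fingerprint."""
    test: SketchTest
    metric_fingerprint: str

    def run_test(self, results: Dict[str, float]) -> SketchStatus:
        # Pick the value of the bound metric and evaluate the condition on it.
        return self.test.run(results[self.metric_fingerprint])


results = {"accuracy-fp": 0.75, "missing-share-fp": 0.02}
strict = SketchBoundTest(SketchTest("accuracy >= 0.8", lambda v: v >= 0.8), "accuracy-fp")
lenient = SketchBoundTest(
    SketchTest("accuracy >= 0.8", lambda v: v >= 0.8, is_critical=False), "accuracy-fp"
)
missing = SketchBoundTest(SketchTest("missing share < 0.05", lambda v: v < 0.05), "missing-share-fp")

print(strict.run_test(results), lenient.run_test(results), missing.run_test(results))
```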
Location and Rendering:
- MetricConfig -- Frozen model storing metric_id and parameters.
- MetricValueLocation -- Navigates result hierarchies to extract specific values (e.g., a specific label from a ByLabelValue).
- DatasetType -- Enum distinguishing Current vs Reference datasets.
- render_results(), render_widgets() -- Utility functions for HTML rendering of metric results.
- get_default_render(), get_default_render_ref() -- Generate default widget representations for each result type.
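Navigating a result hierarchy to one concrete value, as MetricValueLocation is described as doing, might look like the following standalone sketch. `SketchLocation` and its fields are hypothetical, and results are modelled as plain floats and dictionaries rather than MetricResult objects.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union

# A result is either a scalar or a per-label mapping of scalars.
SketchResult = Union[float, Dict[str, float]]


@dataclass(frozen=True)
class SketchLocation:
    """Hypothetical pointer to one scalar inside a collection of metric results."""
    metric_fingerprint: str
    label: Optional[str] = None  # only needed for per-label results

    def value(self, results: Dict[str, SketchResult]) -> float:
        result = results[self.metric_fingerprint]
        if isinstance(result, dict):
            if self.label is None:
                raise ValueError("a label is required to address a per-label result")
            return result[self.label]
        return result


all_results: Dict[str, SketchResult] = {
    "accuracy-fp": 0.91,
    "precision-by-label-fp": {"cat": 0.88, "dog": 0.95},
}

scalar_location = SketchLocation("accuracy-fp")
label_location = SketchLocation("precision-by-label-fp", label="dog")

print(scalar_location.value(all_results), label_location.value(all_results))
```

Keeping the location as a small frozen value object makes it hashable, so it can serve as a dictionary key when tests or dashboards refer back to a specific value.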
Usage
Use this module when:
- Defining new custom metrics by subclassing the appropriate Metric and MetricCalculation pair.
- Consuming metric results from a Report or monitoring pipeline.
- Binding tests to metric values for automated quality checks.
- Building custom rendering for metric outputs.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/core/metric_types.py
Signature
class MetricConfig(FrozenBaseModel):
    metric_id: MetricId
    params: Dict[str, Any]

class MetricResult(AutoAliasMixin, PolymorphicModel):
    display_name: str
    widget: Optional[List[BaseWidgetInfo]]
    tests: List[MetricTestResult]

class SingleValue(MetricResult):
    value: Value  # Union[float, int]

class ByLabelValue(MetricResult):
    values: Dict[Label, SingleValue]

class CountValue(MetricResult):
    count: SingleValue
    share: SingleValue

class MeanStdValue(MetricResult):
    mean: SingleValue
    std: SingleValue

class DataframeValue(MetricResult):
    value: pd.DataFrame

class Metric(AutoAliasMixin, EvidentlyBaseModel, Generic[TCalculation]):
    def to_calculation(self) -> TCalculation: ...
    def get_bound_tests(self, context: "Context") -> Sequence[BoundTest]: ...

class MetricCalculationBase(Generic[TResult]):
    def call(self, context: "Context") -> Tuple[TResult, Optional[TResult]]: ...
    def calculate(self, context, current_data, reference_data) -> TMetricResult: ...

class MetricCalculation(MetricCalculationBase[TResult], Generic[TResult, TMetric], abc.ABC):
    metric: TMetric

class MetricTest(AutoAliasMixin, EvidentlyBaseModel):
    is_critical: bool = True
    def to_test(self) -> MetricTestProto: ...
    def run(self, context, metric, value) -> MetricTestResult: ...

class BoundTest(AutoAliasMixin, EvidentlyBaseModel, Generic[TResult], ABC):
    test: MetricTest
    metric_fingerprint: Fingerprint
    def run_test(self, context, calculation, metric_result): ...
Import
from evidently.core.metric_types import (
    Metric,
    MetricCalculation,
    MetricResult,
    SingleValue,
    SingleValueMetric,
    SingleValueCalculation,
    ByLabelValue,
    ByLabelMetric,
    ByLabelCalculation,
    CountValue,
    CountMetric,
    CountCalculation,
    MeanStdValue,
    MeanStdMetric,
    MeanStdCalculation,
    DataframeValue,
    DataframeMetric,
    DataframeCalculation,
    MetricTest,
    BoundTest,
    MetricTestResult,
    MetricConfig,
    MetricValueLocation,
    DatasetType,
    ColumnMetric,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| context | Context | Yes | The report context containing datasets, configuration, and metric results |
| current_data | Dataset | Yes | The current/production dataset to evaluate |
| reference_data | Optional[Dataset] | No | Optional reference/baseline dataset for comparison |
Outputs
| Name | Type | Description |
|---|---|---|
| result | Tuple[TResult, Optional[TResult]] | Tuple of (current_result, optional_reference_result) where TResult is a MetricResult subclass |
| MetricTestResult | MetricTestResult | Contains id, name, description, status (SUCCESS/FAIL/WARNING/ERROR), and configuration |
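The contract above has `calculate()` returning a TMetricResult while `call()` always returns a (current, reference) tuple. One plausible way to reconcile the two is a normalization step like the sketch below. It assumes that a calculation may return either a single result or a pair. The `normalize_result` helper is hypothetical and is not taken from the Evidently source.

```python
from typing import Optional, Tuple, TypeVar, Union

TResult = TypeVar("TResult")


def normalize_result(
    result: Union[TResult, Tuple[TResult, Optional[TResult]]],
) -> Tuple[TResult, Optional[TResult]]:
    """Turn a bare result or a (current, reference) pair into a pair."""
    if isinstance(result, tuple):
        current, reference = result
        return current, reference
    # A bare result means there is no reference-side value.
    return result, None


print(normalize_result(0.93))
print(normalize_result((0.93, 0.95)))
```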
Usage Examples
Defining a Custom Single Value Metric
from evidently.core.metric_types import (
    SingleValueMetric, SingleValueCalculation, TMetricResult
)
from evidently.core.datasets import Dataset


class MyAccuracy(SingleValueMetric):
    # Name of the prediction column to compare against "target".
    column: str


class MyAccuracyCalculation(SingleValueCalculation[MyAccuracy]):
    def calculate(self, context, current_data: Dataset, reference_data=None) -> TMetricResult:
        df = current_data.as_dataframe()
        # Share of rows where the configured prediction column equals "target".
        acc = (df[self.metric.column] == df["target"]).mean()
        return self.result(float(acc))

    def display_name(self) -> str:
        return f"My Accuracy ({self.metric.column})"
Running a Metric in a Report
from evidently.core.report import Report
from evidently.core.metric_types import SingleValue

# current_dataset and reference_dataset are Dataset objects prepared beforehand.
report = Report([MyAccuracy(column="prediction")])
snapshot = report.run(current_dataset, reference_dataset)
Accessing Metric Results
from evidently.core.metric_types import ByLabelValue, CountValue, SingleValue

# SingleValue result
result = snapshot.get_metric_result("metric_fingerprint_id")
if isinstance(result, SingleValue):
    print(f"Value: {result.value}")

# ByLabelValue result
if isinstance(result, ByLabelValue):
    for label in result.labels():
        sv = result.get_label_result(label)
        print(f"Label {label}: {sv.value}")

# CountValue result
if isinstance(result, CountValue):
    print(f"Count: {result.count.value}, Share: {result.share.value}")
Related Pages
- Environment:Evidentlyai_Evidently_Python_Core_Environment
- Implementation:Evidentlyai_Evidently_Pydantic_Utils -- Provides base classes (EvidentlyBaseModel, PolymorphicModel, FrozenBaseModel, AutoAliasMixin) used by all metric types
- Implementation:Evidentlyai_Evidently_Recsys_Metrics -- Implements recommender system metrics using the metric type system defined here
- Implementation:Evidentlyai_Evidently_Generated_Descriptors -- Uses FeatureDescriptor which integrates with the metric/report framework