Implementation:Evidentlyai Evidently Group By Metric
| Knowledge Sources | |
|---|---|
| Domains | Metrics, Data Segmentation |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Implements the GroupBy metric container and GroupByMetric class that enable computing any metric on subsets of data grouped by the unique values of a specified column.
Description
The Group By Metric module provides a mechanism to apply any Evidently metric to data segments defined by unique values in a grouping column. This is useful for analyzing metric behavior across different categories, segments, or cohorts.
Key classes:
- GroupBy -- A MetricContainer that takes a metric and a column_name. During metric generation (generate_metrics), it retrieves all unique labels from the specified column and creates a GroupByMetric instance for each label. The label_metric convenience method creates a single GroupByMetric for a specific label value.
- GroupByMetric -- A Metric that wraps another metric with a specific column name and label value. It delegates bound test retrieval to the wrapped metric.
- GroupByMetricCalculation -- The calculation class that executes the wrapped metric on a data subset. It uses Dataset.subdataset to filter both current and reference data to rows matching the target label. The display name is augmented with the grouping context (e.g., "Accuracy group by 'region' for label: 'US'"). The inner calculation is lazily created via metric.metric.to_calculation().
- _patched_display_name -- A helper that wraps the original display_name callable to append grouping context information.
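The fan-out and display-name patching described above can be illustrated with a minimal plain-Python sketch. This is not the Evidently implementation: rows are modeled as dictionaries, the metric is an ordinary callable, and the names `unique_labels`, `patched_display_name`, and `group_by` are hypothetical stand-ins for illustration. Only the display-name format follows the example quoted in this section.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def unique_labels(rows: List[Row], column_name: str) -> List[Any]:
    """Return the distinct values of a column in first-seen order."""
    seen: List[Any] = []
    for row in rows:
        value = row[column_name]
        if value not in seen:
            seen.append(value)
    return seen


def patched_display_name(
    original: Callable[[], str], column_name: str, label: Any
) -> Callable[[], str]:
    """Wrap a display-name callable so it also reports the grouping context."""

    def wrapper() -> str:
        return f"{original()} group by '{column_name}' for label: '{label}'"

    return wrapper


def group_by(
    rows: List[Row], column_name: str, metric: Callable[[List[Row]], float]
) -> Dict[Any, float]:
    """Apply the metric once per unique label, on that label's rows only."""
    results: Dict[Any, float] = {}
    for label in unique_labels(rows, column_name):
        subset = [row for row in rows if row[column_name] == label]
        results[label] = metric(subset)
    return results


# Hypothetical sample data: one score per row, segmented by region.
rows = [
    {"region": "US", "score": 1.0},
    {"region": "EU", "score": 3.0},
    {"region": "US", "score": 2.0},
]


def mean_score(subset: List[Row]) -> float:
    return sum(row["score"] for row in subset) / len(subset)


per_label = group_by(rows, "region", mean_score)
name = patched_display_name(lambda: "Accuracy", "region", "US")()
```

In this sketch each label gets its own independent result, which mirrors how one GroupByMetric is created per unique label rather than a single aggregated metric.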
Usage
Use GroupBy when you want to compute a metric (e.g., mean value, drift score, accuracy) separately for each unique value in a categorical column. This is commonly used for fairness analysis, segment-level monitoring, or cohort-level evaluation. Add GroupBy to a report's metrics list to generate per-label metric results automatically.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/metrics/group_by.py
Signature
class GroupByMetric(Metric):
    metric: Metric
    column_name: str
    label: object

    def get_bound_tests(self, context: Context) -> Sequence[BoundTest]:
        ...

class GroupByMetricCalculation(MetricCalculation[TResult, GroupByMetric]):
    def calculate(self, context, current_data, reference_data):
        ...

    def display_name(self) -> str:
        ...

    @property
    def column_name(self) -> str:
        ...

    @property
    def calculation(self) -> MetricCalculation:
        ...

class GroupBy(MetricContainer):
    metric: Metric
    column_name: str

    def __init__(self, metric: Metric, column_name: str, include_tests: bool = True):
        ...

    def generate_metrics(self, context: Context) -> Sequence[MetricOrContainer]:
        ...

    def label_metric(self, label: object) -> Metric:
        ...
Import
from evidently.metrics.group_by import GroupBy, GroupByMetric, GroupByMetricCalculation
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| metric | Metric | Yes | The metric to compute on each data segment |
| column_name | str | Yes | The column whose unique values define the data segments |
| label | object | Yes (for GroupByMetric) | The specific label value to filter the data by |
| include_tests | bool | No | Whether to include bound tests (default: True) |
Outputs
| Name | Type | Description |
|---|---|---|
| generate_metrics return | Sequence[MetricOrContainer] | One GroupByMetric per unique label in the grouping column |
| calculate return | Tuple[TResult, Optional[TResult]] | The metric result computed on the filtered data subset, for both current and optional reference data |
| label_metric return | Metric | A GroupByMetric for a specific label value |
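The shape of the calculate return value, a result for current data paired with an optional result for reference data, can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: datasets are modeled as lists of dictionaries, and `subset_for_label` and `calculate_for_label` are hypothetical helpers standing in for Dataset.subdataset and the wrapped calculation.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

Row = Dict[str, Any]


def subset_for_label(rows: List[Row], column_name: str, label: Any) -> List[Row]:
    """Keep only the rows whose grouping column equals the target label."""
    return [row for row in rows if row[column_name] == label]


def calculate_for_label(
    metric: Callable[[List[Row]], float],
    column_name: str,
    label: Any,
    current: List[Row],
    reference: Optional[List[Row]] = None,
) -> Tuple[float, Optional[float]]:
    """Run the metric on the filtered current data and, if given, reference data."""
    current_result = metric(subset_for_label(current, column_name, label))
    if reference is None:
        # No reference dataset: the second tuple element stays None.
        return current_result, None
    reference_result = metric(subset_for_label(reference, column_name, label))
    return current_result, reference_result


def row_count(subset: List[Row]) -> float:
    return float(len(subset))


# Hypothetical sample data for illustration.
current = [{"region": "US"}, {"region": "EU"}, {"region": "US"}]
reference = [{"region": "US"}, {"region": "EU"}]

without_reference = calculate_for_label(row_count, "region", "US", current)
with_reference = calculate_for_label(row_count, "region", "US", current, reference)
```

Both datasets are filtered by the same label before the metric runs, so the two tuple elements always describe the same segment.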
Usage Examples
from evidently.metrics.group_by import GroupBy
from evidently.metrics import MeanValue
from evidently.core.report import Report
# Compute mean value of "score" column grouped by "category"
report = Report(metrics=[
    GroupBy(metric=MeanValue(column="score"), column_name="category"),
])
# Access metric for a specific label
group_by = GroupBy(metric=MeanValue(column="score"), column_name="region")
us_metric = group_by.label_metric("US")