Implementation:Evidentlyai Evidently Legacy Classification Calculations
| Knowledge Sources | |
|---|---|
| Domains | ML Monitoring, Classification |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Provides core calculation functions for evaluating classification model performance, including confusion matrix decomposition, prediction extraction with threshold handling, precision/recall/lift table generation, and comprehensive quality metric computation (accuracy, precision, recall, F1, ROC AUC, log loss).
Description
This module contains stateless functions used by Evidently's legacy classification metrics to compute performance statistics from pandas DataFrames. It handles both binary and multiclass classification scenarios, with special attention to probability-based predictions and threshold tuning.
Key functions:
- calculate_confusion_by_classes -- Decomposes a confusion matrix into per-class TP, TN, FP, and FN counts.
- get_prediction_data -- The most complex function in the module. It interprets the prediction column(s) from a DataFrame according to various classification scenarios:
  - Multiclass: columns are class probabilities; argmax yields the predicted label.
  - Binary with two probability columns: applies a threshold to the positive label column.
  - Binary with a single probability column and string/integer targets: infers positive and negative labels, constructs a full probability DataFrame, and applies the threshold.
  - Non-probabilistic: returns raw prediction values.
- k_probability_threshold -- Computes the probability cutoff at the k-th ranked observation or at a fractional percentile.
- threshold_probability_labels -- Converts probability values to class labels using a threshold.
- calculate_pr_table -- Builds a precision-recall table at 5% step intervals.
- calculate_lift_table -- Builds a lift table at 1% step intervals, including lift, max lift, relative lift, and F1.
- calculate_matrix -- Wraps scikit-learn's confusion_matrix and returns a typed ConfusionMatrix result.
- collect_plot_data -- Extracts box-plot statistics (min, 25%, 50%, 75%, max) from prediction probability columns.
- calculate_metrics -- The main entry point that assembles a DatasetClassificationQuality object containing accuracy, precision, recall, F1, and optionally ROC AUC, log loss, TPR/TNR/FPR/FNR, rate plot data, and box plot data.
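The per-class decomposition described for calculate_confusion_by_classes can be illustrated with a small stand-alone sketch. This is not the library's code: the helper name `confusion_by_classes_sketch` and the result keys `tp`/`fp`/`tn`/`fn` are hypothetical, a nested list stands in for the np.ndarray, and the sketch assumes the scikit-learn convention that rows are true labels and columns are predicted labels.

```python
from typing import Dict, Sequence, Union

Label = Union[str, int]


def confusion_by_classes_sketch(
    matrix: Sequence[Sequence[int]], class_names: Sequence[Label]
) -> Dict[Label, Dict[str, int]]:
    """Split a square confusion matrix into per-class TP/FP/TN/FN counts.

    Assumption: rows are true labels, columns are predicted labels.
    """
    total = sum(sum(row) for row in matrix)
    result: Dict[Label, Dict[str, int]] = {}
    for i, name in enumerate(class_names):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp  # truly class i, predicted as another class
        fp = sum(row[i] for row in matrix) - tp  # predicted i, truly another class
        tn = total - tp - fn - fp  # everything that does not involve class i
        result[name] = {"tp": tp, "fp": fp, "tn": tn, "fn": fn}
    return result


example = [
    [5, 1, 0],
    [2, 6, 1],
    [0, 2, 7],
]
by_class = confusion_by_classes_sketch(example, ["a", "b", "c"])
```

For each class the four counts sum to the total number of rows, which is a convenient invariant to check when working with multiclass matrices.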
Usage
These functions are called internally by Evidently classification metrics (e.g., ClassificationQualityMetric, ClassificationConfusionMatrix). They can also be used standalone for computing classification statistics from pandas DataFrames with scikit-learn-compatible target/prediction columns.
Code Reference
Source Location
- Repository: Evidentlyai_Evidently
- File: src/evidently/legacy/calculations/classification_performance.py
Signature
def calculate_confusion_by_classes(
    confusion_matrix: np.ndarray, class_names: Sequence[Union[str, int, None]]
) -> Dict[Label, Dict[str, int]]: ...

def get_prediction_data(
    data: pd.DataFrame, data_columns: DatasetColumns, pos_label: Optional[Union[str, int]], threshold: float = 0.5
) -> PredictionData: ...

def k_probability_threshold(
    prediction_probas: pd.DataFrame, k: Optional[int] = None, prob_threshold: Optional[float] = None
) -> float: ...

def threshold_probability_labels(
    prediction_probas: pd.DataFrame, pos_label: Union[str, int], neg_label: Union[str, int], threshold: float
) -> pd.Series: ...

def calculate_pr_table(binded) -> list: ...

def calculate_lift_table(binded) -> list: ...

def calculate_matrix(
    target: pd.Series, prediction: pd.Series, labels: List[Label]
) -> ConfusionMatrix: ...

def collect_plot_data(prediction_probas: pd.DataFrame) -> Boxes: ...

def calculate_metrics(
    column_mapping: ColumnMapping,
    confusion_matrix: ConfusionMatrix,
    target: pd.Series,
    prediction: PredictionData,
) -> DatasetClassificationQuality: ...
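To make the threshold-related signatures more concrete, here is a minimal plain-Python sketch of the ideas behind the single-probability-column scenario, threshold_probability_labels, and the k-th ranked cutoff. All three helper names are hypothetical, lists and dicts stand in for pandas objects, and two details are assumptions rather than confirmed library behavior: the comparison against the threshold is inclusive (`>=`), and `k` is treated as 1-based.

```python
from typing import Dict, List, Union

Label = Union[str, int]


def expand_single_proba(
    pos_probas: List[float], pos_label: Label, neg_label: Label
) -> Dict[Label, List[float]]:
    """Build both probability columns from the positive-class column alone."""
    return {pos_label: list(pos_probas), neg_label: [1.0 - p for p in pos_probas]}


def threshold_labels_sketch(
    probas: Dict[Label, List[float]], pos_label: Label, neg_label: Label, threshold: float
) -> List[Label]:
    """Assign pos_label where the positive probability reaches the threshold.

    Assumption: the comparison is inclusive (>=).
    """
    return [pos_label if p >= threshold else neg_label for p in probas[pos_label]]


def kth_ranked_cutoff(pos_probas: List[float], k: int) -> float:
    """Return the k-th largest probability (k is 1-based in this sketch)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(pos_probas, reverse=True)
    return ranked[min(k, len(ranked)) - 1]


scores = [0.8, 0.3, 0.6, 0.2]
probas = expand_single_proba(scores, pos_label="spam", neg_label="ham")
labels_default = threshold_labels_sketch(probas, "spam", "ham", threshold=0.5)

# Using the third-ranked score as the cutoff labels the top three rows positive.
cutoff_top3 = kth_ranked_cutoff(scores, k=3)
labels_top3 = threshold_labels_sketch(probas, "spam", "ham", threshold=cutoff_top3)
```

Deriving the cutoff from a rank rather than fixing it at 0.5 is useful when a fixed number of top-scored items will be acted on, regardless of how well calibrated the scores are.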
Import
from evidently.legacy.calculations.classification_performance import (
    calculate_confusion_by_classes,
    get_prediction_data,
    k_probability_threshold,
    threshold_probability_labels,
    calculate_pr_table,
    calculate_lift_table,
    calculate_matrix,
    collect_plot_data,
    calculate_metrics,
)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data | pd.DataFrame | Yes | Source dataset containing target and prediction columns. |
| data_columns | DatasetColumns | Yes | Metadata describing which columns are target, prediction, features, etc. |
| pos_label | Optional[Union[str, int]] | Yes (may be None) | Positive class label for binary classification. The parameter has no default, so it must always be passed; binary tasks need an actual label rather than None. |
| threshold | float | No | Probability threshold for converting probabilities to labels (default 0.5). |
| column_mapping | ColumnMapping | Yes (for calculate_metrics) | Column mapping providing pos_label and other settings. |
| confusion_matrix | ConfusionMatrix | Yes (for calculate_metrics) | Pre-computed confusion matrix. |
| target | pd.Series | Yes (for calculate_metrics) | Ground-truth labels. |
| prediction | PredictionData | Yes (for calculate_metrics) | Predicted labels and optional probabilities. |
| binded | list | Yes (for PR/lift tables) | List of (label, probability) tuples for ranked evaluation. |
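The `binded` input and the 5% step evaluation can be pictured with the following sketch. It is an illustration under stated assumptions, not the library's implementation: the function name `pr_table_sketch` and the dictionary keys are hypothetical, and each tuple is assumed to hold a 0/1 indicator of the positive class followed by its predicted probability.

```python
from typing import Dict, List, Tuple


def pr_table_sketch(
    binded: List[Tuple[int, float]], step: float = 0.05
) -> List[Dict[str, float]]:
    """Precision and recall over the top-ranked share of rows, in `step` increments.

    Assumption: each item is (is_positive, probability) with is_positive in {0, 1}.
    """
    ranked = sorted(binded, key=lambda item: item[1], reverse=True)
    n = len(ranked)
    positives = sum(label for label, _ in ranked)
    rows_per_step = max(round(n * step), 1)  # at least one row per step
    table = []
    for upper in range(rows_per_step, n + rows_per_step, rows_per_step):
        count = min(upper, n)  # the last step is clipped to the dataset size
        tp = sum(label for label, _ in ranked[:count])
        table.append(
            {
                "top_percent": round(100.0 * count / n, 1),
                "count": count,
                "tp": tp,
                "fp": count - tp,
                "precision": round(100.0 * tp / count, 1),
                "recall": round(100.0 * tp / positives, 1) if positives else 0.0,
            }
        )
    return table


binded = [(1, 0.9), (0, 0.8), (1, 0.7), (0, 0.4), (1, 0.35), (0, 0.1)]
table = pr_table_sketch(binded)
```

With only six rows, 5% of the data rounds down to zero, so the sketch falls back to one row per step; on larger datasets each step covers roughly 5% of the ranked rows.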
Outputs
| Name | Type | Description |
|---|---|---|
| PredictionData | PredictionData | Structured predictions with labels and optional probability DataFrame. |
| ConfusionMatrix | ConfusionMatrix | Labeled confusion matrix with sorted labels. |
| DatasetClassificationQuality | DatasetClassificationQuality | Classification quality metrics: accuracy, precision, recall, and F1, plus optionally ROC AUC, log loss, TPR/TNR/FPR/FNR, and plot data. |
| Boxes | Boxes | Box-plot statistics (min, Q1, median, Q3, max) for probability distributions. |
| confusion_by_classes | Dict[Label, Dict[str, int]] | Per-class TP/TN/FP/FN counts. |
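The five box-plot statistics listed for Boxes can be computed per probability column as in the sketch below. The helper name `five_number_summary` and the returned keys are hypothetical, a dict of lists stands in for the probability DataFrame, and the quantile interpolation (the standard library's "inclusive" method) is an assumption that may differ from what the library uses.

```python
import statistics
from typing import Dict, List


def five_number_summary(values: List[float]) -> Dict[str, float]:
    """Min, quartiles, and max for one probability column."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


probas = {
    "spam": [0.1, 0.2, 0.3, 0.4, 0.5],
    "ham": [0.9, 0.8, 0.7, 0.6, 0.5],
}
boxes = {label: five_number_summary(column) for label, column in probas.items()}
```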
Usage Examples
import pandas as pd
import numpy as np
from evidently.legacy.calculations.classification_performance import (
    get_prediction_data,
    calculate_matrix,
    calculate_metrics,
)
from evidently.legacy.metric_results import DatasetColumns, DatasetUtilityColumns
from evidently.legacy.pipeline.column_mapping import ColumnMapping

# Build dataset columns metadata
columns = DatasetColumns(
    utility_columns=DatasetUtilityColumns(
        date=None, id=None, target="target", prediction="prediction"
    ),
    target_type="cat",
    num_feature_names=[],
    cat_feature_names=[],
    text_feature_names=[],
    datetime_feature_names=[],
    target_names=None,
    task="classification",
)

# Extract prediction data
data = pd.DataFrame({"target": [1, 0, 1, 0], "prediction": [0.8, 0.3, 0.6, 0.2]})
pred_data = get_prediction_data(data, columns, pos_label=1, threshold=0.5)

# Compute confusion matrix
conf_matrix = calculate_matrix(data["target"], pred_data.predictions, labels=pred_data.labels)

# Compute full classification quality
mapping = ColumnMapping(pos_label=1)
quality = calculate_metrics(mapping, conf_matrix, data["target"], pred_data)
print(quality.accuracy, quality.f1)
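As a sanity check on the example data, the same four rows can be scored by hand without Evidently. The sketch below uses the standard binary definitions of accuracy, precision, recall, and F1; the helper `binary_scores` is hypothetical and the inclusive `>=` threshold comparison is an assumption. Under those definitions every row is classified correctly at a threshold of 0.5, so accuracy and F1 would both be expected to equal 1.0.

```python
from typing import Dict, List


def binary_scores(target: List[int], predicted: List[int], pos_label: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(target, predicted))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


target = [1, 0, 1, 0]
scores = [0.8, 0.3, 0.6, 0.2]

# Threshold 0.5, as in the example: every row lands on the correct side.
at_half = binary_scores(target, [1 if s >= 0.5 else 0 for s in scores])

# A stricter threshold drops the 0.6 row to the negative class and lowers recall.
at_strict = binary_scores(target, [1 if s >= 0.7 else 0 for s in scores])
```

Raising the threshold to 0.7 keeps precision at 1.0 but halves recall, which shows why the threshold argument matters for the reported quality metrics.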
Related Pages
- Environment:Evidentlyai_Evidently_Python_Core_Environment
- Evidentlyai_Evidently_Legacy_Base_Metric -- Base classes that classification metrics extend
- Evidentlyai_Evidently_Legacy_Metric_Results -- Data models (PredictionData, ConfusionMatrix, DatasetClassificationQuality) used by these calculations
- Evidentlyai_Evidently_Legacy_HTML_Widgets -- Rendering functions for classification visualizations