Implementation:Evidentlyai Evidently Legacy Classification Calculations

From Leeroopedia
Knowledge Sources
Domains ML Monitoring, Classification
Last Updated 2026-02-14 12:00 GMT

Overview

Provides core calculation functions for evaluating classification model performance, including confusion matrix decomposition, prediction extraction with threshold handling, precision/recall/lift table generation, and comprehensive quality metric computation (accuracy, precision, recall, F1, ROC AUC, log loss).

Description

This module contains stateless functions used by Evidently's legacy classification metrics to compute performance statistics from pandas DataFrames. It handles both binary and multiclass classification scenarios, with special attention to probability-based predictions and threshold tuning.

Key functions:

  • calculate_confusion_by_classes -- Decomposes a confusion matrix into per-class TP, TN, FP, and FN counts.
  • get_prediction_data -- The most complex function in the module. It interprets the prediction column(s) from a DataFrame according to various classification scenarios:
    • Multiclass: columns are class probabilities; argmax yields predicted label.
    • Binary with two probability columns: applies a threshold to the positive label column.
    • Binary with a single probability column and string/integer targets: infers positive and negative labels, constructs a full probability DataFrame, and applies the threshold.
    • Non-probabilistic: returns raw prediction values.
  • k_probability_threshold -- Computes the probability cutoff at the k-th ranked observation or at a fractional percentile.
  • threshold_probability_labels -- Converts probability values to class labels using a threshold.
  • calculate_pr_table -- Builds a precision-recall table at 5% step intervals.
  • calculate_lift_table -- Builds a lift table at 1% step intervals, including lift, max lift, relative lift, and F1.
  • calculate_matrix -- Wraps scikit-learn's confusion_matrix and returns a typed ConfusionMatrix result.
  • collect_plot_data -- Extracts box-plot statistics (min, 25%, 50%, 75%, max) from prediction probability columns.
  • calculate_metrics -- The main entry point that assembles a DatasetClassificationQuality object containing accuracy, precision, recall, F1, and optionally ROC AUC, log loss, TPR/TNR/FPR/FNR, rate plot data, and box plot data.
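
The per-class decomposition described for calculate_confusion_by_classes can be illustrated without Evidently. The following is a minimal pure-Python sketch, not the library's implementation: the helper name confusion_by_classes_sketch and the lowercase dictionary keys are assumptions made for illustration, and the matrix is assumed to follow the scikit-learn convention (rows are true classes, columns are predicted classes).

```python
from typing import Dict, Sequence, Union

Label = Union[str, int]


def confusion_by_classes_sketch(
    matrix: Sequence[Sequence[int]], class_names: Sequence[Label]
) -> Dict[Label, Dict[str, int]]:
    """Split a square confusion matrix into one-vs-rest counts per class.

    Hypothetical helper: assumes matrix[i][j] counts observations whose
    true class is i and whose predicted class is j.
    """
    total = sum(sum(row) for row in matrix)
    result: Dict[Label, Dict[str, int]] = {}
    for i, name in enumerate(class_names):
        tp = matrix[i][i]                         # diagonal cell
        fn = sum(matrix[i]) - tp                  # rest of the true-class row
        fp = sum(row[i] for row in matrix) - tp   # rest of the predicted-class column
        tn = total - tp - fp - fn                 # everything else
        result[name] = {"tp": tp, "fp": fp, "tn": tn, "fn": fn}
    return result


matrix = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 0, 4],
]
counts = confusion_by_classes_sketch(matrix, ["a", "b", "c"])
print(counts["a"])
```

For every class the four counts add up to the total number of observations, which makes a convenient sanity check on the decomposition.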

Usage

These functions are called internally by Evidently classification metrics (e.g., ClassificationQualityMetric, ClassificationConfusionMatrix). They can also be used standalone for computing classification statistics from pandas DataFrames with scikit-learn-compatible target/prediction columns.

Code Reference

Source Location

Signature

def calculate_confusion_by_classes(
    confusion_matrix: np.ndarray, class_names: Sequence[Union[str, int, None]]
) -> Dict[Label, Dict[str, int]]: ...

def get_prediction_data(
    data: pd.DataFrame, data_columns: DatasetColumns, pos_label: Optional[Union[str, int]], threshold: float = 0.5
) -> PredictionData: ...

def k_probability_threshold(
    prediction_probas: pd.DataFrame, k: Optional[int] = None, prob_threshold: Optional[float] = None
) -> float: ...

def threshold_probability_labels(
    prediction_probas: pd.DataFrame, pos_label: Union[str, int], neg_label: Union[str, int], threshold: float
) -> pd.Series: ...

def calculate_pr_table(binded) -> list: ...

def calculate_lift_table(binded) -> list: ...

def calculate_matrix(
    target: pd.Series, prediction: pd.Series, labels: List[Label]
) -> ConfusionMatrix: ...

def collect_plot_data(prediction_probas: pd.DataFrame) -> Boxes: ...

def calculate_metrics(
    column_mapping: ColumnMapping,
    confusion_matrix: ConfusionMatrix,
    target: pd.Series,
    prediction: PredictionData,
) -> DatasetClassificationQuality: ...
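
To make the thresholding functions more concrete, here is a small standard-library sketch of the two ideas behind threshold_probability_labels and k_probability_threshold: turning positive-class probabilities into labels, and deriving a cutoff from a rank or a fraction of the data. The helper names and the choice of `>=` at the boundary are assumptions for illustration. The real functions operate on pandas DataFrames and may handle edge cases differently.

```python
import math
from typing import List, Sequence, Union

Label = Union[str, int]


def threshold_labels_sketch(
    pos_probas: Sequence[float], pos_label: Label, neg_label: Label, threshold: float
) -> List[Label]:
    # Assumption: a probability exactly equal to the threshold counts as positive.
    return [pos_label if p >= threshold else neg_label for p in pos_probas]


def top_k_cutoff_sketch(pos_probas: Sequence[float], k: int) -> float:
    """Probability of the k-th highest-scored observation (k is 1-based here)."""
    if not 1 <= k <= len(pos_probas):
        raise ValueError("k must be between 1 and the number of observations")
    return sorted(pos_probas, reverse=True)[k - 1]


def top_fraction_cutoff_sketch(pos_probas: Sequence[float], fraction: float) -> float:
    """Cutoff that keeps roughly the top `fraction` of observations."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(math.ceil(fraction * len(pos_probas)), 1)
    return top_k_cutoff_sketch(pos_probas, k)


probas = [0.8, 0.3, 0.6, 0.2]
labels = threshold_labels_sketch(probas, pos_label=1, neg_label=0, threshold=0.5)
cutoff = top_k_cutoff_sketch(probas, k=2)
frac_cutoff = top_fraction_cutoff_sketch(probas, fraction=0.25)
top2_labels = threshold_labels_sketch(probas, "yes", "no", cutoff)
```

Using the rank-based cutoff as the threshold marks exactly the top-k observations as positive, provided there are no ties at the cutoff value.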

Import

from evidently.legacy.calculations.classification_performance import (
    calculate_confusion_by_classes,
    get_prediction_data,
    k_probability_threshold,
    threshold_probability_labels,
    calculate_pr_table,
    calculate_lift_table,
    calculate_matrix,
    collect_plot_data,
    calculate_metrics,
)

I/O Contract

Inputs

Name | Type | Required | Description
data | pd.DataFrame | Yes | Source dataset containing target and prediction columns.
data_columns | DatasetColumns | Yes | Metadata describing which columns are target, prediction, features, etc.
pos_label | Optional[Union[str, int]] | No | Positive class label for binary classification. May be None, but binary tasks require a value.
threshold | float | No | Probability threshold for converting probabilities to labels (default 0.5).
column_mapping | ColumnMapping | Yes (for calculate_metrics) | Column mapping providing pos_label and other settings.
confusion_matrix | ConfusionMatrix | Yes (for calculate_metrics) | Pre-computed confusion matrix.
target | pd.Series | Yes (for calculate_metrics) | Ground-truth labels.
prediction | PredictionData | Yes (for calculate_metrics) | Predicted labels and optional probabilities.
binded | list | Yes (for PR/lift tables) | List of (label, probability) tuples for ranked evaluation.
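
The ranked evaluation over the binded list can be sketched as follows. This is a simplified illustration, not Evidently's calculate_pr_table: the function name, the dictionary row layout, and the assumption that labels are 1 for the positive class and 0 otherwise are all hypothetical. It sorts the pairs by probability, then reports precision and recall for the top share of observations at each step.

```python
from typing import Dict, List, Tuple


def pr_table_sketch(
    binded: List[Tuple[int, float]], step: float = 0.05
) -> List[Dict[str, float]]:
    """Precision and recall for the top 5%, 10%, ... of ranked observations."""
    ranked = sorted(binded, key=lambda item: item[1], reverse=True)
    n = len(ranked)
    if n == 0:
        return []
    positives = sum(label for label, _ in ranked)
    n_steps = round(1 / step)
    rows = []
    for i in range(1, n_steps + 1):
        count = -(-n * i // n_steps)  # integer ceiling of n * i / n_steps
        tp = sum(label for label, _ in ranked[:count])
        rows.append(
            {
                "top_share": i / n_steps,
                "count": count,
                "cutoff_proba": ranked[count - 1][1],
                "tp": tp,
                "fp": count - tp,
                "precision": tp / count,
                "recall": tp / positives if positives else 0.0,
            }
        )
    return rows


binded = [(1, 0.8), (0, 0.3), (1, 0.6), (0, 0.2)]
rows = pr_table_sketch(binded, step=0.25)
for row in rows:
    print(row)
```

A lift table follows the same ranking idea with a finer step. A common definition of lift, assumed here rather than taken from the source, is precision within the top share divided by the overall positive rate.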

Outputs

Name | Type | Description
PredictionData | PredictionData | Structured predictions with labels and optional probability DataFrame.
ConfusionMatrix | ConfusionMatrix | Labeled confusion matrix with sorted labels.
DatasetClassificationQuality | DatasetClassificationQuality | Full classification quality metrics including accuracy, precision, recall, F1, ROC AUC, log loss, TPR/TNR/FPR/FNR, and plot data.
Boxes | Boxes | Box-plot statistics (min, Q1, median, Q3, max) for probability distributions.
confusion_by_classes | Dict[Label, Dict[str, int]] | Per-class TP/TN/FP/FN counts.
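
As an illustration of the five box-plot statistics listed for Boxes, the sketch below computes them for one column of probabilities using only the standard library. The function name and the returned dictionary are assumptions, not the structure of the Boxes type. The "inclusive" quantile method is chosen because it uses linear interpolation between data points, and the sketch assumes at least two values per column.

```python
import statistics
from typing import Dict, Sequence


def box_stats_sketch(values: Sequence[float]) -> Dict[str, float]:
    """Min, quartiles, and max for one column of predicted probabilities."""
    # method="inclusive" treats the data as the full population and
    # interpolates linearly between neighbouring sorted values.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


stats = box_stats_sketch([0.8, 0.3, 0.6, 0.2])
print(stats)
```

Applying the helper once per class-probability column would give one box per label.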

Usage Examples

import pandas as pd
from evidently.legacy.calculations.classification_performance import (
    get_prediction_data,
    calculate_matrix,
    calculate_metrics,
)
from evidently.legacy.metric_results import DatasetColumns, DatasetUtilityColumns
from evidently.legacy.pipeline.column_mapping import ColumnMapping

# Build dataset columns metadata
columns = DatasetColumns(
    utility_columns=DatasetUtilityColumns(
        date=None, id=None, target="target", prediction="prediction"
    ),
    target_type="cat",
    num_feature_names=[],
    cat_feature_names=[],
    text_feature_names=[],
    datetime_feature_names=[],
    target_names=None,
    task="classification",
)

# Extract prediction data
data = pd.DataFrame({"target": [1, 0, 1, 0], "prediction": [0.8, 0.3, 0.6, 0.2]})
pred_data = get_prediction_data(data, columns, pos_label=1, threshold=0.5)

# Compute confusion matrix
conf_matrix = calculate_matrix(data["target"], pred_data.predictions, labels=pred_data.labels)

# Compute full classification quality
mapping = ColumnMapping(pos_label=1)
quality = calculate_metrics(mapping, conf_matrix, data["target"], pred_data)
print(quality.accuracy, quality.f1)
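
To sanity-check numbers like these by hand, the headline metrics can be recomputed from the same toy data without any library. The sketch below is an independent pure-Python check, not a reproduction of calculate_metrics: the helper name is hypothetical, and returning 0.0 when a denominator is zero is an assumption. It also shows how raising the threshold changes the result.

```python
from typing import Dict, Sequence, Union

Label = Union[str, int]


def binary_quality_sketch(
    target: Sequence[Label], predicted: Sequence[Label], pos_label: Label
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for the positive class."""
    pairs = list(zip(target, predicted))
    tp = sum(1 for t, p in pairs if t == pos_label and p == pos_label)
    fp = sum(1 for t, p in pairs if t != pos_label and p == pos_label)
    fn = sum(1 for t, p in pairs if t == pos_label and p != pos_label)
    tn = len(pairs) - tp - fp - fn
    # Assumption: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


target = [1, 0, 1, 0]
probas = [0.8, 0.3, 0.6, 0.2]
results = {
    threshold: binary_quality_sketch(
        target, [1 if p >= threshold else 0 for p in probas], pos_label=1
    )
    for threshold in (0.5, 0.7)
}
for threshold, metrics in results.items():
    print(threshold, metrics)
```

At a threshold of 0.5 every observation is classified correctly. At 0.7 the observation scored 0.6 becomes a false negative, so recall drops while precision stays at 1.0.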
