Principle:Evidentlyai Evidently ML Task Configuration
| Knowledge Sources | |
|---|---|
| Domains | ML_Evaluation, Classification, Regression |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
A configuration mechanism that maps ML task-specific columns (targets, predictions, probabilities) for correct model quality evaluation.
Description
ML Task Configuration defines how an ML model's targets and outputs are represented in a dataset by binding target, prediction, and probability columns to a specific ML task type. Evidently supports three core task types:
- Binary Classification: Maps target labels, predicted labels, predicted probabilities, and the positive label value
- Multiclass Classification: Maps target labels, predicted class labels, and per-class probability columns
- Regression: Maps target values and predicted values
These configurations are passed into DataDefinition and determine which quality metrics can be computed (e.g., accuracy, F1, ROC AUC for classification; MAE, RMSE, R2 for regression). Without a task configuration, model quality presets such as ClassificationQuality and RegressionQuality cannot resolve which columns to evaluate.
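The three task types and the way they gate metrics can be sketched with plain dataclasses. This is a minimal illustration only: the class names, field names, and the available_metrics helper are hypothetical and are not Evidently's API, and the rule that label metrics need a label column while ROC AUC needs probabilities is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Union

Label = Union[int, str]


@dataclass(frozen=True)
class BinaryClassificationTask:  # hypothetical name, not an Evidently class
    target: str
    prediction_labels: Optional[str] = None
    prediction_probas: Optional[str] = None
    pos_label: Label = 1


@dataclass(frozen=True)
class MulticlassClassificationTask:  # hypothetical name
    target: str
    prediction_labels: Optional[str] = None
    prediction_probas: Optional[Sequence[str]] = None  # one column per class


@dataclass(frozen=True)
class RegressionTask:  # hypothetical name
    target: str
    prediction: str


Task = Union[BinaryClassificationTask, MulticlassClassificationTask, RegressionTask]


def available_metrics(task: Task) -> List[str]:
    """Return the metric names that the mapped columns can support (assumed rule)."""
    if isinstance(task, RegressionTask):
        return ["MAE", "RMSE", "R2"]
    metrics: List[str] = []
    if task.prediction_labels is not None:
        metrics += ["Accuracy", "F1"]  # label-based metrics
    if task.prediction_probas is not None:
        metrics.append("ROC AUC")  # score-based metric
    return metrics


binary = BinaryClassificationTask(target="label", prediction_labels="pred")
print(available_metrics(binary))
```

In this sketch, a binary task that maps only a label column exposes label-based metrics, and adding a probability column is what makes a score-based metric available.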
Usage
Use this principle when evaluating ML model quality. It is required whenever using classification or regression quality presets and metrics. Apply it after data loading but before creating Evidently Datasets.
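Because the configuration is applied between data loading and dataset creation, that is a natural point to check that every mapped column actually exists. The helper below is a hypothetical pre-flight check written with the standard library only; it is not part of Evidently, and the column names are made up for illustration.

```python
from typing import Dict, Iterable, List


def missing_columns(available: Iterable[str], role_to_column: Dict[str, str]) -> List[str]:
    """List every role whose mapped column is absent from the loaded data."""
    present = set(available)
    return [
        f"{role} -> {column}"
        for role, column in role_to_column.items()
        if column not in present
    ]


# Columns found in the loaded data (illustrative names).
loaded_columns = ["actual_label", "predicted_label", "age"]

# Intended role-to-column mapping for a binary classification task.
mapping = {
    "target": "actual_label",
    "prediction": "predicted_label",
    "probability": "predicted_proba",
}

problems = missing_columns(loaded_columns, mapping)
if problems:
    print("Unmapped roles:", problems)
```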
Theoretical Basis
ML task configuration follows the column-role binding pattern where abstract task roles (target, prediction) are bound to concrete column names:
# Pseudocode: Task role binding
task = define_task(
    type="binary_classification",
    target_column="actual_label",
    prediction_column="predicted_label",
    probability_column="predicted_proba",
    positive_class=1
)

# The evaluation engine uses this binding to compute metrics
accuracy = count(target == prediction) / total
roc_auc = compute_auc(target, probabilities)
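A runnable counterpart to the pseudocode, assuming the data is held as plain Python lists keyed by column name. The accuracy and roc_auc functions here are illustrative standard-library implementations rather than Evidently's own; ROC AUC is computed with the pairwise formulation (the share of positive/negative pairs in which the positive example receives the higher score, with ties counted as half).

```python
from typing import Dict, List, Sequence


def accuracy(target: Sequence[int], prediction: Sequence[int]) -> float:
    """Share of rows where the predicted label equals the target label."""
    if not target or len(target) != len(prediction):
        raise ValueError("target and prediction must be non-empty and equally long")
    return sum(t == p for t, p in zip(target, prediction)) / len(target)


def roc_auc(target: Sequence[int], probas: Sequence[float], pos_label: int = 1) -> float:
    """Pairwise ROC AUC: P(score of a positive > score of a negative)."""
    pos = [p for t, p in zip(target, probas) if t == pos_label]
    neg = [p for t, p in zip(target, probas) if t != pos_label]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute ROC AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy dataset using the column names from the pseudocode.
data: Dict[str, List] = {
    "actual_label": [1, 0, 1, 0],
    "predicted_label": [1, 0, 0, 0],
    "predicted_proba": [0.9, 0.2, 0.4, 0.6],
}

# Role binding: abstract roles on the left, concrete columns on the right.
binding = {
    "target": "actual_label",
    "prediction": "predicted_label",
    "probability": "predicted_proba",
    "positive_class": 1,
}

acc = accuracy(data[binding["target"]], data[binding["prediction"]])
auc = roc_auc(
    data[binding["target"]],
    data[binding["probability"]],
    pos_label=binding["positive_class"],
)
print(acc, auc)  # 0.75 0.75
```

The metric functions never see column names; only the binding does. That separation lets the same metric code run against any dataset once the roles are mapped.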
The key design insight is that the same dataset can define multiple tasks simultaneously (e.g., one binary classification and one regression), each evaluated independently.
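A small sketch of two tasks defined over one dataset and evaluated independently. The dictionary-based task registry, the evaluate function, and the column names are hypothetical and only illustrate the idea; they do not reproduce how Evidently stores or evaluates tasks.

```python
from math import sqrt
from typing import Dict, List

# One dataset holding columns for two unrelated tasks (illustrative values).
columns: Dict[str, List[float]] = {
    "churned": [1, 0, 1, 0],
    "churn_pred": [1, 0, 0, 0],
    "spend": [10.0, 20.0, 30.0, 40.0],
    "spend_pred": [12.0, 18.0, 33.0, 40.0],
}

# Each task binds its own roles to its own columns.
tasks: Dict[str, Dict[str, str]] = {
    "churn": {"type": "binary_classification", "target": "churned", "prediction": "churn_pred"},
    "spend": {"type": "regression", "target": "spend", "prediction": "spend_pred"},
}


def evaluate(task: Dict[str, str], data: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute metrics for one task using only the columns it is bound to."""
    y = data[task["target"]]
    y_hat = data[task["prediction"]]
    if task["type"] == "regression":
        errors = [a - b for a, b in zip(y, y_hat)]
        return {
            "mae": sum(abs(e) for e in errors) / len(errors),
            "rmse": sqrt(sum(e * e for e in errors) / len(errors)),
        }
    return {"accuracy": sum(a == b for a, b in zip(y, y_hat)) / len(y)}


results = {name: evaluate(task, columns) for name, task in tasks.items()}
print(results)
```

Each task reads only the columns it is bound to, so adding or removing one task leaves the results of the other unchanged.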