Implementation:Online ml River Metrics PerOutput

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Evaluation_Metrics, Multi_Output_Learning
Last Updated	2026-02-08 16:00 GMT

Overview

A multi-output metric wrapper that keeps a separate metric instance for each output and reports them individually, without aggregation.

Description

PerOutput adapts any single-output metric to multi-output problems by maintaining an independent copy of that metric for each output. Unlike MacroAverage (which averages the per-output values) or MicroAverage (which pools all outputs into a single metric), PerOutput returns a dictionary mapping each output ID to its own metric instance. This allows per-output performance to be inspected in detail, with no aggregation or averaging applied.
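The mechanism described above can be sketched in plain Python. This is a minimal illustration, not River's actual implementation: SimpleAccuracy and PerOutputSketch are hypothetical names introduced here, and the sketch omits the optional weight argument. It shows one independent copy of the metric per output ID, with the macro-style mean computed separately for contrast.

```python
import copy
from collections import defaultdict


class SimpleAccuracy:
    """Hypothetical stand-in for a single-output metric (not a River class)."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, y_true, y_pred):
        self.correct += int(y_true == y_pred)
        self.total += 1

    def get(self):
        return self.correct / self.total if self.total else 0.0


class PerOutputSketch:
    """Keeps one independent copy of `metric` per output ID (illustrative only)."""

    def __init__(self, metric):
        self._template = metric
        # A fresh deep copy is created the first time an output ID is seen.
        self.metrics = defaultdict(lambda: copy.deepcopy(self._template))

    def update(self, y_true, y_pred):
        for output_id in y_true:
            self.metrics[output_id].update(y_true[output_id], y_pred[output_id])

    def get(self):
        # No aggregation: return the metric objects themselves.
        return dict(self.metrics)


per_output = PerOutputSketch(SimpleAccuracy())
per_output.update({"a": 1, "b": 0}, {"a": 1, "b": 1})  # output "b" is wrong
per_output.update({"a": 0, "b": 0}, {"a": 0, "b": 0})  # both correct

# Per-output view: one value per output ID.
values = {output_id: m.get() for output_id, m in per_output.get().items()}
# For contrast, a macro-style average collapses these into a single number.
macro_mean = sum(values.values()) / len(values)

print(values)
print(macro_mean)
```

The per-output dictionary keeps the information that output "b" is the weaker one, which the single averaged number hides.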

Usage

Use PerOutput when you need to track and inspect the performance of each output separately. This is valuable for debugging, identifying problematic outputs, comparing performance across outputs, or when you need the full metric object (not just its value) for each output. Note that get() returns a dictionary of metric objects rather than a single number; call get() on an individual entry to obtain its numeric value.

Code Reference

Source Location

Repository: Online_ml_River
File: river/metrics/multioutput/per_output.py

Signature

class PerOutput(MultiOutputMetric, metrics.base.WrapperMetric):
    def __init__(self, metric):
        # metric: the single-output classification or regression metric
        #         that is copied once per output
        ...

Import

from river import metrics

I/O Contract

Method	Parameters	Returns	Description
update	y_true (dict), y_pred (dict), w (optional weight)	None	Updates the metric copy of every output ID present in y_true
get	-	dict	Returns dictionary mapping output IDs to metric instances
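Because get() returns metric objects rather than numbers, a small helper is often convenient for logging or thresholding. The following is a sketch under assumptions: per_output_values and outputs_below are hypothetical helpers, not part of River, and the stand-in classes exist only so the example runs without River installed. Any object that follows the contract in the table above should work in their place.

```python
class FixedMetric:
    """Hypothetical stand-in exposing only get(), like a single-output metric."""

    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


class FakePerOutput:
    """Hypothetical stand-in whose get() returns {output_id: metric object}."""

    def __init__(self, metrics):
        self._metrics = metrics

    def get(self):
        return dict(self._metrics)


def per_output_values(per_output_metric):
    """Map each output ID to the current numeric value of its metric."""
    return {
        output_id: metric.get()
        for output_id, metric in per_output_metric.get().items()
    }


def outputs_below(per_output_metric, threshold):
    """Return the sorted output IDs whose metric value is below `threshold`."""
    values = per_output_values(per_output_metric)
    return sorted(output_id for output_id, v in values.items() if v < threshold)


wrapper = FakePerOutput({0: FixedMetric(0.9), 1: FixedMetric(0.4), 2: FixedMetric(0.65)})
flat = per_output_values(wrapper)
weak = outputs_below(wrapper, threshold=0.7)

print(flat)
print(weak)
```

The threshold comparison assumes a metric where higher is better (such as F1 or accuracy); for error metrics such as MAE the comparison would be reversed.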

Usage Examples

from river import metrics

# Track F1 score separately for each output
per_output_f1 = metrics.multioutput.PerOutput(metrics.F1())

y_true = [
    {0: False, 1: True, 2: True},
    {0: True, 1: True, 2: False},
    {0: True, 1: False, 2: True},
]

y_pred = [
    {0: False, 1: True, 2: True},   # All correct
    {0: True, 1: False, 2: False},  # Label 1 wrong
    {0: False, 1: False, 2: True},  # Label 0 wrong
]

for yt, yp in zip(y_true, y_pred):
    per_output_f1.update(yt, yp)

# Display all outputs
print(per_output_f1)
# 0 - F1: 66.67%
# 1 - F1: 66.67%
# 2 - F1: 100.00%

# Access individual metrics
metrics_dict = per_output_f1.get()
for output_id, metric in metrics_dict.items():
    print(f"Output {output_id}: {metric.get():.2%}")

# Get a specific output's metric from the dictionary returned by get()
output_0_f1 = metrics_dict[0]
print(f"Output 0 F1: {output_0_f1.get():.2%}")

# Identify best and worst performing outputs
best_output = max(metrics_dict.items(), key=lambda x: x[1].get())
worst_output = min(metrics_dict.items(), key=lambda x: x[1].get())

print(f"Best: Output {best_output[0]} with {best_output[1].get():.2%}")
print(f"Worst: Output {worst_output[0]} with {worst_output[1].get():.2%}")

# Regression example with MAE
per_output_mae = metrics.multioutput.PerOutput(metrics.MAE())

y_true_reg = [
    {0: 1.0, 1: 2.0, 2: 3.0},
    {0: 2.0, 1: 3.0, 2: 4.0},
]

y_pred_reg = [
    {0: 1.1, 1: 2.5, 2: 2.9},  # Output 1 has larger error
    {0: 2.05, 1: 3.3, 2: 4.1},
]

for yt, yp in zip(y_true_reg, y_pred_reg):
    per_output_mae.update(yt, yp)

print("\nPer-output MAE:")
print(per_output_mae)
# Shows individual MAE for each output
# Useful for identifying which outputs are harder to predict
