

Implementation:Online ml River Metrics PerOutput



Knowledge Sources
Domains Online_Learning, Evaluation_Metrics, Multi_Output_Learning
Last Updated 2026-02-08 16:00 GMT

Overview

PerOutput is a wrapper that maintains a separate metric instance for each output and reports them individually, without aggregation.

Description

PerOutput wraps any single-output metric for multi-output problems by maintaining an independent copy of the metric for each output. Unlike MacroAverage (which returns the mean of the per-output values) or MicroAverage (which feeds every output into one shared metric), PerOutput returns a dictionary mapping output IDs to their individual metric instances. This allows detailed inspection of per-output performance, since no information is lost to averaging.
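To make the distinction concrete, the following minimal sketch computes the three views by hand in plain Python. It does not use River: the helper `f1_from_counts` and all variable names are assumptions introduced for illustration, and the micro view is modelled here as pooling the confusion counts of every output.

```python
from collections import defaultdict


def f1_from_counts(tp, fp, fn):
    """F1 from confusion counts (hypothetical helper, not part of River)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


y_true = [
    {0: False, 1: True, 2: True},
    {0: True, 1: True, 2: False},
    {0: True, 1: False, 2: True},
]
y_pred = [
    {0: False, 1: True, 2: True},
    {0: True, 1: False, 2: False},
    {0: False, 1: False, 2: True},
]

# One independent set of confusion counts per output ID.
counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
for yt, yp in zip(y_true, y_pred):
    for out in yp:
        if yt[out] and yp[out]:
            counts[out]["tp"] += 1
        elif yp[out]:
            counts[out]["fp"] += 1
        elif yt[out]:
            counts[out]["fn"] += 1

# Per-output view: one value per output, nothing is merged.
per_output = {out: f1_from_counts(**c) for out, c in counts.items()}

# Macro view: mean of the per-output values.
macro = sum(per_output.values()) / len(per_output)

# Micro view: pool the counts of all outputs, then compute one value.
pooled = {k: sum(c[k] for c in counts.values()) for k in ("tp", "fp", "fn")}
micro = f1_from_counts(**pooled)

print(per_output)  # outputs 0 and 1 score 2/3, output 2 scores 1.0
print(macro)       # 7/9, about 0.778
print(micro)       # 0.8
```

Only the per-output dictionary shows that outputs 0 and 1 are the weak ones; the macro and micro values each collapse that information into a single number.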

Usage

Use PerOutput when you need to track and inspect individual performance for each output separately without aggregation. This is valuable for debugging, identifying problematic outputs, comparing performance across different outputs, or when you need access to full metric objects (not just their values) for each output. The get() method returns a dictionary of metrics rather than a single value.

Code Reference

Source Location

Signature

class PerOutput(MultiOutputMetric, metrics.base.WrapperMetric):
    def __init__(self, metric):
        # metric: Any classification or regression metric
        pass

Import

from river import metrics

I/O Contract

Method | Parameters | Returns | Description
update | y_true (dict), y_pred (dict), [w] | None | Updates all per-output metric copies
get | - | dict | Returns dictionary mapping output IDs to metric instances
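The contract above can be illustrated with a small standalone sketch. `ToyAccuracy` and `ToyPerOutput` are hypothetical classes written for this page, not River's implementation; they only assume that each output ID gets its own deep copy of a template metric and that `w` is an optional sample weight.

```python
import copy
from collections import defaultdict
from functools import partial


class ToyAccuracy:
    """Hypothetical single-output metric: weighted fraction of correct predictions."""

    def __init__(self):
        self.correct = 0.0
        self.total = 0.0

    def update(self, y_true, y_pred, w=1.0):
        self.correct += w * (y_true == y_pred)
        self.total += w

    def get(self):
        return self.correct / self.total if self.total else 0.0


class ToyPerOutput:
    """Hypothetical wrapper: one independent metric copy per output ID."""

    def __init__(self, metric):
        # Each previously unseen output ID gets a deep copy of the template metric.
        self.metrics = defaultdict(partial(copy.deepcopy, metric))

    def update(self, y_true, y_pred, w=1.0):
        # Dicts in, nothing returned: every output updates its own copy.
        for output_id in y_pred:
            self.metrics[output_id].update(y_true[output_id], y_pred[output_id], w)

    def get(self):
        # Returns metric objects, not numbers; call .get() on each for its value.
        return dict(self.metrics)


wrapper = ToyPerOutput(ToyAccuracy())
wrapper.update({"a": 1, "b": 0}, {"a": 1, "b": 1})
wrapper.update({"a": 0, "b": 0}, {"a": 0, "b": 0}, w=3.0)  # weighted sample

result = wrapper.get()
values = {output_id: metric.get() for output_id, metric in result.items()}
print(values)  # {'a': 1.0, 'b': 0.75}
```

Because `get` hands back the metric objects themselves, the caller can read each value, print each metric, or inspect its internal state, which a single aggregated number would not allow.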

Usage Examples

from river import metrics

# Track F1 score separately for each output
per_output_f1 = metrics.multioutput.PerOutput(metrics.F1())

y_true = [
    {0: False, 1: True, 2: True},
    {0: True, 1: True, 2: False},
    {0: True, 1: False, 2: True},
]

y_pred = [
    {0: False, 1: True, 2: True},   # All correct
    {0: True, 1: False, 2: False},  # Label 1 wrong
    {0: False, 1: False, 2: True},  # Label 0 wrong
]

for yt, yp in zip(y_true, y_pred):
    per_output_f1.update(yt, yp)

# Display all outputs
print(per_output_f1)
# 0 - F1: 66.67%
# 1 - F1: 66.67%
# 2 - F1: 100.00%

# Access individual metrics
metrics_dict = per_output_f1.get()
for output_id, metric in metrics_dict.items():
    print(f"Output {output_id}: {metric.get():.2%}")

# Get specific output's metric
output_0_f1 = per_output_f1.metrics[0]
print(f"Output 0 F1: {output_0_f1}")

# Identify best and worst performing outputs
best_output = max(metrics_dict.items(), key=lambda x: x[1].get())
worst_output = min(metrics_dict.items(), key=lambda x: x[1].get())

print(f"Best: Output {best_output[0]} with {best_output[1].get():.2%}")
print(f"Worst: Output {worst_output[0]} with {worst_output[1].get():.2%}")

# Regression example with MAE
per_output_mae = metrics.multioutput.PerOutput(metrics.MAE())

y_true_reg = [
    {0: 1.0, 1: 2.0, 2: 3.0},
    {0: 2.0, 1: 3.0, 2: 4.0},
]

y_pred_reg = [
    {0: 1.1, 1: 2.5, 2: 2.9},  # Output 1 has larger error
    {0: 2.05, 1: 3.3, 2: 4.1},
]

for yt, yp in zip(y_true_reg, y_pred_reg):
    per_output_mae.update(yt, yp)

print("\nPer-output MAE:")
print(per_output_mae)
# Shows individual MAE for each output
# Useful for identifying which outputs are harder to predict
