Principle:Online ml River Horizon Metric

Knowledge Sources	Domains	Last Updated
River River Docs	Online Machine Learning, Time Series Forecasting, Model Evaluation	2026-02-08 16:00 GMT

Overview

Metric wrapper that maintains separate evaluation metric instances for each step in a multi-step forecast horizon.

Description

When evaluating multi-step forecasting models, a single aggregate metric value obscures important information about how accuracy varies with forecast distance. The Horizon Metric principle addresses this by maintaining separate, independent copies of a base regression metric for each step of the forecast horizon.

Given a base metric (e.g., MAE) and a forecast horizon h, HorizonMetric maintains h independent metric instances, one per forecast step. When update(y_true, y_pred) is called with equal-length lists of true and predicted values, the metric for step i is updated with the i-th element of each list. The result is h separate performance measurements, one for each forecast distance.

For cases where a single summary value is needed (e.g., model selection or hyperparameter tuning), HorizonAggMetric extends HorizonMetric by applying an aggregation function (such as mean, max, or median) to the per-step metric values.

Usage

Use HorizonMetric when:

You need to understand how forecaster accuracy varies across different prediction horizons
You want detailed per-step performance breakdown for diagnostic purposes
You are comparing multiple models and need to identify which performs better at specific horizons

Use HorizonAggMetric when:

You need a single scalar metric for model selection or hyperparameter optimization
You want to summarize overall multi-step performance with a simple aggregate

Theoretical Basis

Per-Step Metric Decomposition

Given a base regression metric M (e.g., MAE, MSE, RMSE) and horizon h, the HorizonMetric maintains:

M_1, M_2, ..., M_h     (h independent copies of M)

At each evaluation step t, the update processes parallel lists:

y_true = [y_{t+1}, y_{t+2}, ..., y_{t+h}]
y_pred = [\hat{y}_{t+1}, \hat{y}_{t+2}, ..., \hat{y}_{t+h}]

For i = 1, 2, ..., h:
    M_i.update(y_{t+i}, \hat{y}_{t+i})     (the i-th element of each list; index i-1 in 0-based Python)

The get() method returns:

[M_1.get(), M_2.get(), ..., M_h.get()]

Each M_i accumulates results across all evaluation steps, so M_i.get() reports the base metric computed only over the i-step-ahead forecasts.
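The decomposition above can be sketched in plain Python. This is a minimal illustration, not River's implementation: RunningMAE and PerStepMetric are hypothetical names, the horizon is fixed and pre-allocated for simplicity, and the numbers are made up.

```python
class RunningMAE:
    """Toy running mean absolute error (hypothetical stand-in for a base metric M)."""

    def __init__(self):
        self.total = 0.0
        self.n = 0

    def update(self, y_true, y_pred):
        self.total += abs(y_true - y_pred)
        self.n += 1

    def get(self):
        return self.total / self.n if self.n else 0.0


class PerStepMetric:
    """Keeps one RunningMAE per forecast step: M_1, ..., M_h (hypothetical sketch)."""

    def __init__(self, horizon):
        # Assumption: fixed horizon, so all h copies are created up front.
        self.metrics = [RunningMAE() for _ in range(horizon)]

    def update(self, y_true, y_pred):
        # zip pairs the i-th truth with the i-th prediction and the i-th metric.
        for metric, yt, yp in zip(self.metrics, y_true, y_pred):
            metric.update(yt, yp)

    def get(self):
        return [metric.get() for metric in self.metrics]


per_step = PerStepMetric(horizon=3)
per_step.update([10.0, 11.0, 12.0], [10.0, 12.0, 15.0])  # absolute errors 0, 1, 3
per_step.update([11.0, 12.0, 13.0], [12.0, 12.0, 18.0])  # absolute errors 1, 0, 5
per_step_values = per_step.get()
print(per_step_values)  # [0.5, 0.5, 4.0]
```

In this toy run the third step ahead has a much larger error than the first two, which is exactly the pattern a single aggregate value would hide.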

Aggregation

HorizonAggMetric applies a function f to the per-step values:

result = f([M_1.get(), M_2.get(), ..., M_h.get()])

Common choices for f:

statistics.mean: Average performance across all horizon steps
max: Worst-case performance across all horizon steps (for error metrics such as MAE, where larger values are worse)
statistics.median: Median performance, robust to outlier steps
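As a small sketch of how the choice of f changes the summary, the snippet below applies the three functions to a made-up list of per-step MAE values. The aggregate helper and the numbers are illustrative assumptions, not part of River.

```python
import statistics

# Hypothetical per-step MAE values for a 4-step horizon; the last step is an outlier.
per_step_mae = [0.5, 0.75, 1.0, 4.0]


def aggregate(per_step_values, agg_func):
    """Apply an aggregation function f to the list of per-step metric values."""
    return agg_func(per_step_values)


mean_mae = aggregate(per_step_mae, statistics.mean)      # pulled up by the outlier step
worst_mae = aggregate(per_step_mae, max)                 # the single worst step
median_mae = aggregate(per_step_mae, statistics.median)  # barely affected by the outlier

print(mean_mae, worst_mae, median_mae)  # 1.5625 4.0 0.875
```

Any callable that maps a list of numbers to a single number can play the role of f, so the same pattern extends to min or a weighted average.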

Lazy Initialization

Metric copies are created lazily via metric.clone() the first time an update supplies a value for the corresponding step. The number of internal metrics therefore grows up to h as updates are received, rather than being pre-allocated, which accommodates variable-length forecast horizons.
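The lazy pattern can be illustrated with a short stand-alone sketch. It assumes a toy RunningMAE class and uses copy.deepcopy as a stand-in for metric.clone(); the class names are hypothetical and the code is not taken from River.

```python
import copy


class RunningMAE:
    """Toy running mean absolute error (hypothetical stand-in for a base metric)."""

    def __init__(self):
        self.total = 0.0
        self.n = 0

    def update(self, y_true, y_pred):
        self.total += abs(y_true - y_pred)
        self.n += 1

    def get(self):
        return self.total / self.n if self.n else 0.0


class LazyPerStepMetric:
    """Creates a per-step copy only when a value for that step first arrives."""

    def __init__(self, metric):
        self.metric = metric  # template; it is copied, never updated itself
        self.metrics = []     # starts empty and grows with the longest horizon seen

    def update(self, y_true, y_pred):
        for i, (yt, yp) in enumerate(zip(y_true, y_pred)):
            if i == len(self.metrics):
                # First time step i is seen: copy the template (stand-in for clone()).
                self.metrics.append(copy.deepcopy(self.metric))
            self.metrics[i].update(yt, yp)

    def get(self):
        return [metric.get() for metric in self.metrics]


lazy = LazyPerStepMetric(RunningMAE())
n_before = len(lazy.metrics)                   # nothing allocated yet
lazy.update([1.0, 2.0], [1.5, 2.0])            # 2-step forecast
n_after_two = len(lazy.metrics)
lazy.update([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # 3-step forecast adds a third metric
n_after_three = len(lazy.metrics)

print(n_before, n_after_two, n_after_three)  # 0 2 3
print(lazy.get())                            # [0.25, 0.0, 2.0]
```

One consequence visible here is that later steps may have received fewer updates than earlier ones when horizon lengths vary, so their values rest on fewer observations.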

Related Pages

Implementation:Online_ml_River_Time_Series_HorizonMetric_Impl
