Implementation:Online ml River Time Series Evaluate Func
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| River, River Docs | Online Machine Learning, Time Series Forecasting, Model Evaluation | 2026-02-08 16:00 GMT |
Overview
Concrete tool for evaluating online time series forecasters using walk-forward validation with per-horizon-step metrics.
Description
The time_series.evaluate function runs a complete walk-forward evaluation of a forecaster on a time series dataset. It iterates through the dataset sequentially and, at each step, produces a multi-step forecast, compares it against the true future values, and updates one metric per horizon step. The model learns the current observation only after its forecast has been scored, which prevents information leakage.
Internally, the function delegates to iter_evaluate, which yields intermediate results at each step. The evaluate function simply consumes this iterator to completion and returns the final HorizonMetric.
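The split between a step-by-step iterator and a function that drains it can be sketched in plain Python. This is a minimal illustration, not River's source: LastValueForecaster, iter_walk_forward, and walk_forward are hypothetical names, the series is a plain list of floats, and the metric is hard-coded to a per-step mean absolute error. The exact alignment between observations and forecasts in River may differ.

```python
class LastValueForecaster:
    """Hypothetical toy model: repeats the last seen value for every step."""

    def __init__(self):
        self.last = 0.0

    def learn_one(self, y):
        self.last = y

    def forecast(self, horizon):
        return [self.last] * horizon


def iter_walk_forward(series, model, horizon, grace_period=None):
    """Yield (t, y_pred, mae_per_step) after every scored step."""
    grace_period = horizon if grace_period is None else grace_period
    abs_err_sums = [0.0] * horizon
    n_scored = 0
    # Stop while `horizon` true values are still available for scoring.
    for t in range(len(series) - horizon + 1):
        if t >= grace_period:
            # Forecast first: the model has only seen series[:t] so far.
            y_pred = model.forecast(horizon)
            y_true = series[t : t + horizon]
            for step, (yt, yp) in enumerate(zip(y_true, y_pred)):
                abs_err_sums[step] += abs(yt - yp)
            n_scored += 1
            yield t, y_pred, [s / n_scored for s in abs_err_sums]
        # Learn only after scoring, so the forecast never sees series[t].
        model.learn_one(series[t])


def walk_forward(series, model, horizon, grace_period=None):
    """Drain the iterator and keep only the final per-step MAE."""
    result = None  # stays None if the series is too short to score anything
    for _, _, result in iter_walk_forward(series, model, horizon, grace_period):
        pass
    return result


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
final_mae = walk_forward(series, LastValueForecaster(), horizon=2)
print(final_mae)
```

Keeping the loop in a generator makes intermediate results available for plotting or early stopping, while the wrapper covers the common case where only the final score matters.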
The evaluation uses a look-ahead buffer (_iter_with_horizon) that maintains a sliding window of the next horizon observations, so that each forecast can be compared against the true future values at that step.
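A look-ahead buffer of this kind can be sketched with a bounded deque. The helper below is an assumption-laden illustration rather than River's _iter_with_horizon: the name iter_with_lookahead and the shape of the yielded tuple are hypothetical, and the real helper may align the current and future observations differently.

```python
from collections import deque
from itertools import islice


def iter_with_lookahead(stream, horizon):
    """Yield (x, y, future_xs, future_ys) for each observation that still
    has `horizon` observations after it in the stream."""
    it = iter(stream)
    # Prime the window with the current observation plus `horizon` future ones.
    window = deque(islice(it, horizon + 1), maxlen=horizon + 1)
    while len(window) == horizon + 1:
        (x, y), *future = window
        yield x, y, [fx for fx, _ in future], [fy for _, fy in future]
        try:
            # Appending to a full deque drops the oldest item: the window slides.
            window.append(next(it))
        except StopIteration:
            return


stream = [({"t": i}, float(i)) for i in range(5)]
for x, y, future_xs, future_ys in iter_with_lookahead(stream, horizon=2):
    print(y, future_ys)
```

Because the window is bounded, memory use stays proportional to the horizon regardless of stream length. The trade-off is that the last horizon observations are never yielded as a current step, since their future values are unknown.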
Usage
Import time_series.evaluate when you need to assess a forecaster's accuracy across multiple horizon steps on a streaming dataset.
Code Reference
Source Location
river/time_series/evaluate.py:L127-L169
Signature
def evaluate(
    dataset: base.typing.Dataset,
    model: time_series.base.Forecaster,
    metric: metrics.base.RegressionMetric,
    horizon: int,
    agg_func: typing.Callable[[list[float]], float] | None = None,
    grace_period: int | None = None,
) -> time_series.HorizonMetric
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| dataset | Dataset | (required) | A sequential time series stream (iterable of (x, y) tuples) |
| model | Forecaster | (required) | An online time series forecaster |
| metric | RegressionMetric | (required) | A regression metric (e.g., MAE, MSE, RMSE) |
| horizon | int | (required) | Number of steps ahead to forecast at each evaluation step |
| agg_func | Callable or None | None | Optional function to aggregate per-step metrics into a scalar (e.g., statistics.mean) |
| grace_period | int or None | None | Number of initial steps during which the metric is not updated, giving the model a warm-up period; defaults to horizon if None |
Import
from river import time_series
I/O Contract
Inputs
| Parameter | Type | Description |
|---|---|---|
| dataset | Iterable[(dict, number)] | Time series stream yielding (features, target) tuples |
| model | Forecaster | Online forecaster implementing learn_one and forecast |
| metric | RegressionMetric | Metric to compute at each horizon step (e.g., metrics.MAE()) |
| horizon | int | Forecast horizon length |
| agg_func | Callable or None | Optional aggregation over horizon steps |
| grace_period | int or None | Warmup period (defaults to horizon) |
Outputs
| Return Type | Description |
|---|---|
| HorizonMetric | Contains per-step metric values; get() returns list[float] of metric values per horizon step |
| HorizonAggMetric | If agg_func is provided; get() returns a single float |
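The difference between the two return shapes can be illustrated with a small stand-in. PerStepMAE below is a hypothetical class, not River's HorizonMetric or HorizonAggMetric: it merges both behaviours into one object for brevity and hard-codes mean absolute error, whereas River accepts any regression metric.

```python
import statistics


class PerStepMAE:
    """Hypothetical per-horizon metric: one running MAE per forecast step."""

    def __init__(self, agg_func=None):
        self.agg_func = agg_func
        self.sums = []    # accumulated absolute error for each horizon step
        self.counts = []  # number of forecasts scored at each horizon step

    def update(self, y_true, y_pred):
        for i, (yt, yp) in enumerate(zip(y_true, y_pred)):
            if i == len(self.sums):
                # First time this horizon step is seen: start a new slot.
                self.sums.append(0.0)
                self.counts.append(0)
            self.sums[i] += abs(yt - yp)
            self.counts[i] += 1

    def get(self):
        per_step = [s / n for s, n in zip(self.sums, self.counts)]
        # Without agg_func: a list with one value per horizon step.
        # With agg_func: that list collapsed into a single number.
        return self.agg_func(per_step) if self.agg_func else per_step


per_step = PerStepMAE()
aggregated = PerStepMAE(agg_func=statistics.mean)
for y_true, y_pred in [([3.0, 4.0], [2.0, 2.0]), ([4.0, 5.0], [3.0, 3.0])]:
    per_step.update(y_true, y_pred)
    aggregated.update(y_true, y_pred)

print(per_step.get())    # [1.0, 2.0]
print(aggregated.get())  # 1.5
```

The per-step list shows how error grows with forecast distance, while the aggregated value is easier to compare across models at the cost of hiding that profile.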
Usage Examples
Per-horizon evaluation with MAE
from river import datasets
from river import metrics
from river import time_series
dataset = datasets.AirlinePassengers()
model = time_series.HoltWinters(
    alpha=0.3,
    beta=0.1,
    gamma=0.6,
    seasonality=12,
    multiplicative=True,
)
metric = time_series.evaluate(
    dataset,
    model,
    metric=metrics.MAE(),
    horizon=12,
)
# Prints per-step MAE:
# +1 MAE: 25.899087
# +2 MAE: 26.26131
# ...
# +12 MAE: 33.975057
print(metric)
Aggregated evaluation with mean
import statistics
from river import datasets
from river import metrics
from river import time_series
metric = time_series.evaluate(
    dataset=datasets.AirlinePassengers(),
    model=time_series.HoltWinters(alpha=0.1),
    metric=metrics.MAE(),
    agg_func=statistics.mean,
    horizon=4,
)
# Prints aggregated result: mean(MAE): 42.901748
print(metric)
Evaluation with custom grace period
from river import datasets
from river import metrics
from river import time_series
metric = time_series.evaluate(
    dataset=datasets.AirlinePassengers(),
    model=time_series.SNARIMAX(p=12, d=1, q=12, m=12, sd=1),
    metric=metrics.MAE(),
    horizon=12,
    grace_period=24,  # Do not update the metric during the first 24 steps (warm-up)
)
print(metric)