
Implementation:Online ml River Datasets AirlinePassengers

From Leeroopedia


Knowledge Sources | Domains | Last Updated
River, River Docs | Online Machine Learning, Time Series Forecasting | 2026-02-08 16:00 GMT

Overview

Concrete tool for loading the AirlinePassengers and WaterFlow benchmark time series datasets as sequential observation streams for online forecasting evaluation.

Description

The datasets.AirlinePassengers and datasets.WaterFlow classes provide built-in time series datasets that yield observations one at a time in chronological order. Both classes inherit from base.FileDataset and are iterable: their __iter__ method delegates to stream.iter_csv, which parses the underlying CSV file, converts the target column to a numeric type, and parses the date column.
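The parsing step described above can be sketched with the standard library alone. This is a simplified stand-in, not River's implementation: the helper name iter_rows, the inline sample rows, and the "%Y-%m" date format are assumptions made for illustration.

```python
import csv
import datetime as dt
import io

# Illustrative rows only; the column layout and date format are assumptions.
SAMPLE_CSV = """month,passengers
1949-01,112
1949-02,118
1949-03,132
"""


def iter_rows(buffer, target, converters, parse_dates):
    """Yield (x, y) pairs one row at a time, in file order (hypothetical helper)."""
    for row in csv.DictReader(buffer):
        # Apply type conversions to the named columns.
        for name, convert in converters.items():
            row[name] = convert(row[name])
        # Parse date columns using their strptime format.
        for name, fmt in parse_dates.items():
            row[name] = dt.datetime.strptime(row[name], fmt)
        # Split the target off from the feature dict.
        y = row.pop(target)
        yield row, y


pairs = list(
    iter_rows(
        io.StringIO(SAMPLE_CSV),
        target="passengers",
        converters={"passengers": int},
        parse_dates={"month": "%Y-%m"},
    )
)
first_x, first_y = pairs[0]
print(first_x, first_y)
```

Because the generator reads one row per step, the whole file never needs to be held in memory, which is the property that makes this style of loader suitable for online learning.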

AirlinePassengers contains 144 monthly observations of international airline passenger totals (in thousands) from January 1949 to December 1960. It has 1 feature (month, parsed into a datetime.datetime set to the first day of the month) and an integer target representing the number of passengers.
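The sample count follows directly from the date range: twelve full years of monthly data. A quick check, using only the start and end months stated above:

```python
import datetime as dt

start = dt.date(1949, 1, 1)   # first observation (January 1949)
end = dt.date(1960, 12, 1)    # last observation (December 1960)

# Inclusive count of monthly steps between the first and last observation.
n_months = (end.year - start.year) * 12 + (end.month - start.month) + 1
print(n_months)  # 144
```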

WaterFlow contains 1,268 hourly observations of water flow (in liters per second) through a pipeline branch from March to May 2022. It has 1 feature (Time as a timezone-aware datetime) and a float target representing the flow rate. The dataset includes four anomalous segments suitable for testing forecaster robustness.

Usage

Import and iterate these datasets when you need a benchmark time series stream for evaluating online forecasting models such as SNARIMAX or HoltWinters.

Code Reference

Source Location

  • river/datasets/airline_passengers.py:L8-L35
  • river/datasets/water_flow.py:L8-L40

Signature

class AirlinePassengers(base.FileDataset):
    def __init__(self) -> None

class WaterFlow(base.FileDataset):
    def __init__(self) -> None

Import

from river import datasets

I/O Contract

Inputs

Parameter | Type | Description
(none) | - | Both constructors take no parameters

Outputs

Output | Type | Description
Iterator element | (x: dict, y: number) | Each iteration yields a tuple of a feature dict and a target value
AirlinePassengers x | {"month": datetime.datetime} | Month parsed as a datetime (first day of the month)
AirlinePassengers y | int | Number of passengers (thousands)
WaterFlow x | {"Time": datetime} | Timezone-aware datetime
WaterFlow y | float | Water flow rate in liters per second

Dataset metadata:

Dataset | n_samples | n_features | Target Type | Temporal Resolution
AirlinePassengers | 144 | 1 | int | Monthly
WaterFlow | 1,268 | 1 | float | Hourly

Usage Examples

Iterating over AirlinePassengers

from river import datasets

dataset = datasets.AirlinePassengers()

for x, y in dataset:
    print(x["month"], y)
    # x = {"month": datetime.datetime(1949, 1, 1, 0, 0)}, y = 112
    break

Iterating over WaterFlow

from river import datasets

dataset = datasets.WaterFlow()

for x, y in dataset:
    print(x["Time"], y)
    break

Using with a forecasting model

from river import datasets
from river import time_series

dataset = datasets.AirlinePassengers()
model = time_series.SNARIMAX(p=12, d=1, q=12, m=12, sd=1)

for x, y in dataset:
    model.learn_one(y)

forecast = model.forecast(horizon=12)
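
To turn the loop above into an evaluation, a common pattern is to forecast each value before learning from it, so every prediction is scored on data the model has not yet seen. The sketch below is an illustration under stated assumptions: SeasonalNaive is a hypothetical stand-in forecaster (not a River class) and the series is synthetic, so the example runs with the standard library only. Its learn_one and forecast methods mirror the calls used above.

```python
from collections import deque


class SeasonalNaive:
    """Hypothetical stand-in forecaster: repeats the value seen m steps ago."""

    def __init__(self, m):
        self.m = m
        self.history = deque(maxlen=m)  # keeps only the last full season

    def learn_one(self, y):
        self.history.append(y)

    def forecast(self, horizon):
        # Cycle through the most recent season to fill the horizon.
        season = list(self.history)
        return [season[h % len(season)] for h in range(horizon)]


# Synthetic series: a period-4 pattern that rises by 1 each season.
pattern = [10, 20, 30, 20]
series = [pattern[t % 4] + t // 4 for t in range(16)]

model = SeasonalNaive(m=4)
abs_errors = []

for y in series:
    # Predict first (once one full season has been seen), then learn.
    if len(model.history) == model.m:
        y_pred = model.forecast(horizon=1)[0]
        abs_errors.append(abs(y - y_pred))
    model.learn_one(y)

mae = sum(abs_errors) / len(abs_errors)
print(mae)  # 1.0
```

With a real forecaster such as SNARIMAX, the same predict-then-learn ordering applies; only the model object and the data source change.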
