
Implementation:Online ml River Evaluate Progressive Val Score



Knowledge Sources: River · River Docs · Beating the Hold-Out: Bounds for K-fold and Progressive Cross-Validation
Domains: Online_Learning · Evaluation · Classification
Last Updated: 2026-02-08 16:00 GMT

Overview

Concrete tool for evaluating online learning models using the progressive validation (test-then-train) protocol, returning a single metric result after processing the entire dataset.

Description

The evaluate.progressive_val_score function is the canonical way to evaluate an online learning model in River. It implements the predict-then-learn protocol: for each observation in the dataset, the model first makes a prediction, the metric is updated with the prediction and the true target, and then the model is trained on the observation. This ensures that the model is always evaluated on unseen data.

Under the hood, the function is implemented on top of evaluate.iter_progressive_val_score. It consumes the entire generator, optionally printing intermediate results at intervals specified by print_every, and returns the final metric object.
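
For finer-grained control you can consume the underlying generator yourself. A minimal sketch, assuming the documented iter_progressive_val_score interface (the step parameter and the dict checkpoints it yields should be verified against your River version):

from river import datasets, evaluate, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()

# Yields a checkpoint every `step` observations instead of a single final result.
steps = evaluate.iter_progressive_val_score(
    dataset=datasets.Phishing(),
    model=model,
    metric=metrics.ROCAUC(),
    step=200,
)
for checkpoint in steps:
    print(checkpoint)  # a dict holding the running metric and the step count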

Key capabilities:

  • Delayed feedback: The moment and delay parameters simulate real-world scenarios where labels arrive after predictions. When specified, stream.simulate_qa is used to reorder observations into a question-answer sequence (see the sketch after this list).
  • Progress printing: Setting print_every=N prints the metric state every N observations. The show_time and show_memory flags add elapsed time and memory usage to the output.
  • Active learning support: When the model is an active learner, the function tracks how many labels were actually used for training.
  • Automatic prediction method selection: The function automatically chooses between predict_one, predict_proba_one, and score_one depending on the model type and metric requirements.
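
As a concrete illustration of delayed feedback, here is a minimal sketch assuming River's Bikes regression dataset, whose feature dict carries a 'moment' timestamp; the 30-minute delay value is illustrative, not prescriptive:

import datetime as dt

from river import datasets, evaluate, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LinearRegression()

# Each label is revealed 30 minutes (in stream time) after its prediction,
# so the model only trains on observations whose delay has elapsed.
evaluate.progressive_val_score(
    dataset=datasets.Bikes(),
    model=model,
    metric=metrics.MAE(),
    moment='moment',
    delay=dt.timedelta(minutes=30),
)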

Usage

Import this function when you need to:

  • Evaluate an online model on a streaming dataset and obtain a single metric result.
  • Obtain the canonical evaluation for an online learning experiment.
  • Monitor progress during evaluation with periodic printing.
  • Simulate delayed feedback scenarios.

Code Reference

Source Location

File: river/evaluate/progressive_validation.py
Lines: L231-L409

Signature

def progressive_val_score(
    dataset: base.typing.Dataset,
    model,
    metric: metrics.base.Metric,
    moment: str | typing.Callable | None = None,
    delay: str | int | dt.timedelta | typing.Callable | None = None,
    print_every=0,
    show_time=False,
    show_memory=False,
    **print_kwargs,
) -> metrics.base.Metric

Import

from river import evaluate

result = evaluate.progressive_val_score(dataset, model, metric)

I/O Contract

Inputs

Parameter Type Default Description
dataset base.typing.Dataset (required) Iterable stream of (x, y) tuples or (x, y, kwargs) tuples.
model Estimator (required) The model to evaluate. Must support learn_one and at least one prediction method.
metric metrics.base.Metric (required) The metric used to evaluate predictions. Updated in-place.
moment str | Callable | None None Attribute name or function for measuring time (for delayed feedback). If None, observations are processed in order.
delay str | int | timedelta | Callable | None None Amount to wait before revealing labels. If None, no delay (standard progressive validation).
print_every int 0 Print metric state every N observations. 0 disables printing.
show_time bool False Whether to display elapsed time in progress output.
show_memory bool False Whether to display model memory usage in progress output.
**print_kwargs keyword args (none) Additional keyword arguments passed to Python's print function (e.g., file=f for file output).

Outputs

Output Type Description
Return value metrics.base.Metric The metric object, updated with all observations from the dataset. Call metric.get() for the numeric value or str(metric) for formatted output.
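
Since the return value is the same metric instance that was passed in, it can be read either way. A short sketch (the printed numbers are illustrative):

from river import datasets, evaluate, linear_model, metrics, preprocessing

dataset = datasets.Phishing()
model = preprocessing.StandardScaler() | linear_model.LogisticRegression()
metric = metrics.Accuracy()

result = evaluate.progressive_val_score(dataset, model, metric)

print(str(result))   # formatted, e.g. "Accuracy: 88.96%"
print(result.get())  # raw float, e.g. 0.8896
assert result is metric  # updated in place and returned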

Usage Examples

Basic progressive validation:

from river import datasets, evaluate, linear_model, metrics, preprocessing

dataset = datasets.Phishing()
model = preprocessing.StandardScaler() | linear_model.LogisticRegression()
metric = metrics.Accuracy()

result = evaluate.progressive_val_score(dataset, model, metric)
print(result)
# Accuracy: 88.96%

With progress printing:

from river import datasets, evaluate, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()

evaluate.progressive_val_score(
    model=model,
    dataset=datasets.Phishing(),
    metric=metrics.ROCAUC(),
    print_every=200,
)
# [200] ROCAUC: 90.20%
# [400] ROCAUC: 92.25%
# [600] ROCAUC: 93.23%
# [800] ROCAUC: 94.05%
# [1,000] ROCAUC: 94.79%
# [1,200] ROCAUC: 95.07%
# [1,250] ROCAUC: 95.07%
# ROCAUC: 95.07%

Equivalent manual loop:

from river import datasets, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()
metric = metrics.ROCAUC()

for x, y in datasets.Phishing():
    y_pred = model.predict_proba_one(x)  # 1. predict on the unseen observation
    metric.update(y, y_pred)             # 2. score the prediction against the truth
    model.learn_one(x, y)                # 3. only then train on it

print(metric)
# ROCAUC: 95.07%

Logging progress to a file:

from river import datasets, evaluate, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()

with open('progress.log', 'w') as f:
    evaluate.progressive_val_score(
        model=model,
        dataset=datasets.Phishing(),
        metric=metrics.ROCAUC(),
        print_every=200,
        file=f,
    )
