Principle: Online ML / River Streaming Accuracy Measurement
| Knowledge Sources | River Docs |
|---|---|
| Domains | Online_Learning Evaluation Classification |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Streaming accuracy measurement is the incremental computation of classification accuracy as the ratio of correct predictions to total predictions, updated one observation at a time via a streaming confusion matrix.
Description
Accuracy is the most intuitive classification metric: it measures the fraction of predictions that exactly match the true labels. In batch machine learning, accuracy is computed after all predictions are made. In online (streaming) machine learning, accuracy must be computed incrementally, updating the running score as each new prediction-label pair arrives.
River implements streaming accuracy through an incrementally updated confusion matrix. Rather than storing all predictions and labels, the confusion matrix maintains running counts of true positives, true negatives, false positives, and false negatives for each class. When a new prediction-label pair arrives, only the relevant cell of the confusion matrix is incremented. The accuracy is then computed on demand by dividing the total number of correct predictions (sum of the diagonal) by the total weight of all observations.
This approach has several advantages:
- O(1) memory per class: only the confusion matrix cells are stored.
- O(1) update time: each new observation updates a single cell.
- O(k) query time: computing accuracy requires summing the diagonal (k classes), but for binary classification this is O(1).
- Support for weighted observations: the confusion matrix can accumulate weights rather than counts, enabling importance-weighted accuracy.
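The properties above can be sketched with a minimal pure-Python streaming confusion matrix. This is a simplified stand-in for River's internal data structure; the class and method names are illustrative, not River's API:

```python
from collections import defaultdict

class StreamingConfusionMatrix:
    """Minimal sketch of a streaming confusion matrix.

    Stores only per-(true, predicted) weight totals, so memory grows
    with the number of classes, not the number of observations.
    """

    def __init__(self):
        # counts[y_true][y_pred] accumulates observation weights
        self.counts = defaultdict(lambda: defaultdict(float))
        self.total_weight = 0.0

    def update(self, y_true, y_pred, w=1.0):
        # O(1) update: one cell plus the running total are touched
        self.counts[y_true][y_pred] += w
        self.total_weight += w

    def accuracy(self):
        # O(k) query: sum the diagonal (correct predictions over all
        # classes), then divide by the total observation weight
        correct = sum(self.counts[c][c] for c in self.counts)
        return correct / self.total_weight if self.total_weight else 0.0

cm = StreamingConfusionMatrix()
for y_true, y_pred in [(1, 1), (0, 0), (1, 0), (1, 1)]:
    cm.update(y_true, y_pred)
print(cm.accuracy())  # 3 of 4 correct -> 0.75
```

Passing `w` other than 1.0 yields the importance-weighted accuracy mentioned above: each cell accumulates weight rather than a raw count.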
Usage
Use streaming accuracy measurement when:
- You need a simple, interpretable metric for classification performance in an online learning setting.
- You are using `evaluate.progressive_val_score` to evaluate a model and need to pass a metric object.
- Class distributions are roughly balanced (accuracy can be misleading for imbalanced datasets).
- You want to monitor model performance in real time as data arrives.
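The real-time monitoring workflow can be sketched as a predict-then-learn loop. The hand-rolled `Accuracy` class below mimics the `update()`/`get()` pattern of River-style streaming metrics but is not River's class, and the "model" (always predict the previous label) and the label stream are made up for illustration:

```python
class Accuracy:
    """Running accuracy with a River-style update()/get() interface
    (hand-rolled sketch, not River's metrics.Accuracy)."""

    def __init__(self):
        self.correct = 0.0
        self.total = 0.0

    def update(self, y_true, y_pred, w=1.0):
        self.correct += w * (y_true == y_pred)
        self.total += w

    def get(self):
        return self.correct / self.total if self.total else 0.0

# Simulated label stream; the toy "model" predicts the previous label.
stream = [1, 1, 0, 1, 1, 1, 0, 0]
metric = Accuracy()
prev = 0
for y in stream:
    y_pred = prev            # predict before the true label is revealed
    metric.update(y, y_pred)
    prev = y                 # then "learn" from the revealed label
print(metric.get())  # 4 of 8 correct -> 0.5
```

Evaluating each observation before learning from it is the same progressive-validation scheme that `evaluate.progressive_val_score` automates.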
Theoretical Basis
Definition:
Accuracy = total_true_positives / total_weight
Where:
- `total_true_positives` is the sum of the diagonal of the confusion matrix, i.e., the total weight of all correctly classified observations across all classes.
- `total_weight` is the total weight of all observations processed so far.
Incremental update: When a new observation arrives:
confusion_matrix[y_true][y_pred] += w
The accuracy is not recomputed from scratch; instead, the `get()` method reads directly from the confusion matrix's cached totals.
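One way to realize cached totals is to maintain the diagonal sum alongside the cell counts, so the query needs no summation at all. The sketch below illustrates that idea under stated assumptions; it is not necessarily how River implements it internally:

```python
from collections import defaultdict

class CachedAccuracy:
    """Accuracy with O(1) get() regardless of the number of classes.

    Cached-totals sketch for illustration, not River's actual internals.
    """

    def __init__(self):
        self.cells = defaultdict(float)   # (y_true, y_pred) -> weight
        self.correct_weight = 0.0         # cached diagonal total
        self.total_weight = 0.0

    def update(self, y_true, y_pred, w=1.0):
        self.cells[(y_true, y_pred)] += w
        if y_true == y_pred:
            self.correct_weight += w      # maintain the cache incrementally
        self.total_weight += w

    def get(self):
        # No diagonal sum needed: read the cached totals directly.
        return self.correct_weight / self.total_weight if self.total_weight else 0.0

m = CachedAccuracy()
for y_true, y_pred in [(0, 0), (1, 0), (1, 1)]:
    m.update(y_true, y_pred)
print(m.get())  # 2 of 3 correct
```

The trade-off is a few extra counters per metric in exchange for a constant-time query, which matters when accuracy is polled after every observation.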
For binary classification:
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Where TP = true positives, TN = true negatives, FP = false positives, FN = false negatives.
Relationship to error rate:
Error Rate = 1 - Accuracy
Limitation: Accuracy treats all misclassifications equally and can be misleading for imbalanced datasets. For example, a dataset with 95% negative samples achieves 95% accuracy with a trivial "always predict negative" classifier. In such cases, metrics like ROCAUC, F1-score, or balanced accuracy are more informative.
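The 95% figure can be checked directly. The label counts and the "always predict negative" baseline below are made up to mirror the example:

```python
# 95 negative (0) and 5 positive (1) labels.
labels = [0] * 95 + [1] * 5

# Trivial classifier: always predict the majority class (0).
correct = sum(1 for y in labels if y == 0)
accuracy = correct / len(labels)
print(accuracy)  # 0.95, despite never detecting a single positive
```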