Workflow:Online ml River Binary Classification Pipeline
| Knowledge Sources | |
|---|---|
| Domains | Online_ML, Classification, Streaming_Data |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
End-to-end process for building and evaluating a binary classification model on streaming data using composable pipelines and progressive validation.
Description
This workflow covers the standard procedure for performing binary classification with River on data that arrives one observation at a time. It uses the composable pipeline system to chain preprocessing transformers (e.g., feature scaling) with a classifier (e.g., logistic regression), then evaluates performance using progressive validation. Progressive validation scores each prediction before the model has seen that observation's label, so the reported metric reflects performance on unseen data. The process covers dataset loading, pipeline construction, incremental learning, and real-time metric tracking.
Usage
Execute this workflow when you have a labeled binary classification dataset (or a live data stream producing binary labels) and want to train and evaluate a model incrementally, without storing the entire dataset in memory. It is a natural starting point for binary classification tasks with River.
Execution Steps
Step 1: Load or Connect to a Data Stream
Obtain a stream of observations, where each observation is a dictionary of features paired with a binary label. River provides built-in datasets (e.g., Phishing, Elec2, CreditCard) that yield (x, y) tuples one at a time. For custom data, use stream utilities to iterate over CSV files, pandas DataFrames, or SQL queries.
Key considerations:
- Each observation is a Python dict of features and a target value
- Built-in datasets are iterables that produce (dict, bool/int) tuples
- For custom data, use river.stream.iter_csv, river.stream.iter_pandas, or river.stream.iter_sql
- Data never needs to fit entirely in memory
Step 2: Construct a Preprocessing Pipeline
Build a composable pipeline by chaining one or more transformers before the classifier. Transformers handle tasks like feature scaling (StandardScaler, MinMaxScaler), encoding (OneHotEncoder), and feature extraction (PolynomialExtender). Use the pipe operator to connect transformers sequentially.
Key considerations:
- Use the pipe operator (|) to chain transformers: scaler | encoder | model
- Unsupervised transformers (e.g., StandardScaler) are updated automatically when learn_one is called on the pipeline
- Pipelines implement the same learn_one/predict_one interface as individual models
- Feature unions can be created with the plus operator (+)
Step 3: Select and Configure a Classifier
Choose an appropriate binary classifier. For linear boundaries, use LogisticRegression with an optimizer (SGD, Adam, etc.). For non-linear boundaries, use tree-based models (HoeffdingTreeClassifier) or ensemble methods (AdaBoostClassifier, BaggingClassifier). Configure hyperparameters such as learning rate, regularization, or tree growth parameters.
Key considerations:
- LogisticRegression supports configurable optimizers and loss functions
- Tree classifiers (HoeffdingTreeClassifier) grow incrementally using Hoeffding bounds
- Ensemble methods (BaggingClassifier) use Poisson sampling for online bagging
- All classifiers implement predict_one(x) and learn_one(x, y)
Step 4: Choose Evaluation Metrics
Select one or more streaming metrics to track model performance. Binary classification metrics include Accuracy, Precision, Recall, F1, ROCAUC, and LogLoss. Metrics are updated incrementally with each prediction-label pair. Multiple metrics can be combined.
Key considerations:
- Metrics are updated one observation at a time via metric.update(y_true, y_pred)
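- ROCAUC and LogLoss are computed from predicted probabilities rather than hard labels; use them when the quality of the probability estimates matters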
- Rolling variants allow measuring performance over a sliding window
- ClassificationReport provides a comprehensive multi-metric view
Step 5: Run Progressive Validation
Evaluate the model using progressive validation (prequential evaluation). For each observation: first predict, then update the metric, then train the model. This order prevents data leakage and simulates production conditions. Use evaluate.progressive_val_score for automated evaluation, or implement the loop manually for custom logic.
Key considerations:
- The predict-evaluate-learn order prevents data leakage
- progressive_val_score handles the entire loop with optional time measurement
- iter_progressive_val_score yields intermediate results for plotting
- Supports delayed label arrival via the moment and delay parameters
Step 6: Inspect Results and Iterate
Examine the final metric values and optionally plot learning curves over time. Adjust the pipeline by trying different preprocessing steps, classifiers, or hyperparameters. Compare multiple models using evaluation tracks or by running parallel progressive validations.
Key considerations:
- Learning curves show how performance evolves over time
- Model memory usage can be measured during evaluation
- Use model_selection.BanditClassifier for automated online model comparison
- Pipelines can be cloned and modified for systematic comparison