Implementation: Online ML, River linear_model.LogisticRegression
| Knowledge Sources | River, River Docs |
|---|---|
| Domains | Online Learning, Classification, Optimization |
| Last Updated | 2026-02-08 16:00 GMT |
Overview
Concrete tool for performing online binary classification via logistic regression with stochastic gradient descent, supporting pluggable optimizers, L1/L2 regularization, and mini-batch learning.
Description
The linear_model.LogisticRegression class implements logistic regression as a specialization of River's Generalized Linear Model (GLM) base class. It uses the log-loss (binary cross-entropy) as the loss function and the sigmoid function as the mean function to convert raw linear scores into probabilities.
The model maintains a weight vector (self._weights) and an intercept (self.intercept). For each training example, it computes the gradient of the log-loss with respect to the weights and updates them using the configured optimizer. The intercept is updated separately using a dedicated learning rate.
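The per-example update described above can be sketched in plain Python. This is an illustrative re-implementation under stated assumptions, not River's actual code; the names `sigmoid`, `sgd_step`, `w`, `b`, `lr` are ours:

```python
import math


def sigmoid(z: float) -> float:
    """Mean function: maps a raw linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def sgd_step(w: dict, b: float, x: dict, y: int, lr: float = 0.01,
             intercept_lr: float = 0.01) -> tuple[dict, float]:
    """One plain-SGD update of the weights and intercept for log-loss."""
    score = sum(w.get(f, 0.0) * v for f, v in x.items()) + b
    p = sigmoid(score)
    # d(log-loss)/d(score) = p - y; the chain rule gives (p - y) * x_f per weight.
    g = p - y
    for f, v in x.items():
        w[f] = w.get(f, 0.0) - lr * g * v
    # The intercept is updated separately, with its own learning rate.
    b -= intercept_lr * g
    return w, b


# Tiny separable toy stream: x = +1 -> True, x = -1 -> False.
w, b = {}, 0.0
for _ in range(200):
    w, b = sgd_step(w, b, {"x": 1.0}, 1)
    w, b = sgd_step(w, b, {"x": -1.0}, 0)

print(sigmoid(w["x"] * 1.0 + b) > 0.5)  # positive side is classified True
```

The key point the sketch illustrates is that the whole update is driven by the scalar residual `p - y`, which is why the same mechanics generalize to any GLM by swapping the mean function and loss.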
Key capabilities:
- Pluggable optimizers: Defaults to `optim.SGD(0.01)` but accepts any optimizer from River's `optim` module (Adam, AdaGrad, RMSProp, etc.).
- L1 regularization: Uses a cumulative penalty approach for sparse weight vectors.
- L2 regularization: Adds a weight decay term to the gradient.
- Gradient clipping: Clamps gradient values to `[-clip_gradient, clip_gradient]` to prevent exploding gradients.
- Mini-batch support: Provides `learn_many`, `predict_many`, and `predict_proba_many` for DataFrame inputs.
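The mini-batch path, with L2 regularization and gradient clipping folded in, can be pictured as a vectorized version of the same update. The sketch below uses NumPy and names of our own choosing; it mirrors the behavior described above rather than River's internals:

```python
import numpy as np


def batch_step(w: np.ndarray, X: np.ndarray, y: np.ndarray,
               lr: float = 0.01, l2: float = 0.0,
               clip_gradient: float = 1e12) -> np.ndarray:
    """One mini-batch SGD step for log-loss, with L2 and gradient clipping."""
    p = 1.0 / (1.0 + np.exp(-X @ w))        # sigmoid of the linear scores
    grad = X.T @ (p - y) / len(y)           # average log-loss gradient over the batch
    grad += l2 * w                          # L2: weight-decay term added to the gradient
    grad = np.clip(grad, -clip_gradient, clip_gradient)  # clamp to avoid exploding updates
    return w - lr * grad


# Toy linearly separable batch.
rng = np.random.default_rng(42)
X = rng.normal(size=(64, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = (X @ true_w > 0).astype(float)

w = np.zeros(3)
for _ in range(500):
    w = batch_step(w, X, y, lr=0.5, l2=1e-3)

acc = ((X @ w > 0) == y.astype(bool)).mean()
print(acc)  # high training accuracy on this separable toy batch
```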
The class inherits from both linear_model.base.GLM (for the SGD mechanics) and base.MiniBatchClassifier (for mini-batch interface compliance).
Usage
Import this class when you need to:
- Perform binary classification on streaming data with a linear model.
- Obtain probability estimates for each class.
- Experiment with different optimizers, learning rates, or regularization strengths.
- Combine with a `StandardScaler` in a pipeline for best convergence behavior.
Code Reference
Source Location
| File | Lines |
|---|---|
| `river/linear_model/log_reg.py` | L8-L99 |
| `river/linear_model/base.py` (GLM base) | L14-L193 |
Signature
```python
class LogisticRegression(linear_model.base.GLM, base.MiniBatchClassifier):
    def __init__(
        self,
        optimizer: optim.base.Optimizer | None = None,
        loss: optim.losses.BinaryLoss | None = None,
        l2=0.0,
        l1=0.0,
        intercept_init=0.0,
        intercept_lr: float | optim.base.Scheduler = 0.01,
        clip_gradient=1e12,
        initializer: optim.base.Initializer | None = None,
    )

    # Inherited from GLM
    def learn_one(self, x: dict, y, w=1.0) -> None
    def learn_many(self, X: pd.DataFrame, y: pd.Series, w=1) -> None

    # Classification methods
    def predict_proba_one(self, x: dict) -> dict
    def predict_proba_many(self, X: pd.DataFrame) -> pd.DataFrame
```
Import
```python
from river import linear_model

model = linear_model.LogisticRegression()
```
I/O Contract
Inputs
| Parameter | Type | Default | Description |
|---|---|---|---|
| `optimizer` | `optim.base.Optimizer \| None` | `optim.SGD(0.01)` | Sequential optimizer for weight updates. |
| `loss` | `optim.losses.BinaryLoss \| None` | `optim.losses.Log()` | Loss function to optimize. Defaults to log-loss. |
| `l2` | `float` | `0.0` | L2 regularization strength. |
| `l1` | `float` | `0.0` | L1 regularization strength. |
| `intercept_init` | `float` | `0.0` | Initial intercept value. |
| `intercept_lr` | `float \| optim.base.Scheduler` | `0.01` | Learning rate for the intercept update. |
| `clip_gradient` | `float` | `1e12` | Maximum absolute value for gradient clipping. |
| `initializer` | `optim.base.Initializer \| None` | `optim.initializers.Zeros()` | Weight initialization scheme. |
| `x` (to `learn_one`) | `dict` | (required) | Feature dictionary. |
| `y` (to `learn_one`) | `bool` | (required) | Binary target label. |
Outputs
| Method | Return Type | Description |
|---|---|---|
| `predict_proba_one(x)` | `{False: float, True: float}` | Dictionary mapping each class to its predicted probability. |
| `predict_one(x)` | `bool` | The class with the highest predicted probability. |
| `predict_proba_many(X)` | `pd.DataFrame` | DataFrame with columns `False` and `True`, one row per sample. |
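The relationship between `predict_one` and `predict_proba_one` can be understood as a thin argmax wrapper over the probability dictionary. A minimal sketch (our own helper, mirroring the contract above, not River's code):

```python
def predict_from_proba(proba: dict[bool, float]) -> bool:
    """Pick the class with the highest predicted probability."""
    return max(proba, key=proba.get)


print(predict_from_proba({False: 0.45, True: 0.55}))  # True
```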
Usage Examples
Basic binary classification:
```python
from river import datasets, linear_model, metrics, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()
metric = metrics.Accuracy()

for x, y in datasets.Phishing():
    y_pred = model.predict_one(x)
    metric.update(y, y_pred)
    model.learn_one(x, y)

print(metric)
# Accuracy: 88.96%
```
With custom optimizer and regularization:
```python
from river import linear_model, optim

model = linear_model.LogisticRegression(
    optimizer=optim.SGD(0.1),
    l2=0.001,
    clip_gradient=1.0,
)
```
Getting probability estimates:
```python
from river import datasets, linear_model, preprocessing

model = preprocessing.StandardScaler() | linear_model.LogisticRegression()

for x, y in datasets.Phishing():
    proba = model.predict_proba_one(x)
    # e.g. {False: 0.45, True: 0.55}; before any learning the model returns {False: 0.5, True: 0.5}
    model.learn_one(x, y)
    break
```
With progressive validation:
```python
from river import datasets, evaluate, linear_model, metrics, optim, preprocessing

dataset = datasets.Phishing()
model = preprocessing.StandardScaler() | linear_model.LogisticRegression(optimizer=optim.SGD(0.1))
metric = metrics.Accuracy()

evaluate.progressive_val_score(dataset, model, metric)
# Accuracy: 88.96%
```