Principle:Online ml River Online Linear Models
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| Machine Learning Optimization | Online_Learning, Linear_Models, Classification | 2026-02-08 18:00 GMT |
Overview
Online linear models are a family of linear classifiers and regressors that update their weight vectors incrementally with each new observation. This family includes classical algorithms such as the Perceptron, Passive-Aggressive methods, ALMA, Bayesian linear regression, and softmax regression, each with distinct update rules and theoretical motivations.
Description
Linear models predict by computing a weighted sum of input features: hat{y} = w^T x (optionally plus a bias term). In the online setting, the weight vector w is updated after each observation, making these models naturally suited to streaming data. Different update rules yield different algorithms with distinct properties:
Perceptron: The simplest online linear classifier. It updates weights only when a misclassification occurs, adding or subtracting the input vector scaled by the learning rate. Despite its simplicity, the Perceptron convergence theorem guarantees that, on linearly separable data, it finds a separating hyperplane after a finite number of updates.
Passive-Aggressive (PA): A margin-based online learning algorithm that makes the smallest weight update necessary to correctly classify the current instance with a specified margin. When the prediction is correct with sufficient margin, the model is "passive" (no update). When the margin is violated, the model is "aggressive" (updates as much as needed).
ALMA (Approximate Large Margin Algorithm): An online algorithm that approximates the maximum margin classifier. It maintains a normalized weight vector and updates it based on margin violations with a diminishing learning rate, providing theoretical guarantees about the margin achieved.
Bayesian Linear Regression: Maintains a full posterior distribution over the weights rather than a point estimate. Each new observation updates the posterior via Bayes' rule, providing not just predictions but also uncertainty estimates. The prior and likelihood are both Gaussian, yielding closed-form posterior updates.
Softmax Regression: Extends logistic regression to the multi-class setting. Each class has its own weight vector, and predictions are made via the softmax function. Weights are updated incrementally using the cross-entropy loss gradient.
Usage
Use online linear models when:
- You need a lightweight, interpretable model for streaming data.
- You want well-understood convergence guarantees.
- Memory and computation per instance must be bounded.
- You need uncertainty quantification (Bayesian variant).
Theoretical Basis
Perceptron Update
For each instance (x, y) where y in {-1, +1}:
    hat{y} = sign(w^T x)
    if hat{y} != y:
        w = w + eta * y * x
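Convergence: If the data is linearly separable with margin gamma (measured against a unit-norm separating vector) and ||x|| <= R for every instance, the Perceptron makes at most (R / gamma)^2 mistakes.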
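The update rule above can be sketched in plain Python. This is a minimal illustration, not River's implementation: the function name `perceptron_pass`, the zero initialization, and the convention that a zero score predicts -1 are all assumptions made for the example.

```python
def perceptron_pass(stream, n_features, eta=1.0):
    """One pass of the Perceptron rule over (x, y) pairs with y in {-1, +1}."""
    w = [0.0] * n_features               # assumption: start from the zero vector
    mistakes = 0
    for x, y in stream:
        score = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if score > 0 else -1   # assumption: a zero score predicts -1
        if y_hat != y:                   # update only on a misclassification
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
            mistakes += 1
    return w, mistakes


# Toy data, separable by a hyperplane through the origin.
data = [([2.0, 1.0], 1), ([1.0, 2.0], 1), ([-1.0, -2.0], -1), ([-2.0, -1.0], -1)]
w, mistakes = perceptron_pass(data, n_features=2)
print(w, mistakes)  # prints [2.0, 1.0] 1
```

For this toy data the largest input norm is R = sqrt(5) and the unit vector (1, 1)/sqrt(2) separates the classes with margin gamma = 3/sqrt(2), so the classical mistake bound (R / gamma)^2 = 10/9 permits at most one mistake, which matches the single update observed.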
Passive-Aggressive Update
For each instance (x, y):
    loss = max(0, 1 - y * w^T x)          # hinge loss
    tau = loss / (||x||^2 + 1/(2*C))      # PA-II variant
    w = w + tau * y * x
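The PA family minimizes a trade-off between staying close to the current weight vector and satisfying the margin constraint on the current instance. The aggressiveness parameter C > 0 controls that trade-off: a larger C permits larger steps, and when the hinge loss is zero the step size tau is zero, so the weights are left unchanged.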
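As a rough sketch of the PA-II step written above, the following pure-Python function applies one update to a dense weight list. The name `pa2_update` and the default `C=1.0` are illustrative choices, not taken from any library.

```python
def pa2_update(w, x, y, C=1.0):
    """Return the weights after one PA-II step on (x, y), with y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)             # hinge loss
    sq_norm = sum(xi * xi for xi in x)        # ||x||^2
    tau = loss / (sq_norm + 1.0 / (2.0 * C))  # PA-II step size
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


# Aggressive case: the margin is violated, so the weights move toward y * x.
w = pa2_update([0.0, 0.0], [1.0, 0.0], 1, C=0.5)
print(w)  # prints [0.5, 0.0]

# Passive case: the margin is already at least 1, so tau is 0 and nothing changes.
print(pa2_update([2.0, 0.0], [1.0, 0.0], 1, C=0.5))  # prints [2.0, 0.0]
```

Because the 1/(2*C) term keeps the denominator positive, the step never fully closes the margin gap in one update: in the first call the margin rises from 0 to 0.5 rather than to 1.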
Bayesian Linear Regression
Prior: w ~ N(mu_0, Sigma_0), with known observation-noise variance sigma^2
For each instance (x, y):
    Posterior update (conjugate Gaussian):
        Sigma_n = (Sigma_{n-1}^{-1} + (1/sigma^2) * x * x^T)^{-1}
        mu_n = Sigma_n * (Sigma_{n-1}^{-1} * mu_{n-1} + (1/sigma^2) * y * x)
Prediction: hat{y} = mu_n^T x, with variance x^T Sigma_n x (parameter uncertainty only; the full predictive variance adds sigma^2)
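The posterior update can be sketched without an explicit matrix inverse. The example below is an assumption-laden illustration: it rewrites the update with the Sherman-Morrison identity, which gives the algebraically equivalent form Sigma_n = Sigma - (Sigma x)(Sigma x)^T / (sigma^2 + x^T Sigma x) and mu_n = mu + (Sigma x)(y - mu^T x) / (sigma^2 + x^T Sigma x), provided Sigma is symmetric. The names `blr_update`, `blr_predict`, and `noise_var` are hypothetical.

```python
def blr_update(mu, Sigma, x, y, noise_var=1.0):
    """One conjugate-Gaussian posterior update, in rank-one (Sherman-Morrison) form.

    Assumes Sigma is a symmetric covariance matrix stored as a list of lists.
    """
    d = len(x)
    Sx = [sum(Sigma[i][j] * x[j] for j in range(d)) for i in range(d)]  # Sigma x
    denom = noise_var + sum(x[i] * Sx[i] for i in range(d))  # sigma^2 + x^T Sigma x
    resid = y - sum(mu[i] * x[i] for i in range(d))          # y - mu^T x
    mu_new = [mu[i] + Sx[i] * resid / denom for i in range(d)]
    Sigma_new = [[Sigma[i][j] - Sx[i] * Sx[j] / denom for j in range(d)]
                 for i in range(d)]
    return mu_new, Sigma_new


def blr_predict(mu, Sigma, x):
    """Posterior mean mu^T x and parameter-uncertainty variance x^T Sigma x."""
    d = len(x)
    mean = sum(mu[i] * x[i] for i in range(d))
    var = sum(x[i] * Sigma[i][j] * x[j] for i in range(d) for j in range(d))
    return mean, var


# Standard-normal prior over two weights, one observation along the first axis.
mu, Sigma = blr_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], 2.0)
print(mu, Sigma)                           # prints [1.0, 0.0] [[0.5, 0.0], [0.0, 1.0]]
print(blr_predict(mu, Sigma, [1.0, 0.0]))  # prints (1.0, 0.5)
```

The observation only shrinks uncertainty along the direction it was taken in: the variance for the first weight halves while the second stays at its prior value.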
Softmax Regression
For each instance (x, y) with K classes:
    score_k = w_k^T x for each class k
    p_k = exp(score_k) / sum_j exp(score_j)       # softmax
    For each class k:
        w_k = w_k - eta * (p_k - I(y==k)) * x     # cross-entropy gradient
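A minimal pure-Python sketch of this per-instance step follows. The helper names `softmax` and `softmax_update`, the list-of-lists weight layout, and the max-subtraction trick for numerical stability are assumptions of the example rather than details given above.

```python
import math


def softmax(scores):
    """Softmax over a list of scores; subtracting the max avoids overflow."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_update(W, x, y, eta=0.1):
    """Apply one cross-entropy gradient step in place; W[k] is the weight list of class k.

    Returns the class probabilities computed before the update.
    """
    scores = [sum(wi * xi for wi, xi in zip(w_k, x)) for w_k in W]
    p = softmax(scores)
    for k, w_k in enumerate(W):
        err = p[k] - (1.0 if k == y else 0.0)   # p_k - I(y == k)
        W[k] = [wi - eta * err * xi for wi, xi in zip(w_k, x)]
    return p


# Three classes, two features, all weights starting at zero.
W = [[0.0, 0.0] for _ in range(3)]
x = [1.0, 0.0]
p_before = softmax_update(W, x, y=0, eta=0.3)
p_after = softmax([sum(wi * xi for wi, xi in zip(w_k, x)) for w_k in W])
print(p_before[0] < p_after[0])  # prints True
```

With zero weights every class starts at probability 1/3, so the step raises the true class's weight on the active feature by about 0.2 and lowers the other two by about 0.1 each, which increases the probability assigned to the observed class.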
Related Pages
- Implementation:Online_ml_River_Linear_Model_ALMAClassifier
- Implementation:Online_ml_River_Linear_Model_BayesianLinearRegression
- Implementation:Online_ml_River_Linear_Model_PA
- Implementation:Online_ml_River_Linear_Model_Perceptron
- Implementation:Online_ml_River_Linear_Model_SoftmaxRegression
- Principle:Online_ml_River_Online_Logistic_Regression
- Principle:Online_ml_River_Online_Linear_Regression