Principle:Online ml River Online Pattern Adaptation
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| River, River Docs | Online Machine Learning, Time Series Forecasting, Concept Drift | 2026-02-08 16:00 GMT |
Overview
A continuous adaptation mechanism in which forecaster parameters are updated with each new observation, so the model adjusts automatically to changing patterns in the time series.
Description
Online time series forecasters in River are inherently adaptive: their internal parameters are updated with every new observation, causing the model to continuously track evolving patterns in the data. This adaptation is built into the model formulation rather than requiring explicit drift detection or retraining triggers.
SNARIMAX and HoltWinters both adapt continuously, but through different mechanisms:
- SNARIMAX: The internal regressor (by default, LinearRegression with SGD) updates its weights at each step via stochastic gradient descent. As the autoregressive structure of the series changes, the weights on lag and error features evolve accordingly. The learning rate of the optimizer controls the adaptation speed.
- HoltWinters: The exponential smoothing equations inherently weight recent observations more heavily than older ones. The smoothing parameters (alpha, beta, gamma) directly control how quickly each component (level, trend, seasonal) responds to new data. Higher values mean faster adaptation but more noise sensitivity.
This implicit adaptation mechanism means that:
- No explicit concept drift detection is needed to follow gradual changes
- The model continuously tracks gradual changes in trend, seasonality, and autoregressive structure
- The trade-off between stability and responsiveness is controlled by model hyperparameters
Usage
Understand this pattern when:
- You need to forecast a time series whose patterns may change over time
- You want to know how River forecasters handle non-stationary behavior
- You are tuning hyperparameters that control adaptation speed (learning rate, smoothing parameters)
- You are comparing implicit adaptation (as in River) with explicit drift detection approaches
Theoretical Basis
SNARIMAX Adaptation via SGD
At each time step, SNARIMAX's regressor performs a single SGD update:
1. Construct feature vector x_t from lags and errors
2. Predict: ŷ'_t = w^T * x_t
3. Compute gradient: g = d/dw Loss(y'_t, ŷ'_t)
4. Update weights: w_{new} = w_{old} - lr * g
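The four steps above can be sketched in plain Python. This is a minimal illustration, not River's implementation: the helper names `sgd_step` and `run_ar` are hypothetical, the loss is assumed to be 0.5 * (prediction - target)^2, and only lag features are used (no error terms, differencing, or seasonality).

```python
from collections import deque


def sgd_step(weights, x, y, lr=0.01):
    """Apply one SGD update to a linear model under the loss 0.5 * (y_hat - y)**2."""
    y_hat = sum(w * xi for w, xi in zip(weights, x))           # step 2: predict
    grad = [(y_hat - y) * xi for xi in x]                      # step 3: gradient w.r.t. w
    return [w - lr * g for w, g in zip(weights, grad)], y_hat  # step 4: update


def run_ar(series, p=1, lr=0.1):
    """Feed a series through an AR(p)-style model, one SGD step per observation."""
    lags = deque([0.0] * p, maxlen=p)   # most recent value first
    weights = [0.0] * p
    history = []
    for y in series:
        x = list(lags)                  # step 1: features from lags (no error terms here)
        weights, _ = sgd_step(weights, x, y, lr)
        lags.appendleft(y)
        history.append(list(weights))
    return history


# Alternating series (lag weight near -1), then a constant series (lag weight near +1).
series = [(-1.0) ** t for t in range(200)] + [1.0] * 200
history = run_ar(series)
before_change = history[199][0]   # weight just before the pattern changes
after_change = history[-1][0]     # weight after 200 observations of the new pattern
```

In this toy run the single lag weight settles near -1 while the series alternates and then moves to about +1 once the series becomes constant, with no reset or drift signal involved.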
The key adaptation characteristics:
- Learning rate (lr): Controls the step size of weight updates. The default optimizer is SGD(0.01), i.e. a learning rate of 0.01. Higher values track changes faster but may oscillate; lower values are more stable but slower to adapt.
- Feature construction: As the lag buffer evolves, the features presented to the regressor naturally reflect recent series behavior.
- Error feedback: MA features capture recent prediction errors, providing a feedback signal when the model's assumptions no longer match reality.
HoltWinters Adaptation via Exponential Smoothing
The exponential smoothing update equations inherently implement a recency-weighted average:
Level: l_t = alpha * (current signal) + (1 - alpha) * (previous state)
Trend: b_t = beta * (current signal) + (1 - beta) * (previous state)
Season: s_t = gamma * (current signal) + (1 - gamma) * (previous state)
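To make "current signal" and "previous state" concrete, here is a sketch of one update step for the additive variant, following a common textbook formulation. The function name `holt_winters_step` and the list-based seasonal buffer are assumptions for illustration; River's HoltWinters may differ in details such as initialization and the multiplicative option.

```python
def holt_winters_step(y, level, trend, season, alpha, beta, gamma):
    """One additive Holt-Winters update (textbook form, assumed here).

    `season` holds the last m seasonal terms, oldest first, so season[0]
    is the term from one full seasonal period ago.
    """
    s_old = season[0]
    # Level: deseasonalized observation vs. previous level extrapolated by the trend.
    new_level = alpha * (y - s_old) + (1 - alpha) * (level + trend)
    # Trend: latest change in level vs. previous trend.
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    # Season: observation minus the previous level and trend vs. the old seasonal term.
    new_season = gamma * (y - level - trend) + (1 - gamma) * s_old
    return new_level, new_trend, season[1:] + [new_season]


level, trend, season = holt_winters_step(
    y=10.0, level=8.0, trend=1.0, season=[0.5, -0.5],
    alpha=0.5, beta=0.5, gamma=0.5,
)
```

With all three smoothing parameters at 0.5, each component lands halfway between its previous state and the value implied by the new observation.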
This exponential weighting means:
- The effective weight of an observation k steps in the past decays as (1-alpha)^k
- alpha close to 1: Level tracks the raw data closely (fast adaptation, more noise)
- alpha close to 0: Level changes slowly (stable, but may lag behind true changes)
- The same trade-off applies to beta (trend adaptation) and gamma (seasonal adaptation)
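The geometric decay can be checked numerically. The sketch below uses simple exponential smoothing (level only, no trend or season) and hypothetical helper names; it shows that the recursive update equals a weighted sum in which the observation k steps back carries weight alpha * (1 - alpha)^k.

```python
def smooth(series, alpha, init=0.0):
    """Recursive form: level = alpha * y + (1 - alpha) * level."""
    level = init
    for y in series:
        level = alpha * y + (1 - alpha) * level
    return level


def unrolled(series, alpha, init=0.0):
    """Equivalent explicit weighted sum over past observations."""
    total = (1 - alpha) ** len(series) * init
    for k, y in enumerate(reversed(series)):   # k = 0 is the newest observation
        total += alpha * (1 - alpha) ** k * y
    return total


data = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
recursive_level = smooth(data, alpha=0.3)
explicit_level = unrolled(data, alpha=0.3)

# Weights on the three most recent observations for alpha = 0.5.
weights = [0.5 * (1 - 0.5) ** k for k in range(3)]
```

For alpha = 0.5 the newest three observations receive weights 0.5, 0.25, and 0.125, so each extra step into the past halves an observation's influence.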
Implicit vs. Explicit Drift Handling
| Approach | Mechanism | Used By |
|---|---|---|
| Implicit adaptation | Model parameters continuously evolve with each observation | SNARIMAX (SGD), HoltWinters (smoothing) |
| Explicit drift detection | A separate detector (e.g., ADWIN, Page-Hinkley) triggers model reset/retraining | Not required for River forecasters |
The implicit approach is simpler and avoids the overhead of maintaining a drift detector, but it cannot perform sudden model resets in response to abrupt concept shifts. For gradual drift, implicit adaptation is generally sufficient.
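A rough way to quantify the cost of not resetting: for simple exponential smoothing of the level (an assumed simplification, ignoring trend and season), the error left over from an abrupt level shift shrinks by a factor of (1 - alpha) per step. The hypothetical helper below counts how many observations are needed before the remaining error drops to a given fraction of the shift.

```python
def steps_to_recover(alpha, tolerance=0.05):
    """Steps until the residual of a level shift is at most `tolerance` of its size."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    remaining = 1.0   # fraction of the shift not yet absorbed
    steps = 0
    while remaining > tolerance:
        remaining *= 1 - alpha
        steps += 1
    return steps


slow = steps_to_recover(0.1)     # conservative smoothing
medium = steps_to_recover(0.5)
fast = steps_to_recover(0.9)     # aggressive smoothing
```

Under these assumptions, alpha = 0.1 needs 29 observations to absorb 95% of an abrupt shift, whereas alpha = 0.5 needs 5 and alpha = 0.9 needs 2. An explicit reset would shorten that lag at the cost of maintaining a detector.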
Adaptation Speed Trade-offs
Fast adaptation (high lr / high alpha):
+ Quickly tracks pattern changes
- More sensitive to noise and outliers
- May overfit to recent anomalies
Slow adaptation (low lr / low alpha):
+ Stable, smooth predictions
+ Robust to transient anomalies
- Slow to respond to genuine pattern changes
- May underperform after structural breaks
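Both sides of the trade-off show up on a small synthetic series. The sketch below is illustrative only: it uses simple exponential smoothing with a hypothetical `track` helper, and the series (one outlier, then a permanent level shift) is made up for the example.

```python
def track(series, alpha):
    """Return the smoothed level after each observation."""
    level = series[0]
    levels = []
    for y in series:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


# 20 zeros, a single outlier of 10 at index 20, 20 more zeros,
# then a permanent shift to 5 starting at index 41.
series = [0.0] * 20 + [10.0] + [0.0] * 20 + [5.0] * 20

fast = track(series, alpha=0.8)
slow = track(series, alpha=0.1)

outlier_reaction = (fast[20], slow[20])              # estimate right after the outlier
shift_error = (5.0 - fast[43], 5.0 - slow[43])       # error 3 steps after the shift
```

The fast setting jumps most of the way to the outlier but is within 0.05 of the new level three steps after the shift; the slow setting barely reacts to the outlier but is still more than 3 units below the new level at the same point.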