Implementation:Online ml River Optim Averager

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Optimization
Last Updated: 2026-02-08 16:00 GMT

Overview

Averager is a wrapper optimizer that returns averaged weights from any base stochastic gradient descent optimizer.

Description

The Averager optimizer wraps another optimizer and maintains a running average of the weights that optimizer produces during training. Unlike traditional weight averaging, which is typically applied only at the end of training, this implementation returns the current averaged weights at every step. Averaging can be delayed by a given number of iterations via the start parameter, so the base optimizer runs unaveraged for an initial period. Averaging smooths out the oscillations that stochastic gradient descent can exhibit, which reduces the variance of the weights and can improve generalization.
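The running average with a delayed start can be sketched in plain Python. This is a minimal illustration, not River's source: ToySGD and ToyAverager are hypothetical names, weights and gradients are assumed to be plain dicts, and the wrapper is assumed to hand back the averaged weights once averaging has begun (so the caller's next step starts from them).

```python
from collections import defaultdict


class ToySGD:
    """Hypothetical base optimizer: plain gradient descent on dict weights."""

    def __init__(self, lr=0.1):
        self.lr = lr

    def step(self, w, g):
        return {k: wk - self.lr * g.get(k, 0.0) for k, wk in w.items()}


class ToyAverager:
    """Hypothetical wrapper that averages the weights returned by a base optimizer."""

    def __init__(self, optimizer, start=0):
        self.optimizer = optimizer
        self.start = start
        self.n_iterations = 0
        self.avg_w = defaultdict(float)

    def step(self, w, g):
        # Let the wrapped optimizer perform its usual update first.
        w = self.optimizer.step(w, g)
        n = self.n_iterations
        self.n_iterations += 1

        # During the first `start` iterations, pass the raw weights through.
        if n < self.start:
            return w

        # Incremental mean: avg += (new - avg) / number_of_averaged_steps.
        count = n - self.start + 1
        for k, wk in w.items():
            self.avg_w[k] += (wk - self.avg_w[k]) / count
        return dict(self.avg_w)


opt = ToyAverager(ToySGD(lr=1.0), start=1)
w = {"x": 0.0}
history = []
for grad in (-1.0, -2.0, -4.0):
    w = opt.step(w, {"x": grad})
    history.append(w["x"])
```

The first step returns the raw weight because averaging has not started yet; from the second step onward the returned value is the mean of the weights seen since start.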

Usage

Import Averager from river.optim and wrap any base optimizer with it. It is useful for improving the stability and generalization of models trained with SGD-based optimizers.

Code Reference

Source Location

Signature

class Averager(optim.base.Optimizer):
    def __init__(self, optimizer: optim.base.Optimizer, start: int = 0):
        ...

Import

from river import optim

I/O Contract

Inputs

Name | Type | Required | Description
optimizer | optim.base.Optimizer | Yes | Base optimizer whose weights will be averaged
start | int | No (default=0) | Number of iterations to wait before starting the average
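To give a feel for what the start parameter changes, here is a small self-contained toy experiment. It does not use River: the helper run is hypothetical, the problem is a 1-D quadratic with an artificial alternating noise term, and the raw trajectory is kept separate from the average for simplicity. Treat it as a sketch under those assumptions.

```python
def run(start, n_steps=50, lr=0.5, target=3.0):
    """Return (last raw weight, averaged weight) on a 1-D toy problem."""
    w = 0.0
    avg = 0.0
    for t in range(n_steps):
        # Deterministic stand-in for gradient noise: alternates +1, -1.
        noise = 1.0 if t % 2 == 0 else -1.0
        grad = (w - target) + noise
        w -= lr * grad
        # Only fold weights into the running mean once `start` steps have passed.
        if t >= start:
            avg += (w - avg) / (t - start + 1)
    return w, avg


raw_last, avg_from_0 = run(start=0)
_, avg_from_10 = run(start=10)

err_raw = abs(raw_last - 3.0)    # last raw iterate keeps oscillating around the target
err_0 = abs(avg_from_0 - 3.0)    # average includes the early transient
err_10 = abs(avg_from_10 - 3.0)  # average skips the early transient
```

In this toy setting the raw weight keeps bouncing around the target, the average started at iteration 0 is pulled off by the first few far-away weights, and the average started after 10 iterations ends up closest to the target. How large start should be in practice depends on how quickly the base optimizer leaves its initial transient.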

Outputs

Name | Type | Description
optimizer | Averager | Wrapped optimizer that returns averaged weights

Usage Examples

from river import datasets
from river import evaluate
from river import linear_model
from river import metrics
from river import optim
from river import preprocessing

# Wrap SGD with averaging, start after 100 iterations
optimizer = optim.Averager(optim.SGD(0.01), start=100)

# Use with a linear model
dataset = datasets.Phishing()
model = (
    preprocessing.StandardScaler() |
    linear_model.LogisticRegression(optimizer)
)
metric = metrics.F1()

# Evaluate
score = evaluate.progressive_val_score(dataset, model, metric)
print(score)  # F1: 87.97%

# Can wrap any optimizer
optimizer = optim.Averager(optim.Adam(), start=50)
model = linear_model.LogisticRegression(optimizer)

# Start averaging immediately
optimizer = optim.Averager(optim.Momentum(lr=0.01))
model = linear_model.LogisticRegression(optimizer)

# Useful for stabilizing training
base_optim = optim.SGD(lr=0.1)  # Higher learning rate
averaged_optim = optim.Averager(base_optim, start=200)
model = linear_model.LogisticRegression(averaged_optim)
