Implementation:Online ml River Optim NesterovMomentum

Knowledge Sources	Online_ml_River
Domains	Online_Learning, Optimization
Last Updated	2026-02-08 16:00 GMT

Overview

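Nesterov Momentum is a variant of momentum-based gradient descent that evaluates the gradient at a look-ahead position (the current weights shifted by the momentum term) rather than at the current weights, which often improves convergence.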

Description

Nesterov Momentum, also known as Nesterov Accelerated Gradient (NAG), differs from standard momentum in where it evaluates the gradient: not at the current weights, but at the approximate future position the momentum term would carry them to. Because the gradient reflects where the weights are heading, the update can correct course earlier. The implementation uses the look_ahead method to temporarily shift the weights by the momentum term before the gradient is computed; the update step then moves them back and applies the full update. This change often gives faster convergence than standard momentum, especially near a minimum, where the look-ahead gradient helps the optimizer slow down before it overshoots.
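The look-ahead, step-back, and update sequence described above can be sketched in plain Python. This is a minimal illustration rather than River's source: the class name NesterovSketch, the helper minimize, and the toy objective (x - 3)^2 are assumptions made for the example, and weights and gradients are assumed to be plain dicts keyed by feature name.

```python
from collections import defaultdict


class NesterovSketch:
    """Illustrative stand-in for a Nesterov optimizer (hypothetical, not River's class)."""

    def __init__(self, lr=0.1, rho=0.9):
        self.lr = lr
        self.rho = rho
        self.s = defaultdict(float)  # per-weight velocity, starts at zero

    def look_ahead(self, w):
        # Shift the weights to where the momentum term alone would carry them.
        for i in w:
            w[i] -= self.rho * self.s[i]
        return w

    def step(self, w, g):
        # Undo the look-ahead shift so w is back at its original position.
        for i in w:
            w[i] += self.rho * self.s[i]
        # Fold the look-ahead gradient into the velocity, then apply it.
        for i, gi in g.items():
            self.s[i] = self.rho * self.s[i] + self.lr * gi
            w[i] -= self.s[i]
        return w


def minimize(opt, target=3.0, n_steps=200):
    """Minimize the toy objective (x - target)^2 starting from x = 0."""
    w = {"x": 0.0}
    for _ in range(n_steps):
        opt.look_ahead(w)
        # Gradient evaluated at the look-ahead position, not the original one.
        g = {"x": 2.0 * (w["x"] - target)}
        opt.step(w, g)
    return w["x"]


result = minimize(NesterovSketch(lr=0.1, rho=0.9))
print(result)
```

On this toy problem, result should end up very close to the minimizer 3.0. The key point is the call order: look_ahead, then the gradient, then step.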

Usage

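Import from river.optim and pass an instance to any River model that accepts an optimizer, such as linear_model.LogisticRegression or linear_model.LinearRegression. Consider it over standard Momentum when plain momentum overshoots or oscillates around a minimum.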

Code Reference

Source Location

Repository: Online_ml_River
File: river/optim/nesterov.py

Signature

class NesterovMomentum(optim.base.Optimizer):
    def __init__(self, lr=0.1, rho=0.9):
        ...

    def look_ahead(self, w):
        ...

Import

from river import optim

I/O Contract

Inputs

Name	Type	Required	Description
lr	float	No (default=0.1)	Learning rate
rho	float	No (default=0.9)	Momentum parameter (fraction of previous update to retain)
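To give a feel for how lr and rho interact, the sketch below iterates a momentum-style velocity recursion, s <- rho * s + lr * g, under a constant gradient g. The recursion is an assumed form of the update (a common way to write momentum), and the function name terminal_velocity is hypothetical. Under that assumption the velocity approaches lr * g / (1 - rho), so with the default values the effective step grows to roughly ten times lr.

```python
def terminal_velocity(lr=0.1, rho=0.9, grad=1.0, n_steps=200):
    """Iterate s <- rho * s + lr * grad under a constant gradient."""
    s = 0.0
    for _ in range(n_steps):
        s = rho * s + lr * grad
    return s


simulated = terminal_velocity()
closed_form = 0.1 * 1.0 / (1 - 0.9)  # lr * grad / (1 - rho)
print(simulated, closed_form)
```

Raising rho therefore amplifies the step size as well as smoothing it, which is one reason lr is often lowered when rho is increased.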

Outputs

Name	Type	Description
optimizer	NesterovMomentum	Configured optimizer instance ready for model training

Usage Examples

from river import datasets
from river import evaluate
from river import linear_model
from river import metrics
from river import optim
from river import preprocessing

# Create Nesterov Momentum optimizer
optimizer = optim.NesterovMomentum()

# Use with a linear model
dataset = datasets.Phishing()
model = (
    preprocessing.StandardScaler() |
    linear_model.LogisticRegression(optimizer)
)
metric = metrics.F1()

# Evaluate
score = evaluate.progressive_val_score(dataset, model, metric)
print(score)  # F1: 84.22%

# Custom parameters
optimizer = optim.NesterovMomentum(lr=0.05, rho=0.95)
model = linear_model.LogisticRegression(optimizer)

# Compare with standard Momentum under identical hyperparameters
momentum = optim.Momentum(lr=0.1, rho=0.9)
nesterov = optim.NesterovMomentum(lr=0.1, rho=0.9)

for name, opt in [("Momentum", momentum), ("NesterovMomentum", nesterov)]:
    model = preprocessing.StandardScaler() | linear_model.LogisticRegression(opt)
    score = evaluate.progressive_val_score(datasets.Phishing(), model, metrics.F1())
    print(name, score)

# Use with a regression model and a smaller learning rate
optimizer = optim.NesterovMomentum(lr=0.01, rho=0.9)
model = linear_model.LinearRegression(optimizer)
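The difference between the two update rules can be seen on a toy problem without any River dependency. The sketch below is an assumption-laden illustration, not a benchmark: the function run, the objective (x - 3)^2, and the choice to compare the worst error over the second half of the run are all made up for this example. It shows one case where the look-ahead gradient damps oscillations faster, not a general guarantee.

```python
def run(nesterov, lr=0.1, rho=0.9, target=3.0, n_steps=200):
    """Return |x - target| after each step on f(x) = (x - target)^2."""
    x, s, errors = 0.0, 0.0, []
    for _ in range(n_steps):
        # Nesterov evaluates the gradient at the look-ahead point x - rho * s;
        # classical momentum evaluates it at the current point x.
        point = x - rho * s if nesterov else x
        grad = 2.0 * (point - target)
        s = rho * s + lr * grad
        x -= s
        errors.append(abs(x - target))
    return errors


momentum_errors = run(nesterov=False)
nesterov_errors = run(nesterov=True)

# Worst error over the second half of the run, to smooth out oscillations.
momentum_tail = max(momentum_errors[100:])
nesterov_tail = max(nesterov_errors[100:])
print(f"momentum: {momentum_tail:.2e}, nesterov: {nesterov_tail:.2e}")
```

On this quadratic with lr=0.1 and rho=0.9, the Nesterov run's tail error is the smaller of the two. Results on real data streams depend on the model, the data, and the hyperparameters.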

Related Pages

Environment:Online_ml_River_Python_Runtime_Environment
