Implementation:Online ml River Optim NesterovMomentum

From Leeroopedia


Knowledge Sources
Domains: Online_Learning, Optimization
Last Updated: 2026-02-08 16:00 GMT

Overview

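Nesterov Momentum is a variant of momentum-based gradient descent that evaluates the gradient at a look-ahead position (the current weights shifted by the momentum term) rather than at the current weights, which often improves convergence.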

Description

Nesterov Momentum, also known as Nesterov Accelerated Gradient (NAG), improves on standard momentum by computing the gradient not at the current position but at the approximate future position that the momentum term alone would reach. Because the gradient reflects where the weights are heading, the update can correct course before an overshoot happens. The implementation uses the look_ahead method to temporarily move the weights forward by the momentum term before the gradient is computed; the update step then moves the weights back and applies the full update. This small change often gives faster convergence than standard momentum, especially near a minimum, where the look-ahead gradient damps the accumulated velocity and the optimizer slows down more gracefully.
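The look-ahead, step-back, and update sequence described above can be sketched in plain Python. This is a minimal illustration on a toy quadratic loss, not River's source code: the names grad, nesterov_step, and velocity are hypothetical, and the velocity recurrence (velocity = rho * velocity + lr * gradient) is an assumed, commonly used formulation.

```python
from collections import defaultdict


def grad(w):
    """Gradient of the toy loss f(w) = 0.5 * sum(c_i * w_i**2) (hypothetical example)."""
    curvature = {"a": 1.0, "b": 10.0}
    return {k: curvature[k] * v for k, v in w.items()}


def nesterov_step(w, velocity, grad_fn, lr=0.1, rho=0.9):
    """One update: look ahead, take the gradient there, step back, update."""
    # 1. Look ahead: temporarily move the weights by the momentum term.
    for k in w:
        w[k] -= rho * velocity[k]
    # 2. Compute the gradient at the look-ahead position.
    g = grad_fn(w)
    # 3. Step back to the original position.
    for k in w:
        w[k] += rho * velocity[k]
    # 4. Apply the full update: refresh the velocity, then move the weights.
    for k, gk in g.items():
        velocity[k] = rho * velocity[k] + lr * gk
        w[k] -= velocity[k]
    return w


weights = {"a": 1.0, "b": 1.0}
velocity = defaultdict(float)  # accumulated update per weight, starts at zero
for _ in range(200):
    nesterov_step(weights, velocity, grad, lr=0.05, rho=0.9)
print(weights)  # both weights end up very close to the minimum at 0
```

Keeping the look-ahead and the update as separate phases means the caller only has to compute the gradient between them, which is why the optimizer can expose the look-ahead as its own method.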

Usage

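Import from river.optim and pass an instance as the optimizer of any River model that accepts one, such as linear_model.LogisticRegression or linear_model.LinearRegression. Prefer it over standard Momentum when faster or more stable convergence matters.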

Code Reference

Source Location

Signature

class NesterovMomentum(optim.base.Optimizer):
    def __init__(self, lr=0.1, rho=0.9):
        ...

    def look_ahead(self, w):
        ...

Import

from river import optim

I/O Contract

Inputs

Name | Type | Required | Description
lr | float | No (default=0.1) | Learning rate
rho | float | No (default=0.9) | Momentum parameter (fraction of previous update to retain)
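To give a feel for how lr and rho interact, the sketch below iterates the velocity under a constant gradient. It assumes the common recurrence velocity = rho * velocity + lr * gradient (an assumption about the update form, not a quotation of the library's code); steady_state_step is a hypothetical helper name.

```python
def steady_state_step(lr=0.1, rho=0.9, gradient=1.0, n_steps=200):
    """Iterate the assumed velocity recurrence under a constant gradient."""
    velocity = 0.0
    for _ in range(n_steps):
        # Retain a fraction rho of the previous update, add the new gradient step.
        velocity = rho * velocity + lr * gradient
    return velocity


# With a constant gradient the velocity approaches lr * gradient / (1 - rho),
# so the defaults behave like a plain step roughly ten times larger than lr.
print(steady_state_step())
print(steady_state_step(rho=0.5))
```

Under this assumption, raising rho toward 1 increases the effective step size, so a larger rho is usually paired with a smaller lr.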

Outputs

Name | Type | Description
optimizer | NesterovMomentum | Configured optimizer instance ready for model training

Usage Examples

from river import datasets
from river import evaluate
from river import linear_model
from river import metrics
from river import optim
from river import preprocessing

# Create Nesterov Momentum optimizer
optimizer = optim.NesterovMomentum()

# Use with a linear model
dataset = datasets.Phishing()
model = (
    preprocessing.StandardScaler() |
    linear_model.LogisticRegression(optimizer)
)
metric = metrics.F1()

# Evaluate
score = evaluate.progressive_val_score(dataset, model, metric)
print(score)  # F1: 84.22%

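# Custom parameters: smaller learning rate, more of the previous update retained
optimizer = optim.NesterovMomentum(lr=0.05, rho=0.95)
model = linear_model.LogisticRegression(optimizer)

# Compare with standard Momentum using identical hyperparameters
momentum = optim.Momentum(lr=0.1, rho=0.9)
nesterov = optim.NesterovMomentum(lr=0.1, rho=0.9)

model1 = linear_model.LogisticRegression(momentum)
model2 = linear_model.LogisticRegression(nesterov)

# Score each model on its own fresh pass over the data
for name, candidate in [("Momentum", model1), ("NesterovMomentum", model2)]:
    pipeline = preprocessing.StandardScaler() | candidate
    result = evaluate.progressive_val_score(datasets.Phishing(), pipeline, metrics.F1())
    print(name, result)

# Also usable for regression; Nesterov momentum often (not always)
# converges faster than standard momentum
optimizer = optim.NesterovMomentum(lr=0.01, rho=0.9)
model = linear_model.LinearRegression(optimizer)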
