Implementation:Scikit learn Scikit learn SAG Solver

Knowledge Sources	Scikit_learn Scikit-learn Docs
Domains	Machine Learning, Stochastic Optimization
Last Updated	2026-02-08 15:00 GMT

Overview

A concrete tool, provided by scikit-learn, for solving Ridge and Logistic Regression optimization problems with the Stochastic Average Gradient (SAG) algorithm and its SAGA variant.

Description

The sag_solver function implements the Stochastic Average Gradient (SAG) and SAGA optimization algorithms for fitting Ridge regression and Logistic Regression models. SAG keeps a memory of the most recent gradient computed for each sample and steps along the average of those stored gradients, so each iteration evaluates only one new gradient; for smooth, strongly convex objectives this gives a linear convergence rate. SAGA is a variant that supports non-strongly convex composite objectives and non-smooth penalties such as L1. The module also provides the get_auto_step_size helper, which computes a default step size from the maximum squared sum of features over samples (the largest squared row norm of X) and the scaled regularization parameter.
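The running-average update described above can be sketched in plain Python. This is a minimal illustration for the squared loss without an intercept, not scikit-learn's compiled implementation. The name sag_ridge_sketch, its parameters, and the step size 1 / (largest squared row norm + alpha / n_samples) are assumptions made for this sketch.

```python
import random


def sag_ridge_sketch(X, y, alpha=1.0, n_epochs=200, seed=0):
    """Hypothetical SAG sketch for ridge regression without an intercept.

    Minimizes (1/n) * sum_i 0.5 * (x_i . w - y_i)**2 + 0.5 * (alpha/n) * ||w||**2.
    """
    rng = random.Random(seed)
    n_samples, n_features = len(X), len(X[0])
    alpha_scaled = alpha / n_samples
    # Assumed step size: 1 / L with L = max squared row norm + scaled alpha.
    max_squared_sum = max(sum(v * v for v in row) for row in X)
    step = 1.0 / (max_squared_sum + alpha_scaled)

    w = [0.0] * n_features
    residual_memory = [0.0] * n_samples  # last residual seen for each sample
    gradient_sum = [0.0] * n_features    # sum of the stored per-sample gradients
    seen = [False] * n_samples
    n_seen = 0

    for _ in range(n_epochs * n_samples):
        i = rng.randrange(n_samples)
        if not seen[i]:
            seen[i] = True
            n_seen += 1
        # Only one new gradient is computed per iteration: that of sample i.
        residual = sum(wj * xj for wj, xj in zip(w, X[i])) - y[i]
        delta = residual - residual_memory[i]
        residual_memory[i] = residual
        for j in range(n_features):
            # Swap sample i's old gradient for the new one in the running sum.
            gradient_sum[j] += delta * X[i][j]
            # L2 shrinkage plus a step along the average stored gradient.
            w[j] = w[j] * (1.0 - step * alpha_scaled) - step * gradient_sum[j] / n_seen
    return w


X_demo = [[1.0], [2.0], [3.0]]
y_demo = [2.0, 4.0, 6.0]
w_demo = sag_ridge_sketch(X_demo, y_demo, alpha=1.0)
# One-feature ridge solution for comparison: sum(x*y) / (sum(x*x) + alpha).
closed_form = sum(x[0] * t for x, t in zip(X_demo, y_demo)) / (
    sum(x[0] ** 2 for x in X_demo) + 1.0
)
print("SAG estimate:", w_demo[0])
print("Closed form: ", closed_form)
```

Storing one scalar residual per sample, rather than a full gradient vector, is enough for linear models because each sample's gradient is that residual times its feature row.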

Usage

The SAG solver is used internally by Ridge and LogisticRegression when solver='sag' or solver='saga' is specified. It is most useful on large datasets, where computing the full gradient at each step would be expensive: SAG/SAGA typically converge faster than plain SGD while keeping a per-iteration cost of one sample gradient. As the scikit-learn documentation notes, fast convergence is only guaranteed when features are on approximately the same scale, so standardizing the inputs first is advisable.
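As an illustration of selecting the solver through the public estimators, the sketch below fits LogisticRegression with solver="saga" on synthetic data. The dataset sizes and max_iter value are arbitrary choices for the example. The StandardScaler step is included because the scikit-learn documentation recommends similarly scaled features for these solvers.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification problem (sizes chosen only for illustration).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Standardize features, then fit with the SAGA solver.
clf = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver="saga", max_iter=200, random_state=0),
)
clf.fit(X, y)
print("Training accuracy:", clf.score(X, y))
```

Swapping solver="saga" for solver="sag" uses the plain SAG variant with the same calling pattern.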

Code Reference

Source Location

Repository: scikit-learn
File: sklearn/linear_model/_sag.py

Signature

def sag_solver(
    X,
    y,
    sample_weight=None,
    loss="log",
    alpha=1.0,
    beta=0.0,
    max_iter=1000,
    tol=0.001,
    verbose=0,
    random_state=None,
    check_input=True,
    max_squared_sum=None,
    warm_start_mem=None,
    is_saga=False,
):

def get_auto_step_size(
    max_squared_sum, alpha_scaled, loss, fit_intercept,
    n_samples=None, is_saga=False,
):
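The helper above takes the largest squared row norm and the scaled regularization strength and returns a step size. The sketch below shows one way such a rule can be written, based on a Lipschitz-constant bound: a 1/L step for SAG and a smaller step for SAGA. The function name auto_step_size_sketch is hypothetical, and the constants are an assumption to be checked against sklearn/linear_model/_sag.py rather than a copy of the library code.

```python
def auto_step_size_sketch(max_squared_sum, alpha_scaled, loss, fit_intercept,
                          n_samples=None, is_saga=False):
    """Hypothetical step-size rule built from a Lipschitz-constant bound."""
    if loss in ("log", "multinomial"):
        # Logistic-type losses: curvature is at most 1/4 of the squared norm.
        lipschitz = 0.25 * (max_squared_sum + int(fit_intercept)) + alpha_scaled
    elif loss == "squared":
        lipschitz = max_squared_sum + int(fit_intercept) + alpha_scaled
    else:
        raise ValueError(f"Unknown loss: {loss!r}")

    if is_saga:
        if n_samples is None:
            raise ValueError("n_samples is required when is_saga=True")
        # SAGA variant: smaller step that also accounts for strong convexity.
        mu_n = min(2 * n_samples * alpha_scaled, lipschitz)
        return 1.0 / (2 * lipschitz + mu_n)
    # SAG variant: plain 1 / L step.
    return 1.0 / lipschitz


print(auto_step_size_sketch(9.0, 1.0, "squared", fit_intercept=False))
```

Here alpha_scaled is assumed to be alpha divided by the number of samples, and int(fit_intercept) accounts for the constant feature added when an intercept is fitted.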

Import

from sklearn.linear_model._sag import sag_solver

I/O Contract

Inputs

Name	Type	Required	Description
X	ndarray or sparse matrix	Yes	Training data of shape (n_samples, n_features)
y	ndarray	Yes	Target values of shape (n_samples,); with loss='multinomial', y must be label encoded
sample_weight	ndarray	No	Weight assigned to each sample (default=None)
loss	str	No	Loss function: 'log', 'squared', or 'multinomial' (default='log')
alpha	float	No	L2 regularization strength (default=1.0)
beta	float	No	L1 regularization strength; only applied when is_saga=True (default=0.0)
max_iter	int	No	Maximum number of passes over the data (default=1000)
tol	float	No	Stopping tolerance on the relative change of the weights between passes (default=0.001)
verbose	int	No	Verbosity level (default=0)
random_state	int or RandomState	No	Random seed for sample selection (default=None)
check_input	bool	No	Whether to validate the input arrays X and y (default=True)
max_squared_sum	float	No	Maximum squared sum of X over samples, used for step size computation; computed from X when None (default=None)
warm_start_mem	dict	No	Dictionary containing previous gradient memory for warm start (default=None)
is_saga	bool	No	Whether to use SAGA variant instead of SAG (default=False)

Outputs

Name	Type	Description
coef_	ndarray	Optimized weight coefficients
n_iter_	int	Number of passes over the data actually performed
warm_start_mem	dict	Gradient memory dict that can be used for warm starting
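To make the I/O contract concrete, the sketch below calls sag_solver directly with the signature listed on this page and unpacks the three outputs. Note that sklearn.linear_model._sag is a private module, so this call pattern is an assumption that may break between scikit-learn versions. The dataset sizes are arbitrary.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model._sag import sag_solver  # private module; may change

# Small synthetic regression problem (sizes chosen only for illustration).
X, y = make_regression(n_samples=500, n_features=5, noise=1.0, random_state=0)

# First run: returns the coefficients, the number of passes, and the memory dict.
coef, n_iter, mem = sag_solver(
    X, y, loss="squared", alpha=1.0, max_iter=100, tol=1e-4, random_state=0
)
print("coef shape:", coef.shape)
print("passes run:", n_iter)

# Second run: reuse the returned gradient memory as a warm start.
coef2, n_iter2, mem2 = sag_solver(
    X, y, loss="squared", alpha=1.0, max_iter=100, tol=1e-4,
    random_state=0, warm_start_mem=mem,
)
print("passes after warm start:", n_iter2)
```

In everyday use the public Ridge and LogisticRegression estimators remain the supported entry points; the direct call is shown only to illustrate the inputs and outputs.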

Usage Examples

Basic Usage

# SAG solver is typically used internally via Ridge or LogisticRegression.
from sklearn.linear_model import Ridge
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=1000, n_features=20, noise=10, random_state=42)
model = Ridge(alpha=1.0, solver="sag", random_state=42)
model.fit(X, y)
print("Training R^2:", model.score(X, y))  # score() returns R^2 for regressors

Related Pages

Principle:Scikit_learn_Scikit_learn_Online_Learning
