Implementation:Scikit learn Scikit learn BenchRCV1LogregConvergence

From Leeroopedia


Knowledge Sources
Domains Machine Learning, Benchmarking
Last Updated 2026-02-08 15:00 GMT

Overview

Concrete tool, provided by scikit-learn, for benchmarking the convergence of logistic regression solvers on the RCV1 dataset.

Description

This benchmark script compares the convergence behavior of several logistic regression solvers on the RCV1 text classification dataset. It evaluates scikit-learn's LogisticRegression (with the SAG, SAGA, and liblinear solvers), SGDClassifier, and, optionally, Lightning's implementations. The script uses joblib caching to speed up repeated runs and, for a range of iteration counts, measures training loss, training and test accuracy, and fit duration.

Usage

Use this benchmark to evaluate which logistic regression solver converges fastest on large-scale sparse text classification problems, and to compare the convergence profiles of different optimization algorithms.

Code Reference

Source Location

Signature

def get_loss(w, intercept, myX, myy, C)
def bench_one(name, clf_type, clf_params, n_iter)
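The get_loss signature suggests a regularized logistic loss evaluated from a weight vector and an intercept. A minimal sketch of what such a function might compute is shown here; the -1/+1 label encoding and the exact scaling of the L2 penalty are assumptions of this sketch, not details confirmed by this page.

```python
import numpy as np


def get_loss(w, intercept, myX, myy, C):
    """Mean logistic loss plus an L2 penalty scaled by 1 / (2 * C * n_samples).

    Assumption of this sketch: labels in myy are encoded as -1 / +1.
    """
    n_samples = myX.shape[0]
    w = np.ravel(w)
    margins = myy * (myX.dot(w) + intercept)
    # log(1 + exp(-m)) computed stably as logaddexp(0, -m)
    data_term = np.mean(np.logaddexp(0.0, -margins))
    penalty = w.dot(w) / (2.0 * C * n_samples)
    return data_term + penalty


# Tiny hand-made dataset, used only to exercise the function.
X_demo = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y_demo = np.array([1.0, -1.0, 1.0])

loss_at_zero = get_loss(np.zeros(2), 0.0, X_demo, y_demo, C=1.0)
print(loss_at_zero)  # equals log(2): every margin is 0 when w = 0
```

Because only `.shape` and `.dot` are used on myX, the same function also accepts a SciPy sparse matrix, which matters for a dataset as sparse as RCV1.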


Import

from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.datasets import fetch_rcv1

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Identifier for the solver being benchmarked |
| clf_type | class | Yes | Classifier class (LogisticRegression, SGDClassifier, etc.) |
| clf_params | dict | Yes | Parameters for the classifier constructor |
| n_iter | int | Yes | Number of iterations to run |
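The inputs above can be illustrated with a rough sketch of a bench_one-style function. This is not the script's actual body: the synthetic data from make_classification stands in for RCV1 so that the example runs without a download, and the training loss is computed with sklearn.metrics.log_loss (unregularized) rather than the script's own get_loss.

```python
import time
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Synthetic stand-in for RCV1 (an assumption of this sketch).
X_all, y_all = make_classification(n_samples=400, n_features=20, random_state=0)
X, X_test, y, y_test = train_test_split(
    X_all, y_all, test_size=0.25, random_state=0
)


def bench_one(name, clf_type, clf_params, n_iter):
    """Fit one classifier under a fixed iteration budget and report metrics.

    `name` is kept to match the documented signature; it is unused here.
    """
    clf = clf_type(**clf_params)
    clf.set_params(max_iter=n_iter, random_state=42)

    with warnings.catch_warnings():
        # Small budgets stop before convergence on purpose.
        warnings.simplefilter("ignore", ConvergenceWarning)
        start = time.perf_counter()
        clf.fit(X, y)
        duration = time.perf_counter() - start

    train_loss = log_loss(y, clf.predict_proba(X))
    train_score = clf.score(X, y)
    test_score = clf.score(X_test, y_test)
    return train_loss, train_score, test_score, duration


result = bench_one("LR-saga", LogisticRegression, {"solver": "saga"}, n_iter=5)
print(dict(zip(["train_loss", "train_score", "test_score", "duration"], result)))
```

Passing the class and its constructor parameters separately, as the signature does, lets one function benchmark any estimator that accepts max_iter and random_state.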

Outputs

| Name | Type | Description |
|---|---|---|
| train_loss | float | Logistic loss on training data |
| train_score | float | Accuracy on training data |
| test_score | float | Accuracy on test data |
| duration | float | Wall clock time of fit |
| Plot | matplotlib figure | Convergence curves for all solvers |
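One way to picture how per-run outputs become convergence curves is to refit each solver under increasing iteration budgets and collect one point per budget. The sketch below does this on synthetic data; the budgets, the dataset, and the `curves` dictionary are illustrative choices, not values taken from the benchmark script.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# Synthetic binary problem standing in for RCV1 (an assumption of this sketch).
X_demo, y_demo = make_classification(n_samples=300, n_features=15, random_state=0)

solvers = ["sag", "saga", "liblinear"]
iteration_budgets = [1, 2, 5, 10]  # hypothetical budgets

# One list of (n_iter, training accuracy) points per solver.
curves = {solver: [] for solver in solvers}

with warnings.catch_warnings():
    # Truncated runs are expected to stop before convergence.
    warnings.simplefilter("ignore", ConvergenceWarning)
    for solver in solvers:
        for n_iter in iteration_budgets:
            clf = LogisticRegression(solver=solver, max_iter=n_iter, random_state=42)
            clf.fit(X_demo, y_demo)
            curves[solver].append((n_iter, clf.score(X_demo, y_demo)))

for solver, points in curves.items():
    print(solver, points)
```

Each list in `curves` holds the points of one line in a convergence plot; swapping the iteration count for the measured fit duration on the x-axis would compare solvers by time instead.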

Usage Examples

Basic Usage

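from sklearn.datasets import fetch_rcv1
from sklearn.linear_model import LogisticRegression

rcv1 = fetch_rcv1()  # downloads the dataset on first use
X = rcv1.data

# rcv1.target is a sparse multilabel matrix (one column per category), which
# LogisticRegression cannot fit directly. Pick one category to get a binary
# problem; "CCAT" vs. the rest is used here as an example.
ccat_idx = rcv1.target_names.tolist().index("CCAT")
y = rcv1.target.tocsc()[:, ccat_idx].toarray().ravel()

clf = LogisticRegression(solver="saga", max_iter=100, random_state=42)
clf.fit(X, y)
print("Training accuracy:", clf.score(X, y))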
