Implementation:Rapidsai Cuml MBSGDClassifier

Knowledge Sources	Rapidsai_Cuml
Domains	Machine_Learning, Classification
Last Updated	2026-02-08 12:00 GMT

Overview

MBSGDClassifier is a GPU-accelerated linear classifier trained with mini-batch stochastic gradient descent. It supports hinge (linear SVM), log (logistic regression), and squared (linear regression) loss functions with configurable regularization.

Description

The MBSGDClassifier class implements binary linear classification using mini-batch stochastic gradient descent (SGD). It supports three loss functions: 'hinge' (linear SVM), 'log' (logistic regression), and 'squared_loss' (linear regression). Regularization options include L1, L2, Elastic-Net, or no penalty.
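As a rough illustration of the three loss options and the penalty choices, the sketch below uses the common textbook forms in plain Python. The function names, the label encodings (-1/+1 for hinge, 0/1 for log), and the constant factors are assumptions made for illustration; cuML's internal definitions may differ in scaling and encoding.

```python
import math

def hinge_loss(score, y_signed):
    """Textbook hinge loss (linear SVM); y_signed is assumed to be -1 or +1."""
    return max(0.0, 1.0 - y_signed * score)

def log_loss(score, y01):
    """Textbook logistic loss; y01 is assumed to be 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-score))  # sigmoid of the linear score
    return -(y01 * math.log(p) + (1 - y01) * math.log(1.0 - p))

def squared_loss(score, y):
    """Squared error between the linear score and the target."""
    return (score - y) ** 2

def penalty_term(weights, penalty, alpha=0.0001, l1_ratio=0.15):
    """Regularization term; constant factors here are an assumed convention."""
    l1 = sum(abs(w) for w in weights)
    l2 = sum(w * w for w in weights)
    if penalty is None:
        return 0.0
    if penalty == "l1":
        return alpha * l1
    if penalty == "l2":
        return alpha * l2
    if penalty == "elasticnet":
        return alpha * (l1_ratio * l1 + (1.0 - l1_ratio) * l2)
    raise ValueError(f"unknown penalty: {penalty!r}")
```

With penalty='elasticnet', l1_ratio=1 reduces this sketch to the L1 term and l1_ratio=0 to the L2 term, which is why l1_ratio is ignored for the other penalties.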

This implementation is experimental and uses a different algorithm than scikit-learn's SGDClassifier. It only supports binary classification (exactly two classes). The model processes data in configurable batch sizes and supports multiple learning rate schedules: constant, inverse scaling, and adaptive. The fit method internally uses cuML's fit_sgd solver. Prediction thresholds differ by loss function: 0 for hinge loss and 0.5 for log/squared loss.
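The loss-dependent thresholds described above can be sketched as follows. This is a minimal plain-Python illustration, not cuML's code: the helper names are hypothetical, "score" stands for whatever value the model compares against the threshold, and the strict ">" comparison is an assumption.

```python
def predict_index(score, loss):
    """Return 0 or 1 by thresholding a score (hypothetical helper)."""
    # Threshold from the description: 0 for hinge, 0.5 for log/squared_loss.
    threshold = 0.0 if loss == "hinge" else 0.5
    return 1 if score > threshold else 0  # strict comparison is assumed

def predict_labels(scores, loss, classes):
    """Map thresholded scores onto the two entries of a classes_-style list."""
    return [classes[predict_index(s, loss)] for s in scores]

# The same score of 0.3 falls on different sides of the two thresholds.
hinge_labels = predict_labels([-0.2, 0.3], "hinge", [1, 2])
log_labels = predict_labels([0.3, 0.7], "log", [1, 2])
```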

To improve results, the documentation recommends reducing batch size, increasing eta0, and increasing the number of epochs.
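One way to apply that advice is to start from the default values and move each setting in the recommended direction. The tuned values below are purely illustrative assumptions, not recommendations from the cuML documentation, and the commented-out lines require a GPU environment with cuML installed.

```python
# Defaults taken from the constructor signature.
defaults = {"batch_size": 32, "eta0": 0.001, "epochs": 1000}

# Hypothetical adjusted values: smaller batches, larger step size, more epochs.
tuned = dict(defaults, batch_size=8, eta0=0.01, epochs=2000)

# Usage, assuming cuML is available and X, y are already on the GPU:
# from cuml.linear_model import MBSGDClassifier
# model = MBSGDClassifier(loss="hinge", **tuned).fit(X, y)
```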

Usage

Use MBSGDClassifier for binary classification tasks where you want to leverage GPU-accelerated mini-batch SGD training with configurable loss functions. It is suitable for large datasets where full-batch methods would be too slow, though users should note that it is experimental and may require tuning of learning rate and batch size parameters.

Code Reference

Source Location

Repository: Rapidsai_Cuml
File: python/cuml/cuml/linear_model/mbsgd_classifier.py

Signature

class MBSGDClassifier(Base, LinearClassifierMixin, ClassifierMixin, FMajorInputTagMixin):
    def __init__(
        self,
        *,
        loss="hinge",
        penalty="l2",
        alpha=0.0001,
        l1_ratio=0.15,
        fit_intercept=True,
        epochs=1000,
        tol=1e-3,
        shuffle=True,
        learning_rate="constant",
        eta0=0.001,
        power_t=0.5,
        batch_size=32,
        n_iter_no_change=5,
        verbose=False,
        output_type=None,
    )

Import

from cuml.linear_model import MBSGDClassifier

I/O Contract

Inputs

Name	Type	Required	Description
loss	str	No	Loss function: 'hinge' (linear SVM), 'log' (logistic regression), or 'squared_loss' (linear regression). Default is 'hinge'.
penalty	str or None	No	Regularization: 'l1', 'l2', 'elasticnet', or None. Default is 'l2'.
alpha	float	No	Regularization strength constant. Default is 0.0001.
l1_ratio	float	No	Elastic-Net mixing parameter (0 <= l1_ratio <= 1). Only used when penalty='elasticnet'. Default is 0.15.
fit_intercept	bool	No	Whether to fit a bias term. Default is True.
epochs	int	No	Number of passes over the entire dataset. Default is 1000.
tol	float	No	Stopping tolerance: stops if current_loss > previous_loss - tol. Default is 1e-3.
shuffle	bool	No	Whether to shuffle training data after each epoch. Default is True.
learning_rate	str	No	Learning rate schedule: 'constant', 'invscaling', or 'adaptive'. Default is 'constant'.
eta0	float	No	Initial learning rate. Default is 0.001.
power_t	float	No	Exponent for invscaling learning rate. Default is 0.5.
batch_size	int	No	Number of samples per mini-batch. Default is 32.
n_iter_no_change	int	No	Number of epochs without improvement before stopping or adapting. Default is 5.
verbose	int or bool	No	Sets logging level. Default is False.
output_type	str or None	No	Return results in the indicated output type. Default is None.
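To make the interaction between learning_rate, eta0, power_t, tol, and n_iter_no_change more concrete, here is a small plain-Python sketch. The invscaling formula (eta0 / t^power_t) and the divide-by-5 step for the adaptive schedule follow the convention used by scikit-learn and are assumptions here, not confirmed details of cuML's solver; all function names are hypothetical.

```python
def eta_at(schedule, eta0, t, power_t=0.5):
    """Step size at update t (t >= 1) for the non-adaptive schedules."""
    if schedule == "constant":
        return eta0
    if schedule == "invscaling":
        return eta0 / (t ** power_t)  # assumed formula
    raise ValueError(f"unsupported schedule in this sketch: {schedule!r}")

def no_improvement(current_loss, previous_loss, tol=1e-3):
    """Tolerance check from the tol description above."""
    return current_loss > previous_loss - tol

def adapt_eta(eta, stalled_epochs, n_iter_no_change=5, factor=5.0):
    """Adaptive schedule sketch: shrink eta after enough stalled epochs."""
    if stalled_epochs >= n_iter_no_change:
        return eta / factor  # divide-by-5 is an assumed convention
    return eta
```

Under this reading, a loss that drops from 1.0 to 0.9995 with tol=1e-3 counts as no improvement, because the decrease is smaller than tol.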

Outputs

Name	Type	Description
coef_	array (n_features,)	The learned model coefficients.
intercept_	float	Independent term (bias). Zero if fit_intercept is False.
classes_	np.ndarray (n_classes,)	Array of class labels (exactly 2 for binary classification).

Usage Examples

Basic Usage

import cupy as cp
from cuml.linear_model import MBSGDClassifier

# Create sample data: four samples, two features, two classes
X = cp.array([[1, 1], [1, 2], [2, 2], [2, 3]], dtype=cp.float32)
y = cp.array([1, 1, 2, 2])

# Fit MBSGDClassifier (hinge loss with an L2 penalty)
model = MBSGDClassifier(
    loss="hinge",
    penalty="l2",
    alpha=0.0001,
    epochs=1000,
    eta0=0.001,
    batch_size=32
).fit(X, y)

# Predict on new data
X_test = cp.asarray([[3, 5], [2, 5]], dtype=cp.float32)
predictions = model.predict(X_test)
print(predictions)
