
Implementation:Rapidsai Cuml SMO Solver

From Leeroopedia


Knowledge Sources
Domains Machine_Learning, Support_Vector_Machines
Last Updated 2026-02-08 12:00 GMT

Overview

A GPU-accelerated Sequential Minimal Optimization (SMO) solver that trains Support Vector Machine (SVM) models using a two-level decomposition of the dual quadratic program.

Description

smosolver.h implements the core SVM training algorithm using a hierarchical decomposition strategy. The general approach follows Osuna's decomposition method, where a subset of training examples is selected at each iteration and the quadratic programming (QP) sub-problem is solved for that subset.

The implementation uses two levels of decomposition:

  • Outer level (q1=1024) -- Selects a working set of up to 1024 training vectors and solves the QP sub-problem (QP1). This is implemented in the Solve method.
  • Inner level (q2=2, SMO) -- Within each working set, pairs of variables are optimized using the classic SMO algorithm via SmoBlockSolve.
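
The two-level structure can be illustrated with a small CPU sketch. This is not cuML's GPU implementation: the function name select_working_set is hypothetical, and the sort-based half-low/half-high split is a simplification of the most-violating-sample selection the real solver performs on the device.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical sketch of outer-level working-set selection: rank all samples
// by their optimality indicator f and keep the q1 most extreme entries,
// half from the low end and half from the high end.
std::vector<int> select_working_set(const std::vector<double>& f, int q1)
{
  std::vector<int> order(f.size());
  std::iota(order.begin(), order.end(), 0);
  std::sort(order.begin(), order.end(),
            [&](int a, int b) { return f[a] < f[b]; });

  std::vector<int> ws;
  int half = q1 / 2;
  for (int i = 0; i < half && i < (int)order.size(); ++i)
    ws.push_back(order[i]);                            // smallest f values
  for (int i = 0; i < half && half + i < (int)order.size(); ++i)
    ws.push_back(order[(int)order.size() - 1 - i]);    // largest f values
  return ws;
}
```

Inside each such working set, the inner SMO level then repeatedly optimizes one pair of dual variables at a time until the sub-problem converges.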

The SmoSolver class template supports both SVM Classification (SVC) and epsilon-SVR regression. Key methods include:

  • Solve -- The main entry point that iterates outer and inner loops until convergence. It manages the working set, kernel cache, and convergence tracking. Output includes dual coefficients, support vectors, their indices, and the bias term.
  • Initialize -- Sets up the optimization problem, zero-initializing dual coefficients and computing the initial optimality indicator vector f.
  • SvcInit -- Initializes the f vector for classification: f_i = -y_i (since alpha starts at zero).
  • SvrInit -- Initializes for epsilon-SVR by doubling the problem size (an alpha+ and an alpha- variable per training sample) and setting f_i = yc_i * epsilon - yr_i, where yc_i = ±1 is the synthesized classification label and yr_i is the regression target.
  • UpdateF -- Updates the optimality indicator vector after each block solve step using the kernel matrix tile and change in dual coefficients.
  • CheckStoppingCondition -- Monitors convergence by tracking the optimality gap, detecting stalls (unchanged diff over consecutive steps), and warning about non-monotonic convergence.
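
Of these steps, UpdateF is the easiest to write down. A minimal CPU sketch, assuming the kernel tile is stored as tile[i][j] = K(x_i, x_{ws_j}) for working-set column j (function and variable names are illustrative, not cuML's):

```cpp
#include <vector>

// After a block solve changes the working-set duals by delta_alpha, every
// entry of the optimality indicator f absorbs the corresponding kernel
// column: f_i += sum_j delta_alpha_j * K(x_i, x_{ws_j}).
void update_f(std::vector<double>& f,
              const std::vector<double>& delta_alpha,
              const std::vector<std::vector<double>>& tile)
{
  for (std::size_t i = 0; i < f.size(); ++i)
    for (std::size_t j = 0; j < delta_alpha.size(); ++j)
      f[i] += delta_alpha[j] * tile[i][j];
}
```

On the GPU this update amounts to a matrix-vector product over the cached kernel tile.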

The solver uses RMM device buffers for all GPU memory management and supports both precomputed kernel matrices and on-the-fly kernel evaluation via the cuVS GramMatrix interface.
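
The two kernel sources can be thought of as one accessor with two paths. A hedged sketch (kernel_entry is a hypothetical helper, not part of the cuML or cuVS API):

```cpp
#include <functional>
#include <vector>

// Hypothetical dispatcher over the two kernel sources the solver supports:
// a precomputed Gram matrix that is simply indexed, or on-the-fly evaluation
// through a kernel functor (standing in for the cuVS GramMatrix interface).
double kernel_entry(int i, int j,
                    const std::vector<std::vector<double>>* precomputed,
                    const std::function<double(int, int)>& evaluate)
{
  if (precomputed != nullptr) return (*precomputed)[i][j];  // precomputed path
  return evaluate(i, j);                                    // on-the-fly path
}
```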

References:

  • Joachims, T. (1998). Making Large-Scale SVM Learning Practical.
  • Vanek, J., Michalek, J., Psutka, J. (2017). A GPU-Architecture Optimized Hierarchical Decomposition Algorithm for Support Vector Machine Training.
  • Wen, Z., Shi, J., Li, Q., He, B., Chen, J. (2018). ThunderSVM: A Fast SVM Library on GPUs and CPUs.

Usage

This class is used internally by the cuML SVM implementation. It is not typically called directly by users; instead, use the Python cuml.svm.SVC or cuml.svm.SVR classes.

Code Reference

Source Location

Signature

namespace ML {
namespace SVM {

template <typename math_t>
class SmoSolver {
 public:
  SmoSolver(const raft::handle_t& handle,
            SvmParameter param,
            cuvs::distance::kernels::KernelType kernel_type,
            cuvs::distance::kernels::GramMatrixBase<math_t>* kernel,
            bool is_precomputed = false);

  template <typename MatrixViewType>
  void Solve(MatrixViewType matrix, int n_rows, int n_cols,
             math_t* y, const math_t* sample_weight,
             math_t** dual_coefs, int* n_support,
             SupportStorage<math_t>* support_matrix,
             int** idx, math_t* b,
             int max_iter = -1, int max_outer_iter = -1,
             int max_inner_iter = 10000);

  void Initialize(math_t** y, const math_t* sample_weight, int n_rows, int n_cols);
  void SvcInit(const math_t* y);
  void SvrInit(const math_t* yr, int n_rows, math_t* yc, math_t* f);
  void UpdateF(math_t* f, int n_rows, const math_t* delta_alpha,
               int n_ws, const math_t* cacheTile);
  int GetNIter();
};

} // namespace SVM
} // namespace ML

Import

#include "smosolver.h"
// Dependencies:
#include <cuml/svm/svm_model.h>
#include <raft/core/handle.hpp>
#include <cuvs/distance/grammian.hpp>

I/O Contract

Inputs

Name Type Required Description
handle raft::handle_t Yes RAFT handle with CUDA stream and cuBLAS context
param SvmParameter Yes SVM parameters including C, tol, epsilon, cache_size, svmType, verbosity
kernel_type KernelType Yes Type of kernel function (RBF, linear, polynomial, etc.)
kernel GramMatrixBase<math_t>* Yes Kernel function evaluator
matrix MatrixViewType Yes Training data matrix [n_rows x n_cols]
y math_t* Yes Labels (+/-1 for SVC, target values for SVR), size [n_rows]
sample_weight math_t* No Per-sample weights (nullptr for uniform)
max_outer_iter int No Maximum outer iterations (default: 100 * n_rows)
max_inner_iter int No Maximum inner SMO iterations per block (default: 10000)

Outputs

Name Type Description
dual_coefs math_t** Device array of dual coefficients, size [n_support]
n_support int* Number of support vectors found
support_matrix SupportStorage<math_t>* Support vectors in matrix format [n_support x n_cols]
idx int** Original training set indices of support vectors, size [n_support]
b math_t* Bias term for the decision function
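
Taken together, these outputs define the decision function. A sketch assuming, as is conventional for SVM solvers, that dual_coefs already carry the y_i * alpha_i sign, with an RBF kernel chosen for concreteness (helper names are illustrative):

```cpp
#include <cmath>
#include <vector>

// RBF kernel: K(a, b) = exp(-gamma * ||a - b||^2).
double rbf(const std::vector<double>& a, const std::vector<double>& b,
           double gamma)
{
  double d2 = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    d2 += (a[i] - b[i]) * (a[i] - b[i]);
  return std::exp(-gamma * d2);
}

// Decision value built from the solver outputs:
//   f(x) = sum_k dual_coefs[k] * K(sv_k, x) + b
double decision(const std::vector<std::vector<double>>& support_vectors,
                const std::vector<double>& dual_coefs, double b,
                const std::vector<double>& x, double gamma)
{
  double s = b;
  for (std::size_t k = 0; k < support_vectors.size(); ++k)
    s += dual_coefs[k] * rbf(support_vectors[k], x, gamma);
  return s;
}
```

For SVC the predicted class is the sign of this value; for SVR it is the regression estimate itself.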

Usage Examples

// Internal usage within cuML SVM implementation
raft::handle_t handle;
SvmParameter param;
param.C = 1.0;
param.tol = 1e-3;
param.svmType = C_SVC;

auto* kernel = new cuvs::distance::kernels::RBFKernel<float>(gamma);

SmoSolver<float> solver(handle, param,
                        cuvs::distance::kernels::KernelType::RBF, kernel);

float* dual_coefs = nullptr;
int n_support = 0;
SupportStorage<float> support_matrix;
int* idx = nullptr;
float b;

solver.Solve(matrix_view, n_rows, n_cols, y, nullptr,
             &dual_coefs, &n_support, &support_matrix, &idx, &b);
