
Implementation:Rapidsai Cuml SimpleDenseMat



Knowledge Sources
Domains: Machine_Learning, Linear_Algebra
Last Updated: 2026-02-08 12:00 GMT

Overview

A lightweight set of GPU dense matrix and vector types used internally by the cuML Quasi-Newton (QN) solver for generalized linear models (GLMs). It provides GEMM, element-wise operations, and norm computations.

Description

dense.hpp defines a family of GPU matrix and vector types used by the Quasi-Newton optimization solver in cuML. These types wrap raw device pointers with dimension and storage-order metadata, providing a clean API for linear algebra operations without the overhead of full matrix library abstractions.
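To make "a raw pointer plus dimension and storage-order metadata" concrete, here is a minimal host-side sketch. HostView, Order, offset, and at are hypothetical names chosen for illustration; the real SimpleDenseMat<T> wraps a device pointer and is not indexed element by element from the host.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-side analogue of a non-owning dense matrix view.
enum class Order { ColMajor, RowMajor };

template <typename T>
struct HostView {
  T* data;    // borrowed pointer; the view never allocates or frees
  int m;      // number of rows
  int n;      // number of columns
  Order ord;  // how (i, j) maps to a linear offset

  // Linear offset of element (i, j) under the view's storage order.
  std::size_t offset(int i, int j) const
  {
    assert(i >= 0 && i < m && j >= 0 && j < n);
    return ord == Order::ColMajor ? static_cast<std::size_t>(j) * m + i
                                  : static_cast<std::size_t>(i) * n + j;
  }

  // Element access through the borrowed pointer.
  T& at(int i, int j) const { return data[offset(i, j)]; }
};
```

The same buffer can be viewed under either order without copying; only the index arithmetic changes.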

The header defines the following types and utilities:

SimpleDenseMat<T> -- A non-owning dense matrix view supporting both column-major and row-major storage orders. Key methods include:

  • gemm -- Static method implementing general matrix multiplication via cuBLAS, with automatic handling of mixed storage orders by transposing as needed.
  • gemmb / assign_gemm -- Instance methods for GEMM with this as one of the operands.
  • ax -- Scalar-matrix multiply (this = a*x).
  • axpy -- Scaled addition (this = a*x + y).
  • assign_unary / assign_binary / assign_ternary -- Element-wise operations with custom lambdas.
  • fill -- Fill matrix with a constant value.
  • copy_async -- Asynchronous device-to-device copy.
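The semantics of ax, axpy, and the lambda-based assign_* methods can be sketched on the host as follows. The *_ref names are hypothetical, std::vector stands in for device memory, and routing ax and axpy through the generic helpers is a choice of this sketch rather than a statement about the library's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// out[i] = f(x[i]) for every element (host analogue of assign_unary).
template <typename T, typename F>
void assign_unary_ref(std::vector<T>& out, const std::vector<T>& x, F f)
{
  assert(out.size() == x.size());
  for (std::size_t i = 0; i < out.size(); ++i) out[i] = f(x[i]);
}

// out[i] = f(x[i], y[i]) for every element (host analogue of assign_binary).
template <typename T, typename F>
void assign_binary_ref(std::vector<T>& out, const std::vector<T>& x,
                       const std::vector<T>& y, F f)
{
  assert(out.size() == x.size() && x.size() == y.size());
  for (std::size_t i = 0; i < out.size(); ++i) out[i] = f(x[i], y[i]);
}

// out = a * x
template <typename T>
void ax_ref(std::vector<T>& out, T a, const std::vector<T>& x)
{
  assign_unary_ref(out, x, [a](T xi) { return a * xi; });
}

// out = a * x + y
template <typename T>
void axpy_ref(std::vector<T>& out, T a, const std::vector<T>& x,
              const std::vector<T>& y)
{
  assign_binary_ref(out, x, y, [a](T xi, T yi) { return a * xi + yi; });
}
```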

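SimpleVec<T> -- A vector type extending SimpleDenseMat<T> as a single-column matrix (number of columns fixed to 1). Note that the constructor argument n is the vector length, not the column count. It provides assign_gemv for matrix-vector multiplication.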

SimpleVecOwning<T> / SimpleMatOwning<T> -- Owning variants that manage their own device memory via rmm::device_uvector.
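The relationship between a view and its owning variant can be illustrated with a host-side sketch. VecView, VecOwning, and fill are hypothetical names, and std::vector stands in for rmm::device_uvector; the actual class layout in dense.hpp may differ.

```cpp
#include <cassert>
#include <vector>

// Hypothetical non-owning vector view: just a pointer and a length.
template <typename T>
struct VecView {
  T* data = nullptr;
  int len = 0;
  void reset(T* d, int n) { data = d; len = n; }
};

// Hypothetical owning variant: the storage lives in the object, and the
// inherited view is pointed at it once the buffer exists.
template <typename T>
struct VecOwning : VecView<T> {
  std::vector<T> buf;  // stands in for rmm::device_uvector<T>

  explicit VecOwning(int n) : buf(n) { this->reset(buf.data(), n); }

  // A copy would leave `data` pointing at the source's buffer, so forbid it.
  VecOwning(const VecOwning&) = delete;
  VecOwning& operator=(const VecOwning&) = delete;
};

// Any routine written against the view also accepts the owning type.
template <typename T>
void fill(VecView<T>& v, T val)
{
  assert(v.data != nullptr || v.len == 0);
  for (int i = 0; i < v.len; ++i) v.data[i] = val;
}
```

Because the owning type derives from the view, solver code can be written once against views and still be handed buffers whose lifetime is managed automatically.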

Free functions:

  • col_ref -- Create a vector view referencing a single column of a column-major matrix.
  • col_slice -- Create a matrix view referencing a contiguous range of columns.
  • dot, squaredNorm, nrm1, nrm2, nrmMax -- Vector reduction operations using raft primitives.
  • operator<< -- Stream output operators for debugging.
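Column views are cheap in column-major storage because each column, and each run of adjacent columns, is contiguous. The following host-side sketch shows the pointer arithmetic involved. ColMajorView, col_ref_sketch, and col_slice_sketch are hypothetical names, and returning the view by value is a simplification; the signatures of the real col_ref and col_slice may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal column-major view (host memory, for illustration).
template <typename T>
struct ColMajorView {
  T* data;
  int m;  // rows
  int n;  // columns
};

// View of column c: m contiguous elements starting at offset c * m.
template <typename T>
ColMajorView<T> col_ref_sketch(const ColMajorView<T>& mat, int c)
{
  assert(c >= 0 && c < mat.n);
  return {mat.data + static_cast<std::size_t>(c) * mat.m, mat.m, 1};
}

// View of columns [c_from, c_to): still contiguous in column-major storage.
template <typename T>
ColMajorView<T> col_slice_sketch(const ColMajorView<T>& mat, int c_from, int c_to)
{
  assert(0 <= c_from && c_from < c_to && c_to <= mat.n);
  return {mat.data + static_cast<std::size_t>(c_from) * mat.m, mat.m, c_to - c_from};
}
```

No data is copied in either case, so writes through the returned view modify the parent matrix.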

The GEMM implementation handles mixed storage orders by recursively converting row-major matrices to equivalent transposed column-major representations.
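The reduction works because a row-major m x n buffer is, byte for byte, a column-major n x m buffer holding the transpose. The host-side sketch below illustrates the idea with a naive reference GEMM in place of cuBLAS. Mat, gemm_col_major, and gemm_any_order are hypothetical names, and the order in which the operands are rewritten is an assumption of this sketch.

```cpp
#include <cassert>

enum Order { COL_MAJOR = 0, ROW_MAJOR = 1 };  // mirrors STORAGE_ORDER

struct Mat {  // hypothetical host-side view
  double* data;
  int m, n;
  Order ord;
};

// Reference GEMM where every operand is column-major:
// C = alpha * op(A) * op(B) + beta * C, with op(X) = X^T when the flag is set.
void gemm_col_major(double alpha, const Mat& A, bool tA, const Mat& B, bool tB,
                    double beta, Mat& C)
{
  const int k = tA ? A.m : A.n;
  assert((tA ? A.n : A.m) == C.m && (tB ? B.m : B.n) == C.n && (tB ? B.n : B.m) == k);
  auto a = [&](int i, int l) { return tA ? A.data[l + i * A.m] : A.data[i + l * A.m]; };
  auto b = [&](int l, int j) { return tB ? B.data[j + l * B.m] : B.data[l + j * B.m]; };
  for (int j = 0; j < C.n; ++j)
    for (int i = 0; i < C.m; ++i) {
      double acc = 0.0;
      for (int l = 0; l < k; ++l) acc += a(i, l) * b(l, j);
      C.data[i + j * C.m] = alpha * acc + beta * C.data[i + j * C.m];
    }
}

// Removes ROW_MAJOR operands one at a time, then calls the column-major GEMM.
void gemm_any_order(double alpha, const Mat& A, bool tA, const Mat& B, bool tB,
                    double beta, Mat& C)
{
  if (C.ord == ROW_MAJOR) {
    // C^T = op(B)^T * op(A)^T: swap the operands, flip both flags, and
    // treat C's buffer as a column-major n x m matrix.
    Mat Ct{C.data, C.n, C.m, COL_MAJOR};
    gemm_any_order(alpha, B, !tB, A, !tA, beta, Ct);
  } else if (A.ord == ROW_MAJOR) {
    Mat At{A.data, A.n, A.m, COL_MAJOR};  // same buffer, read as A^T
    gemm_any_order(alpha, At, !tA, B, tB, beta, C);
  } else if (B.ord == ROW_MAJOR) {
    Mat Bt{B.data, B.n, B.m, COL_MAJOR};  // same buffer, read as B^T
    gemm_any_order(alpha, A, tA, Bt, !tB, beta, C);
  } else {
    gemm_col_major(alpha, A, tA, B, tB, beta, C);
  }
}
```

Each recursive call eliminates one row-major operand, so the recursion depth is at most three and no data is ever physically transposed.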

Usage

These types are used internally by the QN solver (cpp/src/glm/qn/) for gradient computations, Hessian-vector products, and line search operations in logistic regression, linear regression, and other GLM models.

Code Reference

Source Location

  • Repository: Rapidsai_Cuml
  • File: cpp/src/glm/qn/simple_mat/dense.hpp

Signature

namespace ML {

enum STORAGE_ORDER { COL_MAJOR = 0, ROW_MAJOR = 1 };

template <typename T>
struct SimpleDenseMat : SimpleMat<T> {
  int len;
  T* data;
  STORAGE_ORDER ord;

  SimpleDenseMat(T* data, int m, int n, STORAGE_ORDER order = COL_MAJOR);
  void reset(T* data_, int m_, int n_);

  static void gemm(const raft::handle_t& handle,
                    const T alpha, const SimpleDenseMat<T>& A, const bool transA,
                    const SimpleDenseMat<T>& B, const bool transB,
                    const T beta, SimpleDenseMat<T>& C, cudaStream_t stream);

  void ax(const T a, const SimpleDenseMat<T>& x, cudaStream_t stream);
  void axpy(const T a, const SimpleDenseMat<T>& x, const SimpleDenseMat<T>& y, cudaStream_t stream);
  void fill(const T val, cudaStream_t stream);
  void copy_async(const SimpleDenseMat<T>& other, cudaStream_t stream);
};

template <typename T>
struct SimpleVec : SimpleDenseMat<T> {
  SimpleVec(T* data, const int n);
  void assign_gemv(const raft::handle_t& handle, const T alpha,
                   const SimpleDenseMat<T>& A, bool transA,
                   const SimpleVec<T>& x, const T beta, cudaStream_t stream);
};

template <typename T>
T dot(const SimpleVec<T>& u, const SimpleVec<T>& v, T* tmp_dev, cudaStream_t stream);

template <typename T>
T nrm2(const SimpleVec<T>& u, T* tmp_dev, cudaStream_t stream);

} // namespace ML

Import

#include "dense.hpp"
// or from another directory:
#include <glm/qn/simple_mat/dense.hpp>

I/O Contract

Inputs

Name | Type | Required | Description
data | T* | Yes | Device pointer to the matrix data
m | int | Yes | Number of rows
n | int | Yes | Number of columns
order | STORAGE_ORDER | No | Storage order: COL_MAJOR (default) or ROW_MAJOR
handle | raft::handle_t | Yes (for GEMM and GEMV) | RAFT handle providing the cuBLAS context
stream | cudaStream_t | Yes | CUDA stream for asynchronous operations
tmp_dev | T* | Yes (for dot and nrm2) | Device scratch pointer passed to the reductions

Outputs

Name | Type | Description
Result matrix/vector | SimpleDenseMat<T> or SimpleVec<T> | Modified in-place with operation results
Scalar reductions | T | Dot products, norms returned as host scalars
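For reference, the scalar reductions can be written out on the host as below. This assumes the conventional definitions suggested by the function names (nrmMax taken as the largest absolute value); the *_ref names are hypothetical, and the real functions operate on device data via raft and take the tmp_dev scratch pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// dot: sum of element-wise products.
template <typename T>
T dot_ref(const std::vector<T>& u, const std::vector<T>& v)
{
  assert(u.size() == v.size());
  T acc = T(0);
  for (std::size_t i = 0; i < u.size(); ++i) acc += u[i] * v[i];
  return acc;
}

// squaredNorm: sum of squares, i.e. dot(u, u).
template <typename T>
T squared_norm_ref(const std::vector<T>& u) { return dot_ref(u, u); }

// nrm2: Euclidean norm.
template <typename T>
T nrm2_ref(const std::vector<T>& u) { return std::sqrt(squared_norm_ref(u)); }

// nrm1: sum of absolute values.
template <typename T>
T nrm1_ref(const std::vector<T>& u)
{
  T acc = T(0);
  for (T x : u) acc += std::abs(x);
  return acc;
}

// nrmMax: largest absolute value (assumed to be the infinity norm).
template <typename T>
T nrm_max_ref(const std::vector<T>& u)
{
  T best = T(0);
  for (T x : u) best = std::max(best, std::abs(x));
  return best;
}
```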

Usage Examples

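// Assumes d_A, d_B, d_C, d_u, d_v are valid device pointers of the required
// sizes, d_tmp is a device scratch pointer for the reductions, and
// handle / stream are an initialized raft::handle_t and cudaStream_t.
using namespace ML;

// Create non-owning matrix views over existing device memory
SimpleDenseMat<float> A(d_A, m, k, COL_MAJOR);  // m x k
SimpleDenseMat<float> B(d_B, k, n, COL_MAJOR);  // k x n
SimpleDenseMat<float> C(d_C, m, n, COL_MAJOR);  // m x n

// C = 1.0 * A * B + 0.0 * C (neither operand transposed)
SimpleDenseMat<float>::gemm(handle, 1.0f, A, false, B, false, 0.0f, C, stream);

// Vector operations: both vectors have length n
SimpleVec<float> u(d_u, n);
SimpleVec<float> v(d_v, n);
float result = dot(u, v, d_tmp, stream);  // returned as a host scalar
float norm = nrm2(u, d_tmp, stream);      // returned as a host scalar

// Owning vector of 1024 elements with automatic memory management
SimpleVecOwning<float> owned_vec(1024, stream);
owned_vec.fill(0.0f, stream);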
