Implementation:Rapidsai Cuml Package Init
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Package_Management, GPU_Computing |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
The top-level cuml/__init__.py module serves as the primary entry point for the cuML library, loading the native GPU library, configuring the CuPy memory allocator, and re-exporting all major classes and functions at the package level.
Description
This module performs three key responsibilities:
1. Native library loading: Attempts to import the libcuml wheel package and calls libcuml.load_library() to load the shared C++/CUDA library symbols. If libcuml is not installed as a wheel (e.g., in a system install), the import is silently skipped.
2. RMM memory allocator setup: Configures CuPy to use RAPIDS Memory Manager (RMM) via cupy.cuda.set_allocator(rmm_cupy_allocator). This ensures all CuPy allocations go through RMM, enabling unified memory pool management across RAPIDS libraries.
3. Public API re-exports: Imports and re-exports all major cuML classes and functions so users can access them directly from the cuml namespace. This includes:
- Clustering: KMeans, DBSCAN, HDBSCAN, AgglomerativeClustering
- Linear models: LinearRegression, LogisticRegression, Ridge, Lasso, ElasticNet
- Ensemble: RandomForestClassifier, RandomForestRegressor
- Decomposition: PCA, TruncatedSVD, IncrementalPCA
- Manifold: UMAP, TSNE
- Neighbors: NearestNeighbors, KNeighborsClassifier, KNeighborsRegressor
- SVM: SVC, SVR, LinearSVC, LinearSVR
- Time series: ARIMA, AutoARIMA, ExponentialSmoothing
- Explainability: KernelExplainer, PermutationExplainer, TreeExplainer
- Metrics: accuracy_score, adjusted_rand_score, r2_score
- Datasets: make_blobs, make_classification, make_regression, make_arima
- Utilities: train_test_split, ForestInference, LabelEncoder, Base
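The first responsibility above, the optional native-library load, can be sketched roughly as follows. This is a minimal illustration rather than the module's actual source: the helper name `load_native_library` and its boolean return value are assumptions made for the example. Only the `libcuml` package name and its `load_library()` call come from the description above.

```python
import importlib


def load_native_library(module_name="libcuml"):
    """Try to load a native-library wheel; report whether it was found.

    Hypothetical helper for illustration only. It assumes the wheel
    module exposes a ``load_library()`` function, as described for
    ``libcuml``.
    """
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError:
        # The wheel is not installed (for example, the shared library is
        # provided another way), so skip the load without raising.
        return False
    # The wheel is present: ask it to load the shared library symbols.
    module.load_library()
    return True


# A module name that does not exist is skipped quietly.
found = load_native_library("no_such_native_wheel_xyz")
print("native wheel found:", found)
```

Returning a flag instead of raising keeps the import of the parent package from failing when the wheel is absent, which matches the "silently skipped" behavior described in item 1.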
A lazy __getattr__ hook provides deferred initialization of the global_settings singleton (GlobalSettings), which is created on first access.
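The deferred-initialization pattern described above relies on Python's module-level `__getattr__` (PEP 562), which is only called when normal attribute lookup fails. The sketch below shows the general technique on a throwaway module object. `DemoSettings`, `demo_pkg`, and the caching strategy are all assumptions for illustration; the real `GlobalSettings` class and the way cuML stores the instance may differ.

```python
import types


class DemoSettings:
    """Hypothetical stand-in for GlobalSettings."""

    instances = 0  # counts how many times the settings object is built

    def __init__(self):
        DemoSettings.instances += 1
        self.output_type = None


# A throwaway module object standing in for the package namespace.
demo = types.ModuleType("demo_pkg")


def _lazy_getattr(name):
    if name == "global_settings":
        settings = DemoSettings()
        # Cache on the module so later lookups succeed normally and
        # never reach this hook again (an assumption of this sketch).
        setattr(demo, name, settings)
        return settings
    raise AttributeError(f"module {demo.__name__} has no attribute {name}")


# PEP 562: a function named __getattr__ in the module namespace is used
# as the fallback for missing attributes.
demo.__getattr__ = _lazy_getattr

first = demo.global_settings   # triggers construction
second = demo.global_settings  # served from the cached attribute
print(first is second, DemoSettings.instances)
```

Nothing is constructed until the attribute is first read, so importing the package does not pay for settings that are never used.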
Usage
This module is automatically executed when a user writes import cuml. It sets up the GPU environment and provides the top-level namespace from which all cuML functionality is accessed.
Code Reference
Source Location
- Repository: Rapidsai_Cuml
- File: python/cuml/cuml/__init__.py
Signature
# Module-level initialization (no callable signature)
# Lazy attribute accessor for global_settings
def __getattr__(name):
    if name == "global_settings":
        ...
    raise AttributeError(f"module {__name__} has no attribute {name}")
Import
import cuml
# Access classes directly
model = cuml.LinearRegression()
kmeans = cuml.KMeans(n_clusters=5)
score = cuml.r2_score(y_true, y_pred)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (none) | -- | -- | This is a package initialization module. It takes no direct inputs. Importing cuml triggers the initialization. |
Outputs (Exported Symbols)
| Name | Type | Description |
|---|---|---|
| __version__ | str | The cuML package version string. |
| __git_commit__ | str | The git commit hash of the build. |
| global_settings | GlobalSettings | Lazily-initialized singleton for global cuML configuration (output type, verbosity, etc.). |
| (all classes/functions) | various | All major cuML estimators, metrics, dataset generators, and utilities listed in __all__. |
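One way to see which names a package exports is to read its `__all__` list, falling back to the non-underscore names in its namespace. The helper below is a generic sketch, not part of cuML: `public_names` is a hypothetical name, and the example runs against the standard-library `json` module so it works without a GPU environment.

```python
import importlib


def public_names(module_name):
    """Return the sorted public names of a module (hypothetical helper)."""
    module = importlib.import_module(module_name)
    names = getattr(module, "__all__", None)
    if names is None:
        # No __all__ declared: treat non-underscore names as public.
        names = [n for n in vars(module) if not n.startswith("_")]
    return sorted(names)


# Demonstrated on a standard-library module.
names = public_names("json")
print(names)
```

On a machine with cuML installed, `public_names("cuml")` would list the symbols declared in its `__all__`. Attributes served only through a lazy `__getattr__` hook appear in such a listing only if the package names them in `__all__`.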
Key Re-exported Classes and Functions
| Category | Symbols |
|---|---|
| Clustering | AgglomerativeClustering, DBSCAN, HDBSCAN, KMeans |
| Linear Models | LinearRegression, LogisticRegression, Ridge, Lasso, ElasticNet, MBSGDClassifier, MBSGDRegressor |
| Ensemble | RandomForestClassifier, RandomForestRegressor |
| Decomposition | PCA, TruncatedSVD, IncrementalPCA |
| Manifold | UMAP, TSNE |
| Neighbors | NearestNeighbors, KNeighborsClassifier, KNeighborsRegressor, KernelDensity |
| SVM | SVC, SVR, LinearSVC, LinearSVR |
| Time Series | ARIMA, AutoARIMA, ExponentialSmoothing |
| Explainability | KernelExplainer, PermutationExplainer, TreeExplainer |
| Metrics | accuracy_score, adjusted_rand_score, r2_score |
| Solvers | CD, QN, SGD |
| Other | ForestInference, KernelRidge, LabelEncoder, LedoitWolf, MultinomialNB, Base |
Usage Examples
import cuml
# Check version
print(cuml.__version__)
# Use a model directly from the top-level namespace
from cuml import KMeans, make_blobs
X, y = make_blobs(n_samples=1000, n_features=10, centers=5)
model = KMeans(n_clusters=5)
model.fit(X)
print("Cluster centers shape:", model.cluster_centers_.shape)
# Set global output type
cuml.set_global_output_type('numpy')