Implementation:Rapidsai Cuml Input To Cuml Array

Knowledge Sources	cuML cuML Docs
Domains	Machine_Learning, Data_Engineering
Last Updated	2026-02-08 00:00 GMT

Overview

Concrete tool for converting heterogeneous input data formats to GPU-resident CumlArray objects suitable for cuML clustering and other algorithms.

Description

The `input_to_cuml_array` function is the universal data conversion gateway in cuML. It accepts data in any supported format (cuDF DataFrame/Series, pandas DataFrame/Series, NumPy ndarray, CuPy ndarray, Numba CUDA device array, scipy/cupyx sparse matrices) and converts it to a GPU-resident `CumlArray` with validated dtype, memory layout, and contiguity.
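The three validated properties can be illustrated on the host with plain NumPy. The sketch below is only an analogy and is not cuML's implementation: it shows how dtype, memory layout, and contiguity appear on an ndarray, and how a strided view becomes contiguous once it is copied into a chosen layout.

```python
import numpy as np

# Host-side analogy only; cuML applies the equivalent checks to GPU memory.
X = np.arange(12, dtype=np.float64).reshape(3, 4)  # row-major (C order) by default

# Taking every second column yields a strided view that is
# neither C-contiguous nor F-contiguous.
view = X[:, ::2]
is_contiguous = view.flags["C_CONTIGUOUS"] or view.flags["F_CONTIGUOUS"]

# Forcing a layout copies the data into a contiguous buffer.
X_f = np.asfortranarray(view)    # column-major copy
X_f32 = X_f.astype(np.float32)   # explicit dtype conversion, layout preserved

print(is_contiguous, X_f.flags["F_CONTIGUOUS"], X_f32.dtype)
```

The copy made by `np.asfortranarray` corresponds loosely to what `force_contiguous=True` asks for when the input is a non-contiguous view.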

cuML estimators call this function internally on their inputs before fitting or prediction, so that the data reaches the algorithm on the GPU with the expected dtype and memory layout.

Usage

Called internally by cuML estimators during `fit()`, `predict()`, and `transform()`. Can also be used directly for manual data preparation before passing to multiple estimators.

Code Reference

input_to_cuml_array

Source Location

Repository: cuML
File: python/cuml/cuml/internals/input_utils.py
Lines: 265-374

Signature

def input_to_cuml_array(
    X,
    order="F",
    deepcopy=False,
    check_dtype=False,
    convert_to_dtype=False,
    check_mem_type=False,
    convert_to_mem_type="device",
    safe_dtype_conversion=True,
    check_cols=False,
    check_rows=False,
    fail_on_order=False,
    force_contiguous=True,
):

Import

from cuml.internals.input_utils import input_to_cuml_array

I/O Contract

Inputs

Name	Type	Required	Description
X	array-like	Yes	Input data: cuDF DataFrame/Series, pandas DataFrame/Series, NumPy ndarray, CuPy ndarray, Numba CUDA device array, scipy/cupyx sparse matrix.
order	str	No (default 'F')	Memory layout: 'F' (column-major, Fortran-style) or 'C' (row-major, C-style) or 'K' (keep existing).
deepcopy	bool	No (default False)	If True, always copy data. If False, copy only when necessary for conversion.
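check_dtype	dtype or False	No (default False)	Dtype that X is required to have; an error is raised on mismatch. False disables the check.
convert_to_dtype	dtype or False	No (default False)	Target dtype for conversion (e.g., np.float32). False means no conversion.
check_mem_type	str or False	No (default False)	Memory type that X is required to have. False disables the check.
convert_to_mem_type	str	No (default 'device')	Target memory type: 'device' (GPU) or 'host' (CPU).
safe_dtype_conversion	bool	No (default True)	If True, verify that a conversion requested through convert_to_dtype does not lose information.
check_cols	int or False	No (default False)	Expected number of columns in X. False disables the check.
check_rows	int or False	No (default False)	Expected number of rows in X. False disables the check.
fail_on_order	bool	No (default False)	If True, raise an error when X is not already in the requested order instead of converting it.
force_contiguous	bool	No (default True)	Ensure output array is contiguous in memory.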
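The `order` values follow the same convention as NumPy's layout options. As a rough host-side illustration (plain NumPy, not cuML code), the sketch below shows what 'C', 'F', and 'K' mean for a column-major input and which of them forces a copy.

```python
import numpy as np

# Host-side illustration of the layout options; not cuML code.
X = np.asfortranarray(np.random.rand(5, 3).astype(np.float32))  # column-major input

as_c = np.asarray(X, order="C")  # row-major: needs a copy for this input
as_f = np.asarray(X, order="F")  # column-major: already satisfied, no copy
as_k = np.asarray(X, order="K")  # keep whatever layout the input already has

print(as_c.flags["C_CONTIGUOUS"], np.shares_memory(X, as_f), as_k.flags["F_CONTIGUOUS"])
```

Requesting the layout the data already has avoids a copy, which is why matching `order` to what the downstream algorithm expects matters for large inputs.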

Outputs

Name	Type	Description
result	namedtuple	`cuml_array(array: CumlArray, n_rows: int, n_cols: int, dtype: np.dtype)`. The array resides in the memory type selected by `convert_to_mem_type` (GPU device memory by default).
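Because the result is a namedtuple, its fields can be read by name or unpacked positionally. The sketch below uses a hypothetical stand-in built with `collections.namedtuple` and a NumPy array in place of the `CumlArray`, purely to show the access pattern without requiring a GPU.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in with the field names documented above; the real
# function places a CumlArray in the `array` field, not an ndarray.
cuml_array = namedtuple("cuml_array", ["array", "n_rows", "n_cols", "dtype"])

X = np.zeros((1000, 10), dtype=np.float32)
result = cuml_array(array=X, n_rows=X.shape[0], n_cols=X.shape[1], dtype=X.dtype)

# Fields can be read by name...
print(result.n_rows, result.n_cols)

# ...or unpacked positionally in the documented order.
array, n_rows, n_cols, dtype = result
```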

Usage Examples

import numpy as np
from cuml.internals.input_utils import input_to_cuml_array

# Convert NumPy array to GPU CumlArray
X_np = np.random.rand(1000, 10).astype(np.float32)
result = input_to_cuml_array(X_np, order='C', convert_to_mem_type='device')
gpu_array = result.array  # CumlArray on GPU
n_rows = result.n_rows    # 1000
n_cols = result.n_cols     # 10

Related Pages

Implements Principle

Principle:Rapidsai_Cuml_Data_Preparation_For_Clustering

Requires Environment
