Implementation:Rapidsai Cuml KNN API
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Nearest_Neighbors |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Provides the C++ API for GPU-accelerated k-nearest neighbors operations in cuML, including brute-force KNN search, approximate KNN index building and querying, as well as KNN-based classification, regression, and class probability estimation.
Description
The knn.hpp header declares a comprehensive set of functions for performing k-nearest neighbors operations on NVIDIA GPUs. The API is organized into several functional groups:
Brute-Force KNN:
- brute_force_knn: Performs exact KNN search across multiple input arrays, combining results into unified output index and distance arrays. Supports configurable distance metrics, row/column-major layouts, and partition translation indices.
Random Ball Cover (RBC) Index:
- rbc_build_index: Builds a Random Ball Cover spatial index for efficient approximate nearest neighbor queries.
- rbc_knn_query: Queries the RBC index for nearest neighbors.
- rbc_free_index: Frees the device memory associated with an RBC index.
Approximate KNN (IVF-based):
- approx_knn_build_index: Builds an approximate KNN index using IVF-Flat or IVF-PQ parameters via FAISS.
- approx_knn_search: Searches an approximate KNN index for nearest neighbors.
- knnIndex, knnIndexParam, IVFParam, IVFFlatParam, IVFPQParam: Structs for configuring and storing index state.
KNN-based Supervised Learning:
- knn_classify: Performs KNN classification using precomputed KNN indices and label arrays, supporting multilabel classification.
- knn_regress: Performs KNN regression using precomputed KNN indices and target value arrays.
- knn_class_proba: Computes class probabilities from precomputed KNN indices.
All functions operate on device memory and use the RAFT handle for GPU resource management.
Usage
Use this API for nearest-neighbor search tasks on GPU. Choose brute-force KNN for exact results on smaller datasets, approximate KNN (IVF-Flat/IVF-PQ) for large-scale approximate search, or the RBC index for moderate-scale approximate search. The classification, regression, and probability functions are used after a KNN query to perform supervised learning based on neighbor labels.
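When weighing IVF-Flat against IVF-PQ, a rough back-of-the-envelope estimate can help. The sketch below is not part of cuML: `ivf_scan_fraction` and `pq_code_bytes` are hypothetical helpers that assume inverted lists of roughly equal size and the usual product-quantization encoding of M sub-quantizer codes with n_bits each. Real behavior depends on the data distribution and on the underlying index implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration only; they are not cuML functions.

// Approximate fraction of the indexed vectors scanned per query when
// nprobe of the nlist inverted lists are visited, assuming the lists
// hold roughly the same number of vectors.
inline double ivf_scan_fraction(int nlist, int nprobe)
{
  assert(nlist > 0 && nprobe > 0);
  if (nprobe > nlist) nprobe = nlist;  // cannot probe more lists than exist
  return static_cast<double>(nprobe) / static_cast<double>(nlist);
}

// Bytes needed to store one product-quantized vector:
// M sub-quantizer codes of n_bits each, rounded up to whole bytes.
inline std::size_t pq_code_bytes(int M, int n_bits)
{
  assert(M > 0 && n_bits > 0);
  return (static_cast<std::size_t>(M) * static_cast<std::size_t>(n_bits) + 7) / 8;
}
```

For example, under these assumptions nlist = 1024 with nprobe = 32 scans about 3% of the index per query, and M = 8 with n_bits = 8 stores each vector in 8 bytes instead of D * 4 bytes of raw floats.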
Code Reference
Source Location
- Repository: Rapidsai_Cuml
- File: cpp/include/cuml/neighbors/knn.hpp
Signature
namespace ML {
void brute_force_knn(const raft::handle_t& handle,
std::vector<float*>& input,
std::vector<int>& sizes,
int D,
float* search_items,
int n,
int64_t* res_I,
float* res_D,
int k,
bool rowMajorIndex = false,
bool rowMajorQuery = false,
ML::distance::DistanceType metric = ML::distance::DistanceType::L2Expanded,
float metric_arg = 2.0f,
std::vector<int64_t>* translations = nullptr);
void rbc_build_index(const raft::handle_t& handle,
std::uintptr_t& rbc_index,
float* X, int64_t n_rows, int64_t n_cols,
ML::distance::DistanceType metric);
void rbc_knn_query(const raft::handle_t& handle,
const std::uintptr_t& rbc_index,
uint32_t k, const float* search_items,
uint32_t n_search_items, int64_t dim,
int64_t* out_inds, float* out_dists);
void rbc_free_index(std::uintptr_t rbc_index);
struct knnIndex {
knnIndex();
~knnIndex();
ML::distance::DistanceType metric;
float metricArg;
int nprobe;
int device;
std::unique_ptr<knnIndexImpl> pimpl;
};
struct knnIndexParam { virtual ~knnIndexParam() {} };
struct IVFParam : knnIndexParam { int nlist; int nprobe; };
struct IVFFlatParam : IVFParam {};
struct IVFPQParam : IVFParam { int M; int n_bits; bool usePrecomputedTables; };
void approx_knn_build_index(raft::handle_t& handle, knnIndex* index,
knnIndexParam* params,
ML::distance::DistanceType metric, float metricArg,
float* index_array, int n, int D);
void approx_knn_search(raft::handle_t& handle, float* distances,
int64_t* indices, knnIndex* index, int k,
float* query_array, int n);
void knn_classify(raft::handle_t& handle, int* out, int64_t* knn_indices,
std::vector<int*>& y, size_t n_index_rows,
size_t n_query_rows, int k, float* sample_weight = nullptr);
void knn_regress(raft::handle_t& handle, float* out, int64_t* knn_indices,
std::vector<float*>& y, size_t n_index_rows,
size_t n_query_rows, int k, float* sample_weight = nullptr);
void knn_class_proba(raft::handle_t& handle, std::vector<float*>& out,
int64_t* knn_indices, std::vector<int*>& y,
size_t n_index_rows, size_t n_query_rows, int k,
float* sample_weight = nullptr);
} // namespace ML
Import
#include <cuml/neighbors/knn.hpp>
I/O Contract
Inputs
brute_force_knn
| Name | Type | Required | Description |
|---|---|---|---|
| handle | const raft::handle_t& | Yes | RAFT handle for GPU resources |
| input | std::vector<float*>& | Yes | Vector of device pointers to index arrays |
| sizes | std::vector<int>& | Yes | Vector of row counts for each input array |
| D | int | Yes | Dimensionality of the data |
| search_items | float* | Yes | Device pointer to query array [n x D] |
| n | int | Yes | Number of query rows |
| res_I | int64_t* | Yes | Output device pointer for result indices [n x k] |
| res_D | float* | Yes | Output device pointer for result distances [n x k] |
| k | int | Yes | Number of nearest neighbors |
| rowMajorIndex | bool | No | Whether index arrays are row-major (default: false) |
| rowMajorQuery | bool | No | Whether query array is row-major (default: false) |
| metric | ML::distance::DistanceType | No | Distance metric (default: L2Expanded) |
| metric_arg | float | No | Metric argument for Minkowski distances (default: 2.0) |
| translations | std::vector<int64_t>* | No | Translation IDs for non-contiguous partitions (default: nullptr) |
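To make the roles of input, sizes, and translations concrete, here is a minimal CPU reference sketch of an exact search over several partitions. It is not cuML's implementation: `brute_force_knn_reference` is a hypothetical helper, it uses squared Euclidean distance and row-major data for simplicity, and it assumes that each translation is the global ID of the first row of its partition, with cumulative row offsets used when no translations are given.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical CPU reference, not cuML code.
// parts[p] points to sizes[p] rows of D floats, stored row-major.
// queries holds n rows of D floats, stored row-major.
// res_I and res_D are caller-allocated arrays of size n * k.
void brute_force_knn_reference(const std::vector<const float*>& parts,
                               const std::vector<int>& sizes, int D,
                               const float* queries, int n,
                               std::int64_t* res_I, float* res_D, int k,
                               const std::vector<std::int64_t>* translations = nullptr)
{
  assert(parts.size() == sizes.size());
  for (int q = 0; q < n; ++q) {
    std::vector<std::pair<float, std::int64_t>> candidates;
    std::int64_t running_offset = 0;
    for (std::size_t p = 0; p < parts.size(); ++p) {
      // Assumption: a translation is the global ID of the partition's first row.
      const std::int64_t base = translations ? (*translations)[p] : running_offset;
      for (int r = 0; r < sizes[p]; ++r) {
        float dist = 0.0f;
        for (int j = 0; j < D; ++j) {
          const float diff = parts[p][r * D + j] - queries[q * D + j];
          dist += diff * diff;  // squared Euclidean distance
        }
        candidates.emplace_back(dist, base + r);
      }
      running_offset += sizes[p];
    }
    assert(k > 0 && static_cast<std::size_t>(k) <= candidates.size());
    // Keep the k smallest distances, in ascending order.
    std::partial_sort(candidates.begin(), candidates.begin() + k, candidates.end());
    for (int i = 0; i < k; ++i) {
      res_D[q * k + i] = candidates[i].first;
      res_I[q * k + i] = candidates[i].second;
    }
  }
}
```

The point of the sketch is the indexing: every query produces k entries in both output arrays, and the reported indices refer to the combined index rather than to a row within a single partition.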
knn_classify
| Name | Type | Required | Description |
|---|---|---|---|
| handle | raft::handle_t& | Yes | RAFT handle |
| out | int* | Yes | Output device array for predicted labels [n_query_rows] |
| knn_indices | int64_t* | Yes | Device array of KNN indices [n_query_rows x k] |
| y | std::vector<int*>& | Yes | Vector of label arrays on device, one per output |
| n_index_rows | size_t | Yes | Number of rows in the index (size of each y array) |
| n_query_rows | size_t | Yes | Number of query samples |
| k | int | Yes | Number of nearest neighbors |
| sample_weight | float* | No | Optional sample weights [n_query_rows x k] (default: nullptr) |
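The relationship between knn_indices, y, and out can be illustrated with a small CPU sketch of an unweighted majority vote for a single label array. `knn_classify_reference` is a hypothetical helper, not the cuML kernel; in particular, breaking ties in favor of the smallest label is an assumption of this sketch, and sample weights are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical CPU sketch, not cuML code.
// knn_indices: [n_query_rows x k], row-major, values index into y.
// y: one label per index row.
std::vector<int> knn_classify_reference(const std::vector<std::int64_t>& knn_indices,
                                        const std::vector<int>& y,
                                        std::size_t n_query_rows, int k)
{
  assert(k > 0);
  assert(knn_indices.size() == n_query_rows * static_cast<std::size_t>(k));
  std::vector<int> out(n_query_rows);
  for (std::size_t q = 0; q < n_query_rows; ++q) {
    std::map<int, int> votes;  // label -> number of neighbors with that label
    for (int i = 0; i < k; ++i) {
      const std::int64_t idx = knn_indices[q * k + i];
      assert(idx >= 0 && static_cast<std::size_t>(idx) < y.size());
      ++votes[y[static_cast<std::size_t>(idx)]];
    }
    // std::map iterates in ascending label order, so with a strict
    // comparison ties resolve to the smallest label (sketch assumption).
    int best_label = votes.begin()->first;
    int best_count = votes.begin()->second;
    for (const auto& kv : votes) {
      if (kv.second > best_count) {
        best_label = kv.first;
        best_count = kv.second;
      }
    }
    out[q] = best_label;
  }
  return out;
}
```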
Outputs
| Name | Type | Description |
|---|---|---|
| res_I | int64_t* | brute_force_knn: indices of the k nearest neighbors, [n x k] |
| res_D | float* | brute_force_knn: distances to the k nearest neighbors, [n x k] |
| out (knn_classify) | int* | Predicted class labels, [n_query_rows] |
| out (knn_regress) | float* | Predicted regression values |
| out (knn_class_proba) | std::vector<float*>& | Class probability arrays, one per output |
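The regression and class-probability outputs can be read as simple reductions over the neighbors of each query. The CPU sketch below shows one plausible unweighted version: the mean of the neighbor targets, and the fraction of neighbors per class. Both functions are hypothetical helpers, not cuML code; the sketch assumes labels are integers in [0, n_classes) and returns probabilities as a row-major [n_query_rows x n_classes] array, which is a choice made here rather than a documented layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU sketches, not cuML code.

// Mean of the k neighbor targets for each query.
std::vector<float> knn_regress_reference(const std::vector<std::int64_t>& knn_indices,
                                         const std::vector<float>& y,
                                         std::size_t n_query_rows, int k)
{
  assert(k > 0);
  assert(knn_indices.size() == n_query_rows * static_cast<std::size_t>(k));
  std::vector<float> out(n_query_rows, 0.0f);
  for (std::size_t q = 0; q < n_query_rows; ++q) {
    float sum = 0.0f;
    for (int i = 0; i < k; ++i) {
      sum += y[static_cast<std::size_t>(knn_indices[q * k + i])];
    }
    out[q] = sum / static_cast<float>(k);
  }
  return out;
}

// Fraction of the k neighbors that fall in each class.
// Assumes labels are integers in [0, n_classes).
std::vector<float> knn_class_proba_reference(const std::vector<std::int64_t>& knn_indices,
                                             const std::vector<int>& y,
                                             std::size_t n_query_rows, int k,
                                             int n_classes)
{
  assert(k > 0 && n_classes > 0);
  assert(knn_indices.size() == n_query_rows * static_cast<std::size_t>(k));
  std::vector<float> proba(n_query_rows * static_cast<std::size_t>(n_classes), 0.0f);
  for (std::size_t q = 0; q < n_query_rows; ++q) {
    for (int i = 0; i < k; ++i) {
      const int label = y[static_cast<std::size_t>(knn_indices[q * k + i])];
      assert(label >= 0 && label < n_classes);
      proba[q * n_classes + label] += 1.0f;  // count neighbors per class
    }
    for (int c = 0; c < n_classes; ++c) {
      proba[q * n_classes + c] /= static_cast<float>(k);  // counts -> fractions
    }
  }
  return proba;
}
```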
Usage Examples
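#include <cuml/neighbors/knn.hpp>
#include <cstdint>
#include <vector>

raft::handle_t handle;

// Assumed to exist already, allocated and filled on the device by the caller:
// d_index_part1 [5000 x D], d_index_part2 [3000 x D], d_query [n_queries x D],
// d_labels [8000] (one label per index row).

// Brute force KNN search
std::vector<float*> index_arrays = {d_index_part1, d_index_part2};
std::vector<int> sizes = {5000, 3000};
int D = 128;
int n_queries = 100;
int k = 10;
int64_t* d_indices;   // must point to device memory of size [n_queries x k]
float* d_distances;   // must point to device memory of size [n_queries x k]
ML::brute_force_knn(handle, index_arrays, sizes, D,
                    d_query, n_queries, d_indices, d_distances, k);

// KNN classification using the results
std::vector<int*> labels = {d_labels};
int* d_predictions;   // must point to device memory of size [n_queries]
size_t n_index_rows = 5000 + 3000;  // total rows across both index partitions
ML::knn_classify(handle, d_predictions, d_indices, labels,
                 n_index_rows, n_queries, k);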
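The brute_force_knn call above omits rowMajorIndex and rowMajorQuery, so both take their default of false. If the host data is stored row by row, one option is to pass true for those flags; another is to convert the buffer before copying it to the device. A minimal host-side conversion might look like the sketch below, where `row_major_to_col_major` is a hypothetical helper and the reading of false as column-major is an assumption based on the parameter descriptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side helper, not part of cuML.
// Converts an n_rows x n_cols matrix from row-major to column-major order.
std::vector<float> row_major_to_col_major(const std::vector<float>& src,
                                          std::size_t n_rows, std::size_t n_cols)
{
  assert(src.size() == n_rows * n_cols);
  std::vector<float> dst(src.size());
  for (std::size_t i = 0; i < n_rows; ++i) {
    for (std::size_t j = 0; j < n_cols; ++j) {
      // Element (i, j): row-major offset i * n_cols + j,
      // column-major offset j * n_rows + i.
      dst[j * n_rows + i] = src[i * n_cols + j];
    }
  }
  return dst;
}
```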