Implementation:Rapidsai Cuml Kernel SHAP
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Explainability |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Generates, on the GPU, the sample dataset required by the Kernel SHAP (SHapley Additive exPlanations) algorithm, which produces model-agnostic feature importance explanations.
Description
The ML::Explainer::kernel_dataset function constructs the combinatorial dataset required by the Kernel SHAP algorithm. Given a binary mask matrix X (indicating which features to take from the observation vs. the background), a background dataset, and an observation row, it produces a "scattered" dataset where each row is a combination of observation and background feature values according to the mask.
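The scatter described above can be sketched as a small CPU reference. This is an illustration only, not the cuML kernel: the function name scatter_reference is hypothetical, and the output ordering (each mask row owning a contiguous group of nrows_background output rows) is an assumption based on the documented dataset shape.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, not part of cuML: builds the scattered dataset
// for a fully specified (exact) mask. All matrices are row-major.
std::vector<float> scatter_reference(const std::vector<float>& X,            // [nrows_X x ncols], entries 0 or 1
                                     const std::vector<float>& background,   // [nrows_background x ncols]
                                     const std::vector<float>& observation,  // [ncols]
                                     int nrows_X, int nrows_background, int ncols)
{
  assert(X.size() == static_cast<std::size_t>(nrows_X) * ncols);
  assert(background.size() == static_cast<std::size_t>(nrows_background) * ncols);
  assert(observation.size() == static_cast<std::size_t>(ncols));

  std::vector<float> dataset(static_cast<std::size_t>(nrows_X) * nrows_background * ncols);
  for (int i = 0; i < nrows_X; ++i) {
    for (int j = 0; j < nrows_background; ++j) {
      // Assumed layout: mask row i owns output rows
      // [i * nrows_background, (i + 1) * nrows_background).
      std::size_t out_row = static_cast<std::size_t>(i) * nrows_background + j;
      for (int c = 0; c < ncols; ++c) {
        // Mask value 1 takes the feature from the observation, 0 from the background row.
        bool from_observation = X[static_cast<std::size_t>(i) * ncols + c] != 0.0f;
        dataset[out_row * ncols + c] =
          from_observation ? observation[c] : background[static_cast<std::size_t>(j) * ncols + c];
      }
    }
  }
  return dataset;
}
```

For example, with background = [[0, 1, 2], [3, 4, 5]], observation = [100, 101, 102], and X = [[1, 0, 1], [0, 1, 1]], this sketch returns [[100, 1, 102], [100, 4, 102], [0, 101, 102], [3, 101, 102]].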
The function handles both the exact part of the Kernel SHAP dataset (where each mask row is fully specified by the caller) and the sampled part (where k entries of a mask row are selected at random). The nsamples array gives k for each sampled row of the mask, and maxsample is the largest value in nsamples.
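To make the sampled part concrete, the following CPU sketch fills one mask row with k randomly placed ones, where k is that row's entry in nsamples. It is illustrative only: sample_mask_row is a hypothetical name, the generator is the standard library's std::mt19937_64 rather than whatever RNG the GPU kernel uses, and it assumes sampled entries are marked with 1 (taken from the observation).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical CPU sketch, not the cuML kernel: returns a mask row with
// exactly k ones at randomly chosen feature positions and zeros elsewhere.
std::vector<float> sample_mask_row(int ncols, int k, std::uint64_t seed)
{
  assert(k >= 0 && k <= ncols);

  // Start from the feature indices 0..ncols-1 and shuffle them.
  std::vector<int> idx(ncols);
  std::iota(idx.begin(), idx.end(), 0);
  std::mt19937_64 rng(seed);  // assumption: any seeded generator; cuML's RNG may differ
  std::shuffle(idx.begin(), idx.end(), rng);

  // The first k shuffled indices become the sampled features.
  std::vector<float> row(ncols, 0.0f);
  for (int i = 0; i < k; ++i) {
    row[idx[i]] = 1.0f;
  }
  return row;
}
```

Calling sample_mask_row(4, 3, 42ULL) yields a row of four entries with three ones and one zero. Which positions are set depends on the generator, so the exact pattern will not match the GPU output for the same seed.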
Each block in the GPU kernel handles one row of X: it scatters the observation's values into the nrows_background rows of the output dataset that correspond to that mask row, keeping the background values wherever the mask is 0.
Usage
Use this function as part of a Kernel SHAP pipeline to generate the perturbation dataset on the GPU. After generating the dataset, pass it through the model to obtain predictions, then compute SHAP values from the prediction differences. Building the dataset on the GPU speeds up what is typically an expensive step of Kernel SHAP.
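As a rough illustration of the step that follows dataset generation, Kernel SHAP implementations commonly average the model's predictions over the background rows belonging to each mask row, giving one expected output per mask row before the SHAP values are fitted. The sketch below shows that averaging on the CPU. It is not part of kernel_dataset: the function name is hypothetical, and it assumes one scalar prediction per dataset row, grouped contiguously by mask row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical post-processing step, not part of kernel_dataset: averages
// model outputs over the background rows that belong to each mask row.
std::vector<float> mean_prediction_per_mask_row(const std::vector<float>& predictions,  // [nrows_X * nrows_background]
                                                int nrows_X, int nrows_background)
{
  assert(nrows_background > 0);
  assert(predictions.size() == static_cast<std::size_t>(nrows_X) * nrows_background);

  std::vector<float> means(nrows_X, 0.0f);
  for (int i = 0; i < nrows_X; ++i) {
    float sum = 0.0f;
    for (int j = 0; j < nrows_background; ++j) {
      // Assumed layout: predictions for mask row i are stored contiguously.
      sum += predictions[static_cast<std::size_t>(i) * nrows_background + j];
    }
    means[i] = sum / static_cast<float>(nrows_background);
  }
  return means;
}
```

With predictions = [1, 3, 10, 20], nrows_X = 2, and nrows_background = 2, the result is [2, 15].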
Code Reference
Source Location
- Repository: Rapidsai_Cuml
- File: cpp/include/cuml/explainer/kernel_shap.hpp
Signature
namespace ML {
namespace Explainer {

void kernel_dataset(const raft::handle_t& handle,
                    float* X,
                    int nrows_X,
                    int ncols,
                    float* background,
                    int nrows_background,
                    float* dataset,
                    float* observation,
                    int* nsamples,
                    int len_nsamples,
                    int maxsample,
                    uint64_t seed = 0ULL);

}  // namespace Explainer
}  // namespace ML
Import
#include <cuml/explainer/kernel_shap.hpp>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| handle | const raft::handle_t& | Yes | cuML handle for GPU resource management |
| X | float* | Yes (inout) | Binary mask matrix on device [nrows_X x ncols], row-major; modified in-place for sampled rows |
| nrows_X | int | Yes | Number of rows in X (number of mask combinations) |
| ncols | int | Yes | Number of columns (features) shared by X, background, and dataset |
| background | float* | Yes | Background dataset on device [nrows_background x ncols] |
| nrows_background | int | Yes | Number of rows in the background dataset |
| observation | float* | Yes | The observation row to explain on device [ncols] |
| nsamples | int* | Yes | Array specifying number of features to randomly sample per mask row [len_nsamples] |
| len_nsamples | int | Yes | Number of entries in the nsamples array |
| maxsample | int | Yes | Size of the largest sample in nsamples |
| seed | uint64_t | No (default 0) | Seed for the random number generator |
Outputs
| Name | Type | Description |
|---|---|---|
| dataset | float* | Device pointer to the generated Kernel SHAP dataset [(nrows_X * nrows_background) x ncols], row-major |
| X | float* | Modified binary mask matrix (updated in-place for sampled rows) |
Usage Examples
#include <cuml/explainer/kernel_shap.hpp>
#include <raft/core/handle.hpp>
#include <cuda_runtime.h>
#include <cstddef>

void run_kernel_shap() {
  raft::handle_t handle;
  int ncols = 4;
  int nrows_X = 2;
  int nrows_background = 2;

  // Element counts, computed in size_t so large inputs do not overflow int
  size_t n_mask       = static_cast<size_t>(nrows_X) * ncols;
  size_t n_background = static_cast<size_t>(nrows_background) * ncols;
  size_t n_dataset    = static_cast<size_t>(nrows_X) * nrows_background * ncols;

  // Allocate device memory (error checking omitted for brevity)
  float* X;            // binary mask [nrows_X x ncols]
  float* background;   // background data [nrows_background x ncols]
  float* dataset;      // output [(nrows_X * nrows_background) x ncols]
  float* observation;  // single observation [ncols]
  int* nsamples;       // number of features to sample per row [nrows_X]
  cudaMalloc(&X, n_mask * sizeof(float));
  cudaMalloc(&background, n_background * sizeof(float));
  cudaMalloc(&dataset, n_dataset * sizeof(float));
  cudaMalloc(&observation, ncols * sizeof(float));
  cudaMalloc(&nsamples, nrows_X * sizeof(int));

  // Initialize X, background, observation, nsamples on device...

  ML::Explainer::kernel_dataset(handle,
                                X, nrows_X, ncols,
                                background, nrows_background,
                                dataset, observation,
                                nsamples, nrows_X,
                                3,       // maxsample
                                42ULL);  // seed
  // Wait for the work queued on the handle's stream to finish
  handle.sync_stream();

  // Pass dataset through model to get predictions, then compute SHAP values...

  cudaFree(X);
  cudaFree(background);
  cudaFree(dataset);
  cudaFree(observation);
  cudaFree(nsamples);
}