Implementation:Rapidsai Cuml Confusion Matrix
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Classification, Model_Evaluation |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Computes a confusion matrix on the GPU to evaluate the accuracy of a classification model.
Description
The confusion_matrix function builds a square confusion matrix C of shape (n_classes, n_classes) where entry C[i, j] is the number of observations known to be in class i and predicted to be in class j. It uses CuPy sparse COO matrices (cupyx.scipy.sparse.coo_matrix) for efficient construction, then converts to a dense array.
Key features:
- Label handling -- When labels is None, labels are inferred from the sorted union of y_true and y_pred. A custom labels array can be passed to reorder or subset the matrix.
- Sample weights -- An optional sample_weight array replaces the default unit counts.
- Normalization -- The normalize parameter supports "true" (row-normalize), "pred" (column-normalize), "all" (total-normalize), or None (raw counts).
- Label monotonization -- Internally uses cuml.prims.label.make_monotonic to map arbitrary integer labels to contiguous indices.
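The counting semantics described above can be illustrated with a small CPU reference in plain Python. This is a minimal sketch, not cuML's implementation: the name confusion_matrix_reference is hypothetical, a dictionary keyed by (row, col) stands in for the sparse COO matrix, and dropping samples whose labels are not in labels is an assumption modeled on scikit-learn's behavior.

```python
from collections import defaultdict


def confusion_matrix_reference(y_true, y_pred, labels=None, sample_weight=None):
    """CPU reference for the counting semantics (hypothetical helper, not cuML code)."""
    if labels is None:
        # Infer labels as the sorted union of both inputs.
        labels = sorted(set(y_true) | set(y_pred))
    # Map arbitrary integer labels to contiguous indices 0..n_classes-1,
    # which is the role label monotonization plays on the GPU.
    index_of = {label: i for i, label in enumerate(labels)}
    if sample_weight is None:
        sample_weight = [1] * len(y_true)  # default unit counts

    # Accumulate weights per (row, col) coordinate. Duplicate coordinates are
    # summed, as happens when a COO matrix is converted to a dense array.
    coo = defaultdict(int)
    for t, p, w in zip(y_true, y_pred, sample_weight):
        # Assumption: samples with a label outside `labels` are skipped.
        if t in index_of and p in index_of:
            coo[(index_of[t], index_of[p])] += w

    n = len(labels)
    return [[coo.get((i, j), 0) for j in range(n)] for i in range(n)]


y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
cm = confusion_matrix_reference(y_true, y_pred)
cm_subset = confusion_matrix_reference(y_true, y_pred, labels=[2, 0])
```

Here cm uses the inferred labels [0, 1, 2], while cm_subset reorders the classes to [2, 0] and omits class 1 entirely.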
Usage
Use this function after classification to understand which classes are being confused. The confusion matrix is a foundation for computing precision, recall, F1-score, and other per-class metrics. It is commonly visualized as a heatmap.
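As an illustration of how per-class metrics follow from the matrix, the sketch below derives precision, recall, and F1 from raw counts. It assumes the matrix has already been copied to the host as nested lists; per_class_metrics is a hypothetical helper, not part of cuML, and returning 0.0 when a denominator is zero is a convention chosen for this example.

```python
def per_class_metrics(cm):
    """Return a (precision, recall, f1) tuple per class from a square count matrix."""
    n = len(cm)
    metrics = []
    for k in range(n):
        tp = cm[k][k]                                # correctly predicted as class k
        actual = sum(cm[k])                          # row sum: true class is k
        predicted = sum(cm[i][k] for i in range(n))  # column sum: predicted class is k
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics.append((precision, recall, f1))
    return metrics


# Counts for y_true = [2, 0, 2, 2, 0, 1] and y_pred = [0, 0, 2, 2, 0, 2].
cm = [[2, 0, 0], [0, 0, 1], [1, 0, 2]]
metrics = per_class_metrics(cm)
```

Rows give recall (how many samples of a true class were recovered) and columns give precision (how many predictions of a class were correct), which is why the orientation C[i, j] = true i, predicted j matters.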
Code Reference
Source Location
- Repository: Rapidsai_Cuml
- File:
python/cuml/cuml/metrics/confusion_matrix.py
Signature
def confusion_matrix(
y_true,
y_pred,
labels=None,
sample_weight=None,
normalize=None,
convert_dtype=False,
) -> CumlArray
Import
from cuml.metrics import confusion_matrix
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| y_true | array-like (device or host) of shape (n_samples,) | Yes | Ground truth (correct) target labels. Must be integer type (int32 or int64). |
| y_pred | array-like (device or host) of shape (n_samples,) | Yes | Predicted target labels. Must be integer type (int32 or int64). |
| labels | array-like of shape (n_classes,) | No | List of label indices for the matrix. If None, inferred from the union of y_true and y_pred. |
| sample_weight | array-like of shape (n_samples,) | No | Per-sample weights. Default is uniform (1.0). |
| normalize | str in ['true', 'pred', 'all'] or None | No | Normalization mode. "true" normalizes over rows, "pred" over columns, "all" over the entire matrix. Default None (raw counts). |
| convert_dtype | bool | No | When True, automatically convert inputs to int32. Default False. |
Outputs
| Name | Type | Description |
|---|---|---|
| C | CumlArray of shape (n_classes, n_classes) | The confusion matrix. Entry C[i, j] is the count (or normalized proportion) of samples with true label i predicted as label j. |
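The three normalization modes differ only in the denominator applied to each entry. The following plain-Python sketch shows that arithmetic on host data; normalize_matrix is a hypothetical helper, and mapping a zero denominator to 0.0 is an assumption made for this example, not a statement about how cuML handles empty rows or columns.

```python
def normalize_matrix(cm, normalize=None):
    """Apply "true", "pred", or "all" normalization to a square count matrix."""
    n = len(cm)
    if normalize is None:
        return [row[:] for row in cm]  # raw counts, copied
    if normalize == "true":
        # Each entry is divided by the sum of its row (true-label totals).
        denoms = [[sum(cm[i])] * n for i in range(n)]
    elif normalize == "pred":
        # Each entry is divided by the sum of its column (predicted-label totals).
        col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
        denoms = [col_sums[:] for _ in range(n)]
    elif normalize == "all":
        # Each entry is divided by the grand total.
        total = sum(sum(row) for row in cm)
        denoms = [[total] * n for _ in range(n)]
    else:
        raise ValueError("normalize must be 'true', 'pred', 'all', or None")
    # Assumption: a zero denominator produces 0.0 instead of NaN.
    return [
        [cm[i][j] / denoms[i][j] if denoms[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


cm = [[2, 0, 0], [0, 0, 1], [1, 0, 2]]
by_true = normalize_matrix(cm, "true")
by_pred = normalize_matrix(cm, "pred")
by_all = normalize_matrix(cm, "all")
```

With "true", every non-empty row sums to 1, so the diagonal reads as per-class recall; with "pred", every non-empty column sums to 1, so the diagonal reads as per-class precision.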
Usage Examples
import cupy as cp
from cuml.metrics import confusion_matrix
y_true = cp.array([2, 0, 2, 2, 0, 1])
y_pred = cp.array([0, 0, 2, 2, 0, 2])
# Basic confusion matrix
cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[2, 0, 0],
# [0, 0, 1],
# [1, 0, 2]]
# Normalized over true labels (rows)
cm_norm = confusion_matrix(y_true, y_pred, normalize='true')
print(cm_norm)