Implementation:Rapidsai Cuml Classification Metrics
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Classification, Model_Evaluation |
| Last Updated | 2026-02-08 12:00 GMT |
Overview
Provides GPU-accelerated classification evaluation metrics including accuracy score and log loss (cross-entropy loss).
Description
The _classification.py module implements two core classification metrics on the GPU:
- accuracy_score -- Computes the fraction (or count) of correctly classified samples. Supports optional sample weights and a normalize flag that controls whether the result is a fraction or an absolute count. It handles both CuPy arrays and cuDF Series inputs, including categorical dtype alignment.
- log_loss -- Computes the negative log-likelihood (cross-entropy) loss, the standard loss function for logistic regression and neural networks. Supports binary and multiclass problems, probability clipping via an eps parameter, and optional sample weights with a normalize toggle.
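The interaction between the normalize flag and sample weights in accuracy_score can be sketched on the CPU with NumPy. This is a minimal reference sketch, not cuML's implementation: the function name accuracy_reference is hypothetical, and the weighting rule (weighted average when normalize=True, weighted sum otherwise) is assumed to follow the scikit-learn convention.

```python
import numpy as np


def accuracy_reference(y_true, y_pred, *, sample_weight=None, normalize=True):
    """Hypothetical CPU reference for the accuracy semantics (not cuML code)."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same length")

    # 1.0 where the prediction matches the label, 0.0 otherwise.
    correct = (y_true == y_pred).astype(np.float64)

    # Assumption: with no weights, every sample counts equally.
    if sample_weight is None:
        weights = np.ones_like(correct)
    else:
        weights = np.asarray(sample_weight, dtype=np.float64).ravel()

    if normalize:
        # Weighted fraction of correct predictions.
        return float(np.average(correct, weights=weights))
    # Weighted count of correct predictions.
    return float(np.dot(correct, weights))


y_true = np.array([0, 1, 2, 3])
y_pred = np.array([0, 2, 1, 3])

frac = accuracy_reference(y_true, y_pred)                    # 0.5
count = accuracy_reference(y_true, y_pred, normalize=False)  # 2.0
# Samples 0 and 3 are correct; their weights are 3 and 1 out of a total of 6.
weighted = accuracy_reference(y_true, y_pred, sample_weight=[3, 1, 1, 1])
```

With unit weights the two modes differ only by a factor of n_samples; with non-uniform weights the normalized result is the share of total weight carried by correctly classified samples.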
An internal helper _input_to_cupy_or_cudf_series coerces heterogeneous inputs (cuDF Series, CuPy, NumPy, Numba) into a uniform 1-D representation, handling edge cases like string labels (via cuDF) and float16 (via CuPy).
Usage
Use these functions to evaluate classification model quality on GPU data. They are intended as drop-in replacements for the corresponding scikit-learn metrics and accept both device arrays and host arrays. They are typically called after model prediction: accuracy_score to measure predictive accuracy, and log_loss to assess the quality of the predicted probabilities.
Code Reference
Source Location
- Repository: Rapidsai_Cuml
- File: python/cuml/cuml/metrics/_classification.py
Signature
def accuracy_score(y_true, y_pred, *, sample_weight=None, normalize=True)
def log_loss(
    y_true, y_pred, eps=1e-15, normalize=True, sample_weight=None
) -> float
Import
from cuml.metrics import accuracy_score
from cuml.metrics import log_loss
I/O Contract
Inputs -- accuracy_score
| Name | Type | Required | Description |
|---|---|---|---|
| y_true | array-like of shape (n_samples,) | Yes | Ground truth (correct) labels. |
| y_pred | array-like of shape (n_samples,) | Yes | Predicted labels, as returned by a classifier. |
| sample_weight | array-like of shape (n_samples,) | No | Per-sample weights. Default None. |
| normalize | bool | No | If True (default), return fraction of correct predictions; if False, return the count. |
Inputs -- log_loss
| Name | Type | Required | Description |
|---|---|---|---|
| y_true | array-like of shape (n_samples,) | Yes | True labels (integer-valued, non-negative). |
| y_pred | array-like of shape (n_samples, n_classes) or (n_samples,) | Yes | Predicted probabilities or decision values. |
| eps | float | No | Clipping bound for probabilities to avoid log(0). Default 1e-15. |
| normalize | bool | No | If True (default), return mean loss per sample; otherwise return total sum. |
| sample_weight | array-like of shape (n_samples,) | No | Per-sample weights. Default None. |
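How eps, normalize, and sample_weight combine in log_loss can be illustrated with a NumPy sketch. This is a hedged CPU reference, not cuML's implementation: log_loss_reference is a hypothetical name, labels are assumed to be integers in 0..n_classes-1 that index the columns of y_pred, a 1-D y_pred is assumed to hold the positive-class probability, and the row renormalization after clipping is an assumption borrowed from the classic scikit-learn behavior.

```python
import numpy as np


def log_loss_reference(y_true, y_pred, eps=1e-15, normalize=True,
                       sample_weight=None):
    """Hypothetical CPU reference for the log-loss semantics (not cuML code)."""
    y_true = np.asarray(y_true).ravel().astype(np.int64)
    proba = np.asarray(y_pred, dtype=np.float64)

    # Assumption: a 1-D input holds P(class 1) for a binary problem.
    if proba.ndim == 1:
        proba = np.column_stack([1.0 - proba, proba])

    # Clip away exact 0 and 1 so that log() stays finite, then renormalize rows.
    proba = np.clip(proba, eps, 1.0 - eps)
    proba = proba / proba.sum(axis=1, keepdims=True)

    # Probability assigned to the true class of each sample.
    picked = proba[np.arange(y_true.shape[0]), y_true]
    per_sample = -np.log(picked)

    if sample_weight is None:
        weights = np.ones_like(per_sample)
    else:
        weights = np.asarray(sample_weight, dtype=np.float64).ravel()

    total = float(np.dot(per_sample, weights))
    # normalize=True returns the weighted mean, otherwise the weighted sum.
    return total / float(weights.sum()) if normalize else total


labels = np.array([1, 0, 0, 1])
proba_2d = np.array([[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]])

mean_loss = log_loss_reference(labels, proba_2d)                   # ~0.2162
sum_loss = log_loss_reference(labels, proba_2d, normalize=False)   # ~0.8646
binary_1d = log_loss_reference(labels, np.array([0.9, 0.1, 0.2, 0.65]))
# A confident but wrong prediction stays finite thanks to clipping.
clipped = log_loss_reference([1], [[1.0, 0.0]])
```

Without clipping, the last call would evaluate log(0) and return infinity; with eps=1e-15 the penalty is large but finite.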
Outputs
| Name | Type | Description |
|---|---|---|
| score (accuracy_score) | float | Fraction of correctly classified samples, or count if normalize=False. |
| loss (log_loss) | float | The (mean or summed) cross-entropy loss. |
Usage Examples
import cupy as cp
from cuml.metrics import accuracy_score, log_loss
# Accuracy score
y_true = cp.array([0, 1, 2, 3])
y_pred = cp.array([0, 2, 1, 3])
print(accuracy_score(y_true, y_pred)) # 0.5
# Log loss (binary classification with probability predictions)
y_true = cp.array([1, 0, 0, 1])
y_pred_proba = cp.array([[0.1, 0.9], [0.9, 0.1], [0.8, 0.2], [0.35, 0.65]])
print(log_loss(y_true, y_pred_proba)) # ~0.2162
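The log-loss value in the example can be checked by hand. For each sample, take the predicted probability of its true class (0.9, 0.9, 0.8, and 0.65 respectively) and average the negative logarithms:

```latex
\begin{aligned}
L &= -\frac{1}{4}\left(\ln 0.9 + \ln 0.9 + \ln 0.8 + \ln 0.65\right) \\
  &\approx \frac{0.1054 + 0.1054 + 0.2231 + 0.4308}{4}
   = \frac{0.8647}{4} \approx 0.2162
\end{aligned}
```

With normalize=False the result would instead be the unaveraged sum, approximately 0.8647.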