Implementation:Microsoft Onnxruntime CUDA SoftmaxCrossEntropy

From Leeroopedia
Revision as of 15:45, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Microsoft_Onnxruntime_CUDA_SoftmaxCrossEntropy.md)


Knowledge Sources
Domains Training, CUDA_Kernels
Last Updated 2026-02-10 04:00 GMT

Overview

Concrete tool for computing dense softmax cross-entropy and sparse softmax cross-entropy with their gradients in the ONNX Runtime CUDA training framework.

Description

Implements two operator pairs for CUDA: (1) SoftmaxCrossEntropy / SoftmaxCrossEntropyGrad for dense (one-hot) labels, where the label and logit shapes are identical; (2) SparseSoftmaxCrossEntropy / SparseSoftmaxCrossEntropyGrad for sparse (integer index) labels, where the logit tensor has one more dimension than the label tensor. Each forward pass computes log-softmax over the logits, calculates the cross-entropy loss per sample, and reduces the result to a scalar using cuDNN or a custom reduction. Both pairs support Mean and Sum reduction modes. The sparse variant also accepts optional per-sample weights and computes its normalization factor from them. The dense gradient is exp(log_prob) - label; the sparse gradient is the equivalent with the label treated as a one-hot vector at the given class index. Both gradients are divided by the appropriate normalization factor. The dense operators are registered in kMSDomain and the sparse operators in kOnnxDomain.
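The dense forward and backward passes described above can be sketched on the CPU as follows. This is a minimal reference under stated assumptions, not the ONNX Runtime kernel: the names `Reduction`, `DenseResult`, `DenseSoftmaxCrossEntropy`, and `DenseSoftmaxCrossEntropyGrad` are hypothetical, tensors are flat row-major `std::vector<float>` buffers of shape [N, D], and Mean reduction is assumed to divide by N.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical flag mirroring the Mean and Sum reduction modes.
enum class Reduction { Mean, Sum };

struct DenseResult {
  float loss;                   // scalar loss after reduction
  std::vector<float> log_prob;  // row-major [N, D], same shape as logit
};

// Forward: log-softmax per row, then -sum(label * log_prob), then reduce.
DenseResult DenseSoftmaxCrossEntropy(const std::vector<float>& logit,
                                     const std::vector<float>& label,
                                     std::size_t n, std::size_t d,
                                     Reduction reduction) {
  assert(n > 0 && d > 0);
  assert(logit.size() == n * d && label.size() == n * d);
  DenseResult out{0.0f, std::vector<float>(n * d)};
  double total = 0.0;
  for (std::size_t i = 0; i < n; ++i) {
    const float* row = logit.data() + i * d;
    // Subtract the row maximum before exponentiating for numerical stability.
    const float row_max = *std::max_element(row, row + d);
    double sum_exp = 0.0;
    for (std::size_t j = 0; j < d; ++j) {
      sum_exp += std::exp(static_cast<double>(row[j] - row_max));
    }
    const double log_sum = std::log(sum_exp);
    for (std::size_t j = 0; j < d; ++j) {
      const double lp = static_cast<double>(row[j] - row_max) - log_sum;
      out.log_prob[i * d + j] = static_cast<float>(lp);
      total -= static_cast<double>(label[i * d + j]) * lp;
    }
  }
  // Assumption: Mean divides by the number of samples N.
  const double reduced =
      reduction == Reduction::Mean ? total / static_cast<double>(n) : total;
  out.loss = static_cast<float>(reduced);
  return out;
}

// Backward: d_logit = (exp(log_prob) - label) * d_loss / normalizer.
std::vector<float> DenseSoftmaxCrossEntropyGrad(
    float d_loss, const std::vector<float>& log_prob,
    const std::vector<float>& label, std::size_t n, Reduction reduction) {
  assert(n > 0 && log_prob.size() == label.size());
  const float scale =
      d_loss / (reduction == Reduction::Mean ? static_cast<float>(n) : 1.0f);
  std::vector<float> d_logit(log_prob.size());
  for (std::size_t k = 0; k < log_prob.size(); ++k) {
    d_logit[k] = (std::exp(log_prob[k]) - label[k]) * scale;
  }
  return d_logit;
}
```

Because each row of exp(log_prob) and each one-hot label row both sum to one, every row of the resulting gradient sums to zero, which makes a convenient sanity check when comparing against a GPU implementation.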

Usage

Used during training when the model employs softmax cross-entropy loss, either with dense (one-hot) labels or sparse (integer) labels in the MS and ONNX operator domains respectively.

Code Reference

Source Location

Signature

template <typename T>
Status SoftmaxCrossEntropy<T>::ComputeInternal(OpKernelContext* ctx) const;

template <typename T>
Status SoftmaxCrossEntropyGrad<T>::ComputeInternal(OpKernelContext* ctx) const;

template <typename T, typename Tin>
Status SparseSoftmaxCrossEntropy<T, Tin>::ComputeInternal(OpKernelContext* ctx) const;

template <typename T, typename Tin>
Status SparseSoftmaxCrossEntropyGrad<T, Tin>::ComputeInternal(OpKernelContext* ctx) const;

Import

#include "orttraining/training_ops/cuda/loss/softmaxcrossentropy_impl.h"

I/O Contract

Inputs

Name     Type                                      Required   Description
logit    Tensor(T)                                 Yes        Input logits with shape [N, D]
label    Tensor(T) (dense) or Tensor(Tin) (sparse) Yes        Dense one-hot labels with shape [N, D], or sparse class-index labels with shape [N]
weight   Tensor(T)                                 No         Per-sample weights with shape [N] (sparse variant only)

Outputs

Name       Type        Description
loss       Tensor(T)   Scalar loss after Mean or Sum reduction
log_prob   Tensor(T)   Log-softmax output with the same shape as logit
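To make the sparse contract concrete, the following CPU sketch computes the scalar loss from an already computed log_prob tensor, integer labels, and optional weights. It is an illustration under assumptions rather than the ONNX Runtime kernel: the function name `SparseCrossEntropyLoss` and the convention that an empty weight vector means "unweighted" are hypothetical, and Mean reduction is assumed to divide by the sum of the weights (or by N when no weights are given).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag mirroring the Mean and Sum reduction modes.
enum class Reduction { Mean, Sum };

// log_prob: row-major [N, D]; label: [N] class indices in [0, D);
// weight: [N], or empty to treat every sample weight as 1 (hypothetical convention).
float SparseCrossEntropyLoss(const std::vector<float>& log_prob,
                             const std::vector<std::int64_t>& label,
                             const std::vector<float>& weight,
                             std::size_t d, Reduction reduction) {
  const std::size_t n = label.size();
  assert(n > 0 && d > 0 && log_prob.size() == n * d);
  assert(weight.empty() || weight.size() == n);

  double total = 0.0;       // weighted sum of per-sample losses
  double weight_sum = 0.0;  // normalization factor used for Mean
  for (std::size_t i = 0; i < n; ++i) {
    assert(label[i] >= 0 && static_cast<std::size_t>(label[i]) < d);
    const double w = weight.empty() ? 1.0 : static_cast<double>(weight[i]);
    // Only the log-probability at the labeled class contributes to the loss.
    const std::size_t idx = i * d + static_cast<std::size_t>(label[i]);
    total -= w * static_cast<double>(log_prob[idx]);
    weight_sum += w;
  }
  // Assumption: Mean divides by the sum of weights; Sum leaves the total as is.
  const double reduced =
      reduction == Reduction::Mean ? total / weight_sum : total;
  return static_cast<float>(reduced);
}
```

With unit weights the Mean normalizer reduces to N, so the weighted and unweighted paths agree in that case.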

Usage Examples

// Dense cross-entropy: SoftmaxCrossEntropy<float> registered at kMSDomain v1
// Sparse cross-entropy: SparseSoftmaxCrossEntropy<float, int64_t> registered at kOnnxDomain v9
// Gradient variants follow the same pattern with "Grad" suffix
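For the sparse gradient variant, a CPU sketch of the expected arithmetic may help when validating results. It is not the ONNX Runtime kernel: the name `SparseCrossEntropyGrad` and the empty-vector convention for missing weights are hypothetical, and the normalizer for Mean reduction is assumed to be the sum of the weights (N when unweighted).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag mirroring the Mean and Sum reduction modes.
enum class Reduction { Mean, Sum };

// Returns d_logit with shape [N, D]:
//   d_logit[i][j] = w_i * (exp(log_prob[i][j]) - [j == label_i]) * d_loss / normalizer
std::vector<float> SparseCrossEntropyGrad(float d_loss,
                                          const std::vector<float>& log_prob,
                                          const std::vector<std::int64_t>& label,
                                          const std::vector<float>& weight,
                                          std::size_t d, Reduction reduction) {
  const std::size_t n = label.size();
  assert(n > 0 && d > 0 && log_prob.size() == n * d);
  assert(weight.empty() || weight.size() == n);

  // Assumption: Mean normalizes by the sum of weights (N if unweighted).
  float normalizer = 1.0f;
  if (reduction == Reduction::Mean) {
    normalizer = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
      normalizer += weight.empty() ? 1.0f : weight[i];
    }
  }

  std::vector<float> d_logit(n * d);
  for (std::size_t i = 0; i < n; ++i) {
    assert(label[i] >= 0 && static_cast<std::size_t>(label[i]) < d);
    const float w = weight.empty() ? 1.0f : weight[i];
    const std::size_t target = static_cast<std::size_t>(label[i]);
    for (std::size_t j = 0; j < d; ++j) {
      // The integer label acts as a one-hot vector with a 1 at the target class.
      const float one_hot = (j == target) ? 1.0f : 0.0f;
      d_logit[i * d + j] =
          w * (std::exp(log_prob[i * d + j]) - one_hot) * d_loss / normalizer;
    }
  }
  return d_logit;
}
```

For example, with probabilities [0.5, 0.5] and [0.25, 0.75], labels {0, 1}, weights {1, 3}, d_loss = 1, and Mean reduction, the normalizer is 4 and the gradient rows work out to [-0.125, 0.125] and [0.1875, -0.1875].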
