
Implementation:Microsoft Onnxruntime CPU OpGradients

From Leeroopedia


Knowledge Sources
Domains Training, CPU_Kernels
Last Updated 2026-02-10 04:00 GMT

Overview

Concrete tool for computing basic activation and operation gradients (Relu, Softmax, LogSoftmax, Sigmoid, Tanh, QuickGelu, LeakyRelu) on CPU in the ONNX Runtime training framework.

Description

This file implements gradient kernels for fundamental neural network operations:

ReluGrad: Passes the upstream gradient where X > 0, zeros elsewhere: dX = (X > 0) ? dY : 0.
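The formula can be sketched as a plain scalar loop. This is an illustrative stand-in only (the actual kernel evaluates the expression with Eigen over the tensor buffers), and `relu_grad` is a hypothetical helper name, not an ONNX Runtime API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch of ReluGrad: dX = (X > 0) ? dY : 0.
std::vector<float> relu_grad(const std::vector<float>& dY,
                             const std::vector<float>& X) {
  std::vector<float> dX(dY.size());
  for (std::size_t i = 0; i < dY.size(); ++i) {
    // The gradient passes through only where the forward input was positive.
    dX[i] = (X[i] > 0.0f) ? dY[i] : 0.0f;
  }
  return dX;
}
```

Note that ReluGrad reads the forward input X, not the forward output Y, although either works for plain Relu since they share a sign.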

SoftmaxGrad / SoftmaxGrad_13: Computes the Jacobian-vector product for softmax: dX = Y * (dY - sum(Y * dY)), where the sum runs over the softmax axis. The opset-13 variant additionally handles axis transposition. The LogSoftmax counterpart uses: dX = dY - exp(Y) * sum(dY).
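Both reductions run over the softmax axis. A minimal per-row sketch of the two formulas (hypothetical helper names; the kernel itself iterates over a 2-D view of the tensor, one row per softmax slice):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// SoftmaxGrad for one row: dX = Y * (dY - sum(Y * dY)).
std::vector<float> softmax_grad_row(const std::vector<float>& dY,
                                    const std::vector<float>& Y) {
  float s = 0.0f;  // sum(Y * dY) over the softmax axis
  for (std::size_t i = 0; i < Y.size(); ++i) s += Y[i] * dY[i];
  std::vector<float> dX(Y.size());
  for (std::size_t i = 0; i < Y.size(); ++i) dX[i] = Y[i] * (dY[i] - s);
  return dX;
}

// LogSoftmaxGrad for one row: dX = dY - exp(Y) * sum(dY).
std::vector<float> log_softmax_grad_row(const std::vector<float>& dY,
                                        const std::vector<float>& Y) {
  float s = 0.0f;  // sum(dY) over the softmax axis
  for (float g : dY) s += g;
  std::vector<float> dX(Y.size());
  for (std::size_t i = 0; i < Y.size(); ++i) dX[i] = dY[i] - std::exp(Y[i]) * s;
  return dX;
}
```

The softmax variant takes the forward output Y directly; the log-softmax variant recovers the probabilities via exp(Y).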

SigmoidGrad: dX = dY * Y * (1 - Y).

TanhGrad: dX = dY * (1 - Y^2).
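Both of these formulas depend only on the forward output Y, so they reduce to elementwise helpers (illustrative sketch only; the kernel applies the same math with Eigen over the flat buffers):

```cpp
#include <cassert>

// SigmoidGrad: dX = dY * Y * (1 - Y), where Y = sigmoid(X).
float sigmoid_grad(float dY, float Y) { return dY * Y * (1.0f - Y); }

// TanhGrad: dX = dY * (1 - Y^2), where Y = tanh(X).
float tanh_grad(float dY, float Y) { return dY * (1.0f - Y * Y); }
```

Reusing Y avoids recomputing the activation in the backward pass, which is why these kernels take the forward output rather than the forward input.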

QuickGeluGrad: With s = sigmoid(alpha * X), computes: dX = dY * s * (1 + alpha * X * (1 - s)). The sigmoid is evaluated efficiently with MlasComputeLogistic, and the work is parallelized across the thread pool.
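The formula follows from differentiating the forward definition Y = X * sigmoid(alpha * X) with the product and chain rules. A scalar sketch, with std::exp standing in for MlasComputeLogistic and alpha passed explicitly rather than read from the node attribute:

```cpp
#include <cassert>
#include <cmath>

// Illustrative QuickGeluGrad for one element.
// Forward: Y = X * sigmoid(alpha * X).
// Backward: dX = dY * s * (1 + alpha * X * (1 - s)), s = sigmoid(alpha * X).
float quick_gelu_grad(float dY, float X, float alpha) {
  const float s = 1.0f / (1.0f + std::exp(-alpha * X));  // logistic sigmoid
  return dY * s * (1.0f + alpha * X * (1.0f - s));
}
```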

LeakyReluGrad: dX = (Y > 0) ? dY : alpha * dY.
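A one-line sketch of the branch (hypothetical helper; alpha is a node attribute in the real kernel). Note the test is on the forward output Y, which for alpha > 0 has the same sign as the forward input X:

```cpp
#include <cassert>

// LeakyReluGrad: dX = (Y > 0) ? dY : alpha * dY.
float leaky_relu_grad(float dY, float Y, float alpha) {
  return (Y > 0.0f) ? dY : alpha * dY;
}
```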

All kernels are registered under kMSDomain opset 1 for float type.

Usage

These kernels are invoked during the backward pass whenever their corresponding activation or operation nodes appear in the training graph; together they cover the activation gradients most commonly needed in deep learning workloads.

Code Reference

Source Location

Signature

template <typename T>
Status ReluGrad<T>::Compute(OpKernelContext* context) const;

template <typename T>
Status SoftmaxGrad<T>::Compute(OpKernelContext* context) const;

template <typename T>
Status SigmoidGrad<T>::Compute(OpKernelContext* context) const;

template <typename T>
Status TanhGrad<T>::Compute(OpKernelContext* context) const;

template <typename T>
Status QuickGeluGrad<T>::Compute(OpKernelContext* context) const;

template <typename T>
Status LeakyReluGrad<T>::Compute(OpKernelContext* context) const;

Import

#include "orttraining/orttraining/training_ops/cpu/op_gradients.h"

I/O Contract

Inputs (Common Pattern)

Name Type Required Description
dY Tensor(float) Yes Upstream gradient
X_or_Y Tensor(float) Yes Forward input (ReluGrad, QuickGeluGrad) or forward output (others)

Outputs

Name Type Description
dX Tensor(float) Gradient w.r.t. input

Usage Examples

ONNX_OPERATOR_KERNEL_EX(
    ReluGrad, kMSDomain, 1, kCpuExecutionProvider,
    KernelDefBuilder().TypeConstraint("T", DataTypeImpl::GetTensorType<float>()),
    ReluGrad<float>);

ONNX_OPERATOR_KERNEL_EX(
    SoftmaxGrad, kMSDomain, 1, kCpuExecutionProvider,
    KernelDefBuilder().TypeConstraint("T", DataTypeImpl::GetTensorType<float>()),
    SoftmaxGrad<float>);

ONNX_OPERATOR_KERNEL_EX(
    SigmoidGrad, kMSDomain, 1, kCpuExecutionProvider,
    KernelDefBuilder().TypeConstraint("T", DataTypeImpl::GetTensorType<float>()),
    SigmoidGrad<float>);
