Implementation:Microsoft Onnxruntime CPU ReductionOps
| Knowledge Sources | |
|---|---|
| Domains | Training, CPU_Kernels |
| Last Updated | 2026-02-10 04:00 GMT |
Overview
CPU implementation of the ReduceSumTraining kernel in the ONNX Runtime training framework.
Description
This file implements the ReduceSumTraining kernel, a training-specific variant of ReduceSum that supports the noop_with_empty_axes attribute. This attribute controls the behavior when the axes list is empty: if true, the operation is a no-op (the output equals the input); if false, all axes are reduced. The kernel delegates the actual computation to CommonReduce1Loop<ReduceAggregatorSum<T>>. It supports the float, double, int32_t, and int64_t element types and respects the keepdims attribute.
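The interaction between an empty axes list, noop_with_empty_axes, and keepdims can be made concrete with a small standalone sketch. This is not the ONNX Runtime implementation, which delegates to CommonReduce1Loop; the names `Tensor2D` and `ReduceSumSketch` are hypothetical, and the sketch assumes a 2-D row-major float tensor to stay short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2-D row-major tensor, used only for this illustration.
struct Tensor2D {
  std::vector<int64_t> shape;  // output rank may drop below 2 when keepdims is false
  std::vector<float> data;
};

// Sketch of sum-reduction semantics with noop_with_empty_axes on a 2-D input.
// `axes` may contain only 0 and/or 1 (an assumption of this sketch).
Tensor2D ReduceSumSketch(const Tensor2D& in, const std::vector<int64_t>& axes,
                         bool keepdims, bool noop_with_empty_axes) {
  assert(in.shape.size() == 2);
  const int64_t rows = in.shape[0];
  const int64_t cols = in.shape[1];
  assert(static_cast<int64_t>(in.data.size()) == rows * cols);

  // Empty axes + noop flag: the output is a copy of the input.
  if (axes.empty() && noop_with_empty_axes) return in;

  // Empty axes without the noop flag: every axis is reduced.
  bool reduce[2] = {axes.empty(), axes.empty()};
  for (int64_t a : axes) {
    assert(a == 0 || a == 1);
    reduce[a] = true;
  }

  const int64_t out_rows = reduce[0] ? 1 : rows;
  const int64_t out_cols = reduce[1] ? 1 : cols;
  Tensor2D out;
  out.data.assign(static_cast<std::size_t>(out_rows * out_cols), 0.0f);

  // Accumulate each input element into the output cell it collapses to.
  for (int64_t r = 0; r < rows; ++r) {
    for (int64_t c = 0; c < cols; ++c) {
      const int64_t orow = reduce[0] ? 0 : r;
      const int64_t ocol = reduce[1] ? 0 : c;
      out.data[static_cast<std::size_t>(orow * out_cols + ocol)] +=
          in.data[static_cast<std::size_t>(r * cols + c)];
    }
  }

  // keepdims retains reduced axes with size 1; otherwise they are dropped.
  if (keepdims || !reduce[0]) out.shape.push_back(out_rows);
  if (keepdims || !reduce[1]) out.shape.push_back(out_cols);
  return out;
}
```

For a 2x3 input, an empty axes list with noop_with_empty_axes set returns the input unchanged, while the same call with the flag cleared collapses the tensor to a single sum.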
Usage
This kernel is invoked during training when the graph needs a ReduceSum operation with noop_with_empty_axes behavior. It is used for gradient reduction and aggregation in the training graph.
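As an illustration of gradient reduction, consider a bias of shape [c] that was broadcast-added to an activation of shape [n, c]: its gradient is the upstream gradient summed over axis 0. The sketch below shows that arithmetic only. `BiasGradSketch` is a hypothetical helper, not an ONNX Runtime function, and the example does not claim which gradient graphs actually emit ReduceSumTraining.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums an [n, c] row-major upstream gradient over axis 0,
// producing the [c] gradient of a bias that was broadcast across the batch.
std::vector<float> BiasGradSketch(const std::vector<float>& dy,
                                  std::size_t n, std::size_t c) {
  assert(dy.size() == n * c);
  std::vector<float> db(c, 0.0f);
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t j = 0; j < c; ++j) {
      db[j] += dy[i * c + j];  // each batch row contributes to every bias entry
    }
  }
  return db;
}
```

This corresponds to a sum reduction with axes = [0] and keepdims = 0.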
Code Reference
Source Location
- Repository: Microsoft_Onnxruntime
- File: orttraining/orttraining/training_ops/cpu/reduction/reduction_ops.cc
- Lines: 1-37
Signature
template <typename T>
Status ReduceSumTraining<T>::Compute(OpKernelContext* ctx) const;
Import
#include "orttraining/orttraining/training_ops/cpu/reduction/reduction_ops.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| data | Tensor(T) | Yes | Input tensor to reduce |
Outputs
| Name | Type | Description |
|---|---|---|
| reduced | Tensor(T) | Reduced output tensor (with or without keepdims) |
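The shape of the reduced output for an input of arbitrary rank can be sketched as follows. `ReducedShapeSketch` is a hypothetical helper written for this page, not part of ONNX Runtime; it assumes the ONNX convention that a negative axis counts from the last dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: computes the shape of the reduced output.
std::vector<int64_t> ReducedShapeSketch(const std::vector<int64_t>& in_shape,
                                        const std::vector<int64_t>& axes,
                                        bool keepdims,
                                        bool noop_with_empty_axes) {
  // Empty axes + noop flag: the shape is unchanged.
  if (axes.empty() && noop_with_empty_axes) return in_shape;

  const int64_t rank = static_cast<int64_t>(in_shape.size());
  // With empty axes and no noop flag, every axis is marked for reduction.
  std::vector<bool> reduce(in_shape.size(), axes.empty());
  for (int64_t a : axes) {
    if (a < 0) a += rank;  // assumed convention: negative axes count from the back
    assert(a >= 0 && a < rank);
    reduce[static_cast<std::size_t>(a)] = true;
  }

  std::vector<int64_t> out;
  for (std::size_t i = 0; i < in_shape.size(); ++i) {
    if (!reduce[i]) {
      out.push_back(in_shape[i]);  // untouched axis keeps its extent
    } else if (keepdims) {
      out.push_back(1);            // reduced axis kept with size 1
    }                              // otherwise the reduced axis is dropped
  }
  return out;
}
```

For an input of shape [2, 3, 4], reducing axis 1 with keepdims gives [2, 1, 4], and reducing axes -1 and 0 without keepdims gives [3].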
Usage Examples
ONNX_OPERATOR_TYPED_KERNEL_EX(
    ReduceSumTraining, kMSDomain, 1, float, kCpuExecutionProvider,
    KernelDefBuilder()
        .TypeConstraint("T", DataTypeImpl::GetTensorType<float>()),
    ReduceSumTraining<float>);

ONNX_OPERATOR_TYPED_KERNEL_EX(
    ReduceSumTraining, kMSDomain, 1, double, kCpuExecutionProvider,
    KernelDefBuilder()
        .TypeConstraint("T", DataTypeImpl::GetTensorType<double>()),
    ReduceSumTraining<double>);