Implementation: Microsoft Onnxruntime CPU ReductionAll
| Knowledge Sources | |
|---|---|
| Domains | Training, CPU_Kernels |
| Last Updated | 2026-02-10 04:00 GMT |
Overview
CPU kernel for computing a single L2 norm over all elements of all input tensors in the ONNX Runtime training framework.
Description
This file implements the ReduceAllL2 kernel, which computes the L2 norm (Euclidean norm) across all elements of all input tensors combined. It accepts a variable number of input tensors, computes the sum of squared elements across all of them using ReduceAggregatorSumSquare, and then takes the square root of the total. The result is a single scalar value. This is useful for computing gradient norms for monitoring or clipping.
Usage
This kernel is invoked during training to compute the global L2 norm of gradients across all model parameters. It is typically used for gradient norm monitoring, gradient clipping thresholds, or loss scaling decisions.
Code Reference
Source Location
- Repository: Microsoft_Onnxruntime
- File: orttraining/orttraining/training_ops/cpu/reduction/reduction_all.cc
- Lines: 1-50
Signature
template <typename TIn, typename TOut>
Status ReduceAllL2<TIn, TOut>::Compute(OpKernelContext* ctx) const;
Import
#include "orttraining/orttraining/training_ops/cpu/reduction/reduction_all.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| tensors (variadic) | Tensor(float) | Yes | Variable number of input tensors |
Outputs
| Name | Type | Description |
|---|---|---|
| l2_norm | Tensor(float) | Scalar L2 norm across all input elements |
Usage Examples
ONNX_OPERATOR_TYPED_KERNEL_EX(
ReduceAllL2, kMSDomain, 1, float_float, kCpuExecutionProvider,
KernelDefBuilder()
.TypeConstraint("TIn", DataTypeImpl::GetTensorType<float>())
.TypeConstraint("TOut", DataTypeImpl::GetTensorType<float>()),
ReduceAllL2<float, float>);