
Implementation:Microsoft Onnxruntime CPU ReductionAll

From Leeroopedia


Knowledge Sources
Domains Training, CPU_Kernels
Last Updated 2026-02-10 04:00 GMT

Overview

A concrete CPU kernel that computes the L2 norm across the combined elements of all input tensors in the ONNX Runtime training framework.

Description

This file implements the ReduceAllL2 kernel, which computes the L2 norm (Euclidean norm) across all elements of all input tensors combined. It accepts a variable number of input tensors, computes the sum of squared elements across all of them using ReduceAggregatorSumSquare, and then takes the square root of the total. The result is a single scalar value. This is useful for computing gradient norms for monitoring or clipping.

Usage

This kernel is invoked during training to compute the global L2 norm of gradients across all model parameters. It is typically used for gradient norm monitoring, gradient clipping thresholds, or loss scaling decisions.

Code Reference

Source Location

Signature

template <typename TIn, typename TOut>
Status ReduceAllL2<TIn, TOut>::Compute(OpKernelContext* ctx) const;

Import

#include "orttraining/orttraining/training_ops/cpu/reduction/reduction_all.h"

I/O Contract

Inputs

Name                 Type           Required  Description
tensors (variadic)   Tensor(float)  Yes       Variable number of input tensors

Outputs

Name     Type           Description
l2_norm  Tensor(float)  Scalar L2 norm across all input elements

Usage Examples

ONNX_OPERATOR_TYPED_KERNEL_EX(
    ReduceAllL2, kMSDomain, 1, float_float, kCpuExecutionProvider,
    KernelDefBuilder()
        .TypeConstraint("TIn", DataTypeImpl::GetTensorType<float>())
        .TypeConstraint("TOut", DataTypeImpl::GetTensorType<float>()),
    ReduceAllL2<float, float>);
