
Implementation:Microsoft Onnxruntime CPU ReductionAll

From Leeroopedia


Knowledge Sources
Domains Training, CPU_Kernels
Last Updated 2026-02-10 04:00 GMT

Overview

A concrete CPU kernel that computes the L2 norm across the combined elements of all input tensors in the ONNX Runtime training framework.

Description

This file implements the ReduceAllL2 kernel, which computes the L2 norm (Euclidean norm) across all elements of all input tensors combined. It accepts a variable number of input tensors, computes the sum of squared elements across all of them using ReduceAggregatorSumSquare, and then takes the square root of the total. The result is a single scalar value. This is useful for computing gradient norms for monitoring or clipping.

Usage

This kernel is invoked during training to compute the global L2 norm of gradients across all model parameters. It is typically used for gradient norm monitoring, gradient clipping thresholds, or loss scaling decisions.

Code Reference

Source Location

Signature

template <typename TIn, typename TOut>
Status ReduceAllL2<TIn, TOut>::Compute(OpKernelContext* ctx) const;

Import

#include "orttraining/orttraining/training_ops/cpu/reduction/reduction_all.h"

I/O Contract

Inputs

Name                 Type           Required  Description
tensors (variadic)   Tensor(float)  Yes       Variable number of input tensors

Outputs

Name     Type           Description
l2_norm  Tensor(float)  Scalar L2 norm across all input elements

Usage Examples

ONNX_OPERATOR_TYPED_KERNEL_EX(
    ReduceAllL2, kMSDomain, 1, float_float, kCpuExecutionProvider,
    KernelDefBuilder()
        .TypeConstraint("TIn", DataTypeImpl::GetTensorType<float>())
        .TypeConstraint("TOut", DataTypeImpl::GetTensorType<float>()),
    ReduceAllL2<float, float>);
