Implementation:LaurentMazare Tch rs Layer Norm

From Leeroopedia


Knowledge Sources
Domains: Neural Networks, Normalization, Deep Learning
Last Updated: 2026-02-08 00:00 GMT

Overview

The layer_norm module implements Layer Normalization, which normalizes the input across the feature dimensions specified by normalized_shape. It is commonly used in Transformer architectures and recurrent networks.

Description

LayerNorm computes the mean and variance over the last D dimensions of the input tensor (where D is the length of normalized_shape), then normalizes each element by subtracting that mean and dividing by the square root of the variance plus eps. Unlike Batch Normalization, Layer Normalization operates on each sample independently and does not maintain running statistics.
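To make the computation concrete, here is a minimal standard-library sketch of the normalization step for the simplest case, D = 1, on a row-major buffer. The function name layer_norm_rows and the flat-slice layout are assumptions made for illustration; this is not how tch implements the operation, which is delegated to libtorch.

```rust
/// Normalizes each contiguous row of `feature_len` values independently.
/// Hypothetical helper for illustration; not part of tch.
fn layer_norm_rows(data: &[f32], feature_len: usize, eps: f32) -> Vec<f32> {
    assert!(feature_len > 0 && data.len() % feature_len == 0);
    let n = feature_len as f32;
    let mut out = Vec::with_capacity(data.len());
    for row in data.chunks(feature_len) {
        // Statistics are computed per row, so samples never influence each other.
        let mean = row.iter().sum::<f32>() / n;
        // Biased (population) variance: divide by n, not n - 1.
        let var = row.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
        let inv_std = 1.0 / (var + eps).sqrt();
        out.extend(row.iter().map(|x| (x - mean) * inv_std));
    }
    out
}

fn main() {
    // Two samples with four features each; the second is the first scaled by 10.
    let data = [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0];
    let out = layer_norm_rows(&data, 4, 1e-5);
    println!("{:?}", out);
}
```

Because each row is centered and rescaled by its own statistics, the two rows in this example normalize to nearly identical values despite their different magnitudes.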

The LayerNormConfig struct holds: cudnn_enabled (defaults to true), eps (epsilon for numerical stability, defaults to 1e-5), elementwise_affine (whether to apply learnable scale and shift, defaults to true), ws_init (weight initialization, defaults to Const(1.)), and bs_init (bias initialization, defaults to Const(0.)).

The LayerNorm struct stores the config, optional weight and bias tensors (allocated only when elementwise_affine is true), and the normalized_shape as a Vec<i64>. The constructor function layer_norm takes a variable store path, the normalized shape, and the config. The layer implements Module::forward by delegating to Tensor::layer_norm.
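As a minimal sketch of how the config and the struct fit together, the snippet below overrides two config fields with struct update syntax and checks that no weight or bias tensor is allocated when elementwise_affine is false. The path name "ln_no_affine" and the eps value 1e-6 are illustrative choices, not values taken from the library.

```rust
use tch::nn::{self, LayerNormConfig};
use tch::Device;

fn main() {
    let vs = nn::VarStore::new(Device::Cpu);
    let root = vs.root();

    // Override only the fields of interest; the rest keep their defaults.
    let config = LayerNormConfig {
        eps: 1e-6,                 // illustrative value
        elementwise_affine: false, // no learnable scale or shift
        ..Default::default()
    };

    // "ln_no_affine" is an arbitrary name chosen for this example.
    let ln = nn::layer_norm(&root / "ln_no_affine", vec![512], config);

    // The weight and bias fields are public Options; both are None here.
    assert!(ln.ws.is_none() && ln.bs.is_none());
    assert_eq!(ln.normalized_shape, vec![512]);
}
```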

Usage

Use LayerNorm in Transformer models (applied around the attention and feed-forward sublayers: after each sublayer in post-norm designs, before it in pre-norm designs), recurrent networks, and any architecture where batch-independent normalization is desired.

Code Reference

Source Location

Signature

#[derive(Debug, Clone, Copy)]
pub struct LayerNormConfig {
    pub cudnn_enabled: bool,
    pub eps: f64,
    pub elementwise_affine: bool,
    pub ws_init: super::Init,
    pub bs_init: super::Init,
}

#[derive(Debug)]
pub struct LayerNorm {
    config: LayerNormConfig,
    pub ws: Option<Tensor>,
    pub bs: Option<Tensor>,
    pub normalized_shape: Vec<i64>,
}

pub fn layer_norm<'a, T: Borrow<super::Path<'a>>>(
    vs: T,
    normalized_shape: Vec<i64>,
    config: LayerNormConfig,
) -> LayerNorm;

Import

use tch::nn::{layer_norm, LayerNormConfig};

I/O Contract

Parameter | Type | Description
vs | impl Borrow<Path> | Variable store path for parameter allocation
normalized_shape | Vec<i64> | Shape of the dimensions to normalize over (typically the last D dims)
config | LayerNormConfig | Configuration struct

Config Field | Default Value | Description
cudnn_enabled | true | Enable cuDNN acceleration
eps | 1e-5 | Epsilon for numerical stability
elementwise_affine | true | Learn per-element scale (weight) and shift (bias)
ws_init | Const(1.) | Weight initialization strategy
bs_init | Const(0.) | Bias initialization strategy

Forward Input | Forward Output
&Tensor whose trailing dimensions match normalized_shape | Tensor of same shape, layer-normalized
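
The forward-input requirement can be checked ahead of time with a small helper. The sketch below is hypothetical: trailing_dims_match is not part of tch, and it only compares shapes given as plain slices, such as the Vec<i64> returned by Tensor::size.

```rust
/// Returns true when the last `normalized_shape.len()` dimensions of
/// `input_shape` equal `normalized_shape`.
/// Hypothetical helper for illustration; not part of tch.
fn trailing_dims_match(input_shape: &[i64], normalized_shape: &[i64]) -> bool {
    input_shape.ends_with(normalized_shape)
}

fn main() {
    // A [batch, sequence, features] input normalized over the last dimension.
    assert!(trailing_dims_match(&[8, 32, 512], &[512]));
    // Normalizing over the last two dimensions is also valid.
    assert!(trailing_dims_match(&[8, 32, 512], &[32, 512]));
    // A mismatched feature size is rejected.
    assert!(!trailing_dims_match(&[8, 32, 512], &[256]));
}
```

Note that an empty normalized_shape trivially matches any input here, so a caller that wants to rule that case out would need an extra check.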

Usage Examples

use tch::{nn, nn::Module, Device, Kind, Tensor};

fn main() {
    let vs = nn::VarStore::new(Device::Cpu);
    let root = vs.root();

    // Normalize over the last dimension of size 512 (typical in Transformers)
    let ln = nn::layer_norm(&root / "ln", vec![512], Default::default());

    // Forward pass on a [batch, sequence, features] input
    let input = Tensor::randn([8, 32, 512], (Kind::Float, Device::Cpu));
    let output = ln.forward(&input);

    // The output keeps the input shape; it is normalized over the last dimension
    assert_eq!(output.size(), vec![8, 32, 512]);
}

Related Pages
