Implementation:LaurentMazare Tch rs Layer Norm

Knowledge Sources	LaurentMazare_Tch_rs
Domains	Neural Networks, Normalization, Deep Learning
Last Updated	2026-02-08 00:00 GMT

Overview

The layer_norm module implements Layer Normalization, which normalizes each sample across the trailing feature dimensions given by normalized_shape. It is commonly used in Transformer architectures and recurrent networks.

Description

LayerNorm computes the mean and variance over the last D dimensions of the input tensor (where D is the length of normalized_shape), then subtracts the mean from each element and divides by the square root of the variance plus eps. Unlike Batch Normalization, Layer Normalization operates on each sample independently and does not maintain running statistics.
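The arithmetic described above can be sketched in plain Rust without any tensor library. This is an illustration only, not the tch or libtorch implementation: the function name layer_norm_last_dim and the flattened row-major buffer are assumptions made for the example, D is fixed to 1 (only the last dimension is normalized), and the variance is the biased estimate (divided by the element count), following the usual Layer Normalization definition.

```rust
/// Normalizes each row of `x` (rows of length `d`) to roughly zero mean and unit variance.
/// `x` is a flattened row-major buffer standing in for a tensor of shape [n, d].
/// Hypothetical helper for illustration; not part of tch.
fn layer_norm_last_dim(x: &[f64], d: usize, eps: f64) -> Vec<f64> {
    assert!(d > 0 && x.len() % d == 0, "length must be a multiple of d");
    let mut out = Vec::with_capacity(x.len());
    for row in x.chunks(d) {
        let mean = row.iter().sum::<f64>() / d as f64;
        // Biased (population) variance: divide by d, not d - 1.
        let var = row.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / d as f64;
        // eps keeps the division finite when a row is constant.
        let inv_std = 1.0 / (var + eps).sqrt();
        out.extend(row.iter().map(|v| (v - mean) * inv_std));
    }
    out
}

fn main() {
    // Two samples of four features each; every sample is normalized on its own.
    let x = [1.0, 2.0, 3.0, 4.0, 10.0, 10.0, 10.0, 10.0];
    let y = layer_norm_last_dim(&x, 4, 1e-5);
    println!("{:?}", y);
}
```

Each row is processed using only its own statistics, which is why no running mean or variance needs to be stored between calls.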

The LayerNormConfig struct holds: cudnn_enabled (whether cuDNN acceleration is enabled, defaults to true), eps (epsilon for numerical stability, defaults to 1e-5), elementwise_affine (whether to apply learnable scale and shift, defaults to true), ws_init (weight initialization, defaults to Const(1.)), and bs_init (bias initialization, defaults to Const(0.)).
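Because every field of LayerNormConfig is public, individual defaults can be overridden with struct update syntax. The fragment below is a sketch that assumes the tch crate and a working libtorch installation; the path name "ln_no_affine" and the values chosen for eps and elementwise_affine are arbitrary illustrations.

```rust
use tch::{nn, Device};

fn main() {
    let vs = nn::VarStore::new(Device::Cpu);
    let root = vs.root();

    // Override two fields and keep the remaining defaults.
    let config = nn::LayerNormConfig {
        eps: 1e-6,
        elementwise_affine: false,
        ..Default::default()
    };

    // "ln_no_affine" is an arbitrary name chosen for this example.
    let ln = nn::layer_norm(&root / "ln_no_affine", vec![512], config);

    // With elementwise_affine disabled, no weight or bias is allocated.
    assert!(ln.ws.is_none() && ln.bs.is_none());
}
```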

The LayerNorm struct stores the config, optional weight and bias tensors (allocated only when elementwise_affine is true), and the normalized_shape as a Vec<i64>. The constructor function layer_norm takes a variable store path, the normalized shape, and the config. The layer implements Module::forward by delegating to Tensor::layer_norm.
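The role of the optional weight and bias can be illustrated with a small standalone sketch. It is not the tch implementation: the function apply_affine is hypothetical, it works on a flattened buffer of already-normalized values, and it assumes the parameters have the normalized shape, so they repeat across the leading dimensions.

```rust
/// Applies an optional per-element scale and shift to already-normalized values.
/// `ws` and `bs` mirror the optional weight and bias; `None` leaves values unchanged.
/// Hypothetical helper for illustration; not part of tch.
fn apply_affine(normalized: &[f64], ws: Option<&[f64]>, bs: Option<&[f64]>) -> Vec<f64> {
    normalized
        .iter()
        .enumerate()
        .map(|(i, &v)| {
            // Parameters have the normalized shape, so they repeat across leading dims.
            let scaled = ws.map_or(v, |w| v * w[i % w.len()]);
            bs.map_or(scaled, |b| scaled + b[i % b.len()])
        })
        .collect()
}

fn main() {
    let x_hat = [-1.0, 0.0, 1.0, -1.0, 0.0, 1.0]; // two rows of length 3
    let ws = [2.0, 2.0, 2.0];
    let bs = [0.5, 0.5, 0.5];
    println!("{:?}", apply_affine(&x_hat, Some(&ws[..]), Some(&bs[..])));
    println!("{:?}", apply_affine(&x_hat, None, None));
}
```

Modeling the parameters as Option values matches the struct layout described above: when elementwise_affine is false there is nothing to allocate, and the forward pass reduces to plain normalization.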

Usage

Use LayerNorm in Transformer models (applied before or after the attention and feed-forward sublayers, depending on the variant), in recurrent networks, and in any architecture where batch-independent normalization is desired.

Code Reference

Source Location

Repository: LaurentMazare_Tch_rs
File: src/nn/layer_norm.rs

Signature

#[derive(Debug, Clone, Copy)]
pub struct LayerNormConfig {
    pub cudnn_enabled: bool,
    pub eps: f64,
    pub elementwise_affine: bool,
    pub ws_init: super::Init,
    pub bs_init: super::Init,
}

#[derive(Debug)]
pub struct LayerNorm {
    config: LayerNormConfig,
    pub ws: Option<Tensor>,
    pub bs: Option<Tensor>,
    pub normalized_shape: Vec<i64>,
}

pub fn layer_norm<'a, T: Borrow<super::Path<'a>>>(
    vs: T,
    normalized_shape: Vec<i64>,
    config: LayerNormConfig,
) -> LayerNorm;

Import

use tch::nn::{layer_norm, LayerNormConfig};

I/O Contract

Parameter	Type	Description
vs	T: Borrow<Path<'a>>	Variable store path for parameter allocation
normalized_shape	Vec<i64>	Sizes of the trailing dimensions to normalize over; must match the last D dimensions of the input
config	LayerNormConfig	Configuration struct

Config Field	Default Value	Description
cudnn_enabled	true	Enable cuDNN acceleration
eps	1e-5	Epsilon for numerical stability
elementwise_affine	true	Learn per-element scale (weight) and shift (bias)
ws_init	Const(1.)	Weight initialization strategy
bs_init	Const(0.)	Bias initialization strategy

Forward Input	Forward Output
&Tensor whose trailing dimensions match normalized_shape	Tensor of same shape, layer-normalized

Usage Examples

use tch::{nn, nn::Module, Device, Kind, Tensor};

fn main() {
    let vs = nn::VarStore::new(Device::Cpu);
    let root = vs.root();

    // Normalize over the last dimension of size 512 (typical in Transformers)
    let ln = nn::layer_norm(&root / "ln", vec![512], Default::default());

    // Forward pass on a [batch, sequence, features] tensor
    let input = Tensor::randn([8, 32, 512], (Kind::Float, Device::Cpu));
    let output = ln.forward(&input);

    // The output keeps the input shape; only the last dimension is normalized
    assert_eq!(output.size(), vec![8, 32, 512]);
}

Related Pages

Principle:LaurentMazare_Tch_rs_Layer_Normalization
