Implementation:LaurentMazare Tch rs Layer Norm
| Knowledge Sources | |
|---|---|
| Domains | Neural Networks, Normalization, Deep Learning |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
The layer_norm module implements Layer Normalization, which normalizes each sample across the feature dimensions specified by normalized_shape. It is commonly used in Transformer architectures and recurrent networks.
Description
LayerNorm computes the mean and variance over the last D dimensions of the input tensor (where D is the length of normalized_shape) and normalizes each element by subtracting the mean and dividing by sqrt(variance + eps). Unlike Batch Normalization, Layer Normalization operates on each sample independently and does not maintain running statistics.
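The computation described above can be sketched in plain Rust without any tch dependency. This is only an illustration under stated assumptions: the function name `layer_norm_groups` is hypothetical, the input is a flat slice in which each run of `d` values is one sample's normalized dimensions (`d` being the product of the sizes in normalized_shape), and the variance is the biased one (divided by `d`).

```rust
/// Normalizes each contiguous group of `d` values to zero mean and
/// (approximately) unit variance. Hypothetical helper, not part of tch.
fn layer_norm_groups(x: &[f32], d: usize, eps: f32) -> Vec<f32> {
    assert!(d > 0 && x.len() % d == 0, "input length must be a multiple of d");
    let mut out = Vec::with_capacity(x.len());
    for group in x.chunks(d) {
        let mean = group.iter().sum::<f32>() / d as f32;
        // Biased variance: divide by d, not d - 1.
        let var = group.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / d as f32;
        let denom = (var + eps).sqrt();
        out.extend(group.iter().map(|v| (v - mean) / denom));
    }
    out
}

fn main() {
    // Two samples with three features each; every row is normalized on its own,
    // so no statistics are shared across the batch.
    let x = [1.0_f32, 2.0, 3.0, 10.0, 20.0, 30.0];
    let y = layer_norm_groups(&x, 3, 1e-5);
    println!("{:?}", y);
}
```

Because each group uses only its own mean and variance, the two rows above come out nearly identical even though the second is ten times larger in scale.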
The LayerNormConfig struct holds: cudnn_enabled (defaults to true), eps (epsilon for numerical stability, defaults to 1e-5), elementwise_affine (whether to apply learnable scale and shift, defaults to true), ws_init (weight initialization, defaults to Const(1.)), and bs_init (bias initialization, defaults to Const(0.)).
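To make the defaults concrete, here is a small sketch that mirrors the fields listed above using hypothetical stand-in types (`SketchInit`, `SketchLayerNormConfig`). These are not the tch types; they only reproduce the documented defaults so the override pattern can be run without libtorch. Since the real LayerNormConfig has public fields and supports `Default::default()`, the same struct-update syntax should apply to it, but treat that as an assumption to check against the crate version in use.

```rust
// Hypothetical stand-ins, NOT the tch types: they only mirror the fields
// and default values listed in the text.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SketchInit {
    Const(f64),
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct SketchLayerNormConfig {
    cudnn_enabled: bool,
    eps: f64,
    elementwise_affine: bool,
    ws_init: SketchInit,
    bs_init: SketchInit,
}

impl Default for SketchLayerNormConfig {
    fn default() -> Self {
        Self {
            cudnn_enabled: true,
            eps: 1e-5,
            elementwise_affine: true,
            ws_init: SketchInit::Const(1.),
            bs_init: SketchInit::Const(0.),
        }
    }
}

fn main() {
    // Override two fields and keep the documented defaults for the rest.
    let cfg = SketchLayerNormConfig {
        eps: 1e-12,
        elementwise_affine: false,
        ..Default::default()
    };
    println!("{:?}", cfg);
}
```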
The LayerNorm struct stores the config, optional weight and bias tensors (allocated only when elementwise_affine is true), and the normalized_shape as a Vec<i64>. The constructor function layer_norm takes a variable store path, the normalized shape, and the config. The layer implements Module::forward by delegating to Tensor::layer_norm.
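The role of the optional weight and bias can be illustrated with a plain-Rust stand-in. `SketchLayerNorm` below is hypothetical: it uses `Vec<f32>` instead of Tensor, takes a single length `d` in place of normalized_shape, and hard-codes the initial values 1 and 0 rather than reading them from a config. It is a sketch of the idea, not the crate's implementation, which delegates to Tensor::layer_norm.

```rust
// Hypothetical stand-in for the layer, using Vec<f32> instead of Tensor.
struct SketchLayerNorm {
    eps: f32,
    ws: Option<Vec<f32>>, // scale, allocated only with elementwise_affine
    bs: Option<Vec<f32>>, // shift, allocated only with elementwise_affine
    d: usize,             // number of values normalized together
}

impl SketchLayerNorm {
    fn new(d: usize, eps: f32, elementwise_affine: bool) -> Self {
        assert!(d > 0, "d must be positive");
        let (ws, bs) = if elementwise_affine {
            (Some(vec![1.0; d]), Some(vec![0.0; d]))
        } else {
            (None, None)
        };
        Self { eps, ws, bs, d }
    }

    fn forward(&self, x: &[f32]) -> Vec<f32> {
        assert!(x.len() % self.d == 0, "input length must be a multiple of d");
        let n = self.d as f32;
        let mut out = Vec::with_capacity(x.len());
        for group in x.chunks(self.d) {
            let mean = group.iter().sum::<f32>() / n;
            let var = group.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
            let denom = (var + self.eps).sqrt();
            for (i, v) in group.iter().enumerate() {
                let mut y = (v - mean) / denom;
                // Scale and shift are applied only when the parameters exist.
                if let Some(ws) = &self.ws {
                    y *= ws[i];
                }
                if let Some(bs) = &self.bs {
                    y += bs[i];
                }
                out.push(y);
            }
        }
        out
    }
}

fn main() {
    let mut ln = SketchLayerNorm::new(2, 1e-5, true);
    // Pretend training moved the parameters away from their initial values.
    ln.ws = Some(vec![2.0, 2.0]);
    ln.bs = Some(vec![1.0, 1.0]);
    println!("{:?}", ln.forward(&[0.0, 4.0]));
}
```

With `elementwise_affine` set to false, both options stay `None` and the forward pass returns the plain normalized values.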
Usage
Use LayerNorm in Transformer models (applied after the attention and feed-forward sublayers in the original post-norm design, or before them in pre-norm variants), recurrent networks, and any architecture where batch-independent normalization is desired.
Code Reference
Source Location
- Repository: LaurentMazare_Tch_rs
- File: src/nn/layer_norm.rs
Signature
#[derive(Debug, Clone, Copy)]
pub struct LayerNormConfig {
    pub cudnn_enabled: bool,
    pub eps: f64,
    pub elementwise_affine: bool,
    pub ws_init: super::Init,
    pub bs_init: super::Init,
}
#[derive(Debug)]
pub struct LayerNorm {
    config: LayerNormConfig,
    pub ws: Option<Tensor>,
    pub bs: Option<Tensor>,
    pub normalized_shape: Vec<i64>,
}
pub fn layer_norm<'a, T: Borrow<super::Path<'a>>>(
    vs: T,
    normalized_shape: Vec<i64>,
    config: LayerNormConfig,
) -> LayerNorm;
Import
use tch::nn::{layer_norm, LayerNormConfig};
I/O Contract
| Parameter | Type | Description |
|---|---|---|
| vs | impl Borrow<Path> | Variable store path for parameter allocation |
| normalized_shape | Vec<i64> | Sizes of the trailing D dimensions of the input to normalize over |
| config | LayerNormConfig | Configuration struct |
| Config Field | Default Value | Description |
|---|---|---|
| cudnn_enabled | true | Enable cuDNN acceleration |
| eps | 1e-5 | Epsilon for numerical stability |
| elementwise_affine | true | Learn per-element scale (weight) and shift (bias) |
| ws_init | Const(1.) | Weight initialization strategy |
| bs_init | Const(0.) | Bias initialization strategy |
| Forward Input | Forward Output |
|---|---|
| &Tensor whose trailing dimensions match normalized_shape | Tensor of same shape, layer-normalized |
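The shape requirement in the forward contract can be checked ahead of time with a small helper. The function `normalized_groups` below is hypothetical and not part of tch; it is a sketch that assumes shapes are given as `i64` slices, as in the signature above, and it also reports how many groups are normalized independently (the product of the leading dimensions).

```rust
/// Returns the number of independently normalized groups, or None when the
/// trailing dimensions of `input_shape` do not equal `normalized_shape`.
/// Hypothetical helper for illustration only.
fn normalized_groups(input_shape: &[i64], normalized_shape: &[i64]) -> Option<i64> {
    let d = normalized_shape.len();
    if d == 0 || d > input_shape.len() {
        return None;
    }
    let (leading, trailing) = input_shape.split_at(input_shape.len() - d);
    if trailing != normalized_shape {
        return None;
    }
    // An empty `leading` slice yields 1: the whole tensor is a single group.
    Some(leading.iter().product())
}

fn main() {
    // Trailing dimension matches: 8 * 32 groups of 512 values each.
    println!("{:?}", normalized_groups(&[8, 32, 512], &[512])); // Some(256)
    // Trailing dimension is 512, not 32, so the shapes are incompatible.
    println!("{:?}", normalized_groups(&[8, 32, 512], &[32])); // None
}
```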
Usage Examples
use tch::{nn, nn::Module, Device, Kind, Tensor};
fn main() {
    // Parameters (weight and bias) are registered in the variable store.
    let vs = nn::VarStore::new(Device::Cpu);
    let root = vs.root();
    // Normalize over the last dimension of size 512 (typical in Transformers).
    let ln = nn::layer_norm(&root / "ln", vec![512], Default::default());
    // Forward pass on a batch of shape [8, 32, 512].
    let input = Tensor::randn([8, 32, 512], (Kind::Float, Device::Cpu));
    let output = ln.forward(&input);
    // The output keeps the input shape; only the last dimension is normalized.
    assert_eq!(output.size(), vec![8, 32, 512]);
}