Principle:Tensorflow Tfjs Normalization Techniques
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Optimization, Neural_Network_Architecture |
| Last Updated | 2026-02-10 06:00 GMT |
Overview
Techniques that normalize intermediate activations within neural networks to stabilize training, accelerate convergence, and act as implicit regularizers.
Description
Normalization layers transform activations to have zero mean and unit variance, then apply learnable scale (gamma) and shift (beta) parameters. TensorFlow.js implements:
- BatchNormalization (`tf.layers.batchNormalization`): Normalizes each feature (channel) using statistics computed across the batch. Most effective for CNNs with large batch sizes.
- LayerNormalization (`tf.layers.layerNormalization`): Normalizes across the feature dimension of each sample. Preferred for RNNs and Transformer models because it is independent of batch size.
Both techniques were introduced to reduce internal covariate shift: the distribution of a layer's inputs changes during training as the preceding layers update, which makes optimization difficult.
Usage
Apply BatchNormalization after convolutional or dense layers in CNN architectures (typically before the activation function). Use LayerNormalization in sequence models and Transformers where batch statistics are unreliable due to variable sequence lengths or small batch sizes.
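The two placements described above can be sketched as follows. This is a minimal illustration, not a reference architecture: the function names `buildConvBlock` and `buildSequenceBlock`, the input shapes, and the layer sizes are hypothetical. The `tf` namespace is passed in as an argument so the same sketch can be used with whichever TensorFlow.js package is loaded (browser or Node).

```javascript
// Sketch only: `tf` is the TensorFlow.js namespace supplied by the caller, e.g.
//   const tf = require('@tensorflow/tfjs');
// Function names and default sizes are illustrative, not part of TensorFlow.js.

// Conv -> BatchNormalization -> ReLU, with normalization before the activation.
function buildConvBlock(tf, inputShape = [28, 28, 1], filters = 16) {
  const model = tf.sequential();
  model.add(tf.layers.conv2d({
    inputShape,
    filters,
    kernelSize: 3,
    padding: 'same',
    useBias: false, // the BatchNormalization shift (beta) takes over the bias's role
  }));
  // axis -1 is the channel axis for channels-last image data.
  model.add(tf.layers.batchNormalization({ axis: -1 }));
  model.add(tf.layers.activation({ activation: 'relu' }));
  return model;
}

// Dense projection followed by LayerNormalization over the feature axis.
function buildSequenceBlock(tf, seqLen = 10, features = 32) {
  const model = tf.sequential();
  model.add(tf.layers.dense({ inputShape: [seqLen, features], units: features }));
  // Statistics are computed per sample and per time step over the last axis.
  model.add(tf.layers.layerNormalization({ axis: -1 }));
  return model;
}
```

With the library loaded, calling `buildConvBlock(tf).summary()` should list the convolution, normalization, and activation layers in that order. Setting `useBias: false` on the convolution is a common choice rather than a requirement, since the normalization layer's shift parameter makes a separate bias redundant.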
Theoretical Basis
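Both techniques apply the same transformation to an activation $x$:

$$y = \gamma \cdot \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$$

Here $\mu$ and $\sigma^2$ are the mean and variance of the activations being normalized, $\epsilon$ is a small constant for numerical stability, and $\gamma$ and $\beta$ are the learnable scale and shift parameters. The statistics are computed over different dimensions depending on the technique: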
- Batch Norm: mean/variance over the batch and spatial dimensions
- Layer Norm: mean/variance over the feature dimensions per sample
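To make the difference in dimensions concrete, the sketch below normalizes a small `[batch, features]` matrix both ways in plain JavaScript, without TensorFlow.js. It is a simplified illustration: the helper names are hypothetical, gamma and beta are fixed at 1 and 0, the epsilon value of 1e-3 is chosen for the example, and the batch version uses the current batch's statistics only (training-mode behavior, with no moving averages and no spatial dimensions).

```javascript
// Normalize a 1-D array to zero mean and unit variance, then scale and shift.
function normalize(values, gamma = 1, beta = 0, epsilon = 1e-3) {
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  const std = Math.sqrt(variance + epsilon);
  return values.map((v) => gamma * ((v - mean) / std) + beta);
}

// Layer norm: statistics are computed per sample, over that row's features.
function layerNorm(x) {
  return x.map((row) => normalize(row));
}

// Batch norm: statistics are computed per feature, over the rows of the batch.
function batchNorm(x) {
  const numFeatures = x[0].length;
  const columns = Array.from({ length: numFeatures }, (_, j) =>
    normalize(x.map((row) => row[j]))
  );
  // Transpose the normalized columns back to [batch, features].
  return x.map((_, i) => columns.map((col) => col[i]));
}

// Two samples (rows) with three features (columns) each.
const sample = [
  [1, 2, 3],
  [4, 6, 8],
];

const batchNormalized = batchNorm(sample); // each column has mean ~0
const layerNormalized = layerNorm(sample); // each row has mean ~0
```

In the batch-normalized result every column sums to approximately zero, while in the layer-normalized result every row does. This also shows why layer normalization gives the same output for a sample regardless of what else is in the batch.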