Principle: sktime / PyTorch Forecasting Group Normalization
| Knowledge Sources | |
|---|---|
| Domains | Time_Series, Data_Engineering, Preprocessing |
| Last Updated | 2026-02-08 07:00 GMT |
Overview
Technique for normalizing time series targets per group (individual series) to improve model training by removing level differences across series.
Description
Group Normalization applies per-series (or per-group) standardization to the target variable. In multi-series forecasting, different series often have vastly different scales (e.g., high-volume vs. low-volume products). Training on raw values would cause the model to disproportionately fit high-magnitude series. Group normalization computes per-group statistics (mean/std for standard scaling, or quantile-based for robust scaling) and normalizes each series to zero mean and unit variance. The normalization parameters are stored as additional features (target_scale) passed to the model so it can denormalize predictions. This is critical for distributional models like DeepAR where the distribution parameters must be rescaled back to the original data space.
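The core idea can be sketched in plain pandas (this is an illustration of the concept, not the pytorch-forecasting implementation; the column names `series_id` and `y` are made up for the example):

```python
import pandas as pd

df = pd.DataFrame({
    "series_id": ["a"] * 4 + ["b"] * 4,
    "y": [100.0, 110.0, 90.0, 100.0, 1.0, 1.2, 0.8, 1.0],  # very different scales
})

# Per-group statistics computed over each series' available history
stats = df.groupby("series_id")["y"].agg(center="mean", scale="std").reset_index()
df = df.merge(stats, on="series_id")

# Normalize each series to zero mean / unit variance
df["y_norm"] = (df["y"] - df["center"]) / df["scale"]

# center/scale stay attached to each row (cf. target_scale) so that
# predictions can be mapped back: y_hat = y_norm_hat * scale + center
```

After this step both the high-volume series "a" and the low-volume series "b" contribute on the same scale to the training loss.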
Usage
Use GroupNormalizer as the target_normalizer in TimeSeriesDataSet construction whenever the dataset contains multiple time series with different scales. This is standard practice for TFT and DeepAR workflows. The groups parameter should match the group_ids of the dataset. Optional transformation (log, logit, softplus) can be applied before normalization for non-Gaussian targets.
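A minimal construction sketch with the real pytorch-forecasting API (the dataframe `df` and the column names `time_idx`, `volume`, and `series_id` are illustrative; encoder/prediction lengths are arbitrary):

```python
from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data import GroupNormalizer

training = TimeSeriesDataSet(
    df,
    time_idx="time_idx",
    target="volume",
    group_ids=["series_id"],
    max_encoder_length=24,
    max_prediction_length=6,
    time_varying_unknown_reals=["volume"],
    # groups must match group_ids; transformation="softplus" (or "log")
    # is applied before normalization for strictly positive targets
    target_normalizer=GroupNormalizer(
        groups=["series_id"],
        transformation="softplus",
    ),
)
```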
Theoretical Basis
Standard normalization per group $g$:

$$\tilde{y}_{g,t} = \frac{y_{g,t} - \mu_g}{\sigma_g}$$

where $\mu_g$ and $\sigma_g$ are the mean and standard deviation computed from the encoder window of group $g$.
Robust normalization (alternative), replacing the mean and standard deviation with the median and interquartile range:

$$\tilde{y}_{g,t} = \frac{y_{g,t} - \operatorname{median}_g}{(q_{0.75,g} - q_{0.25,g})/2}$$
Denormalization for predictions:

$$\hat{y}_{g,t} = \tilde{y}_{g,t} \cdot \sigma_g + \mu_g$$

With transformation (e.g., log):

$$\hat{y}_{g,t} = \exp\!\left(\tilde{y}_{g,t} \cdot \sigma_g + \mu_g\right)$$
```python
# Abstract normalization pipeline (values are illustrative)
import numpy as np

y = np.array([10.0, 100.0, 1000.0])   # raw target values for one group
center, scale = 4.6, 2.3              # per-group statistics of log(y)

y_transformed = np.log(y)                         # transformation
y_normalized = (y_transformed - center) / scale   # standardization

# Inverse at prediction time:
y_pred_transformed = y_normalized * scale + center
y_pred = np.exp(y_pred_transformed)               # inverse transformation recovers y
```
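The rescaling of distribution parameters mentioned in the Description can be checked numerically: if a model predicts a Normal$(m, s)$ in normalized space, the original-space distribution is Normal$(m \cdot \text{scale} + \text{center},\ s \cdot \text{scale})$. A small numpy sketch (all numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
center, scale = 50.0, 5.0   # per-group normalization statistics

# Samples in normalized space from a predicted Normal(m, s)
m, s = 0.4, 1.2
z = rng.normal(m, s, size=100_000)

# Denormalize the samples and compare with the affinely rescaled parameters
y = z * scale + center
assert abs(y.mean() - (m * scale + center)) < 0.1   # mean rescales affinely
assert abs(y.std() - s * scale) < 0.1               # std rescales by |scale|
```

This is why, for distributional models like DeepAR, the normalizer's parameters must travel with the batch: the loss and the sampled forecasts live in normalized space until this rescaling is applied.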