Principle: Sktime / PyTorch Forecasting Distribution Loss
| Knowledge Sources | |
|---|---|
| Domains | Time_Series, Loss_Functions, Probabilistic_Forecasting |
| Last Updated | 2026-02-08 07:00 GMT |
Overview
Loss function that trains models to predict the parameters of a probability distribution, enabling parametric probabilistic forecasts via negative log-likelihood minimization.
Description
Distribution Loss is a parametric approach to probabilistic forecasting where the model learns to output the parameters of a known distribution family (e.g., Normal, NegativeBinomial, LogNormal). The loss is the negative log-likelihood of the observed target under the predicted distribution. This approach leverages domain knowledge about the data distribution (e.g., count data suits NegativeBinomial, continuous data suits Normal) and provides well-calibrated uncertainty estimates. The NormalDistributionLoss specifically models targets as Gaussian with learned location and scale parameters, applying an affine rescaling transformation to undo target normalization.
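The core idea can be sketched in plain Python without any library dependencies: the model emits distribution parameters (here μ and σ for a Normal), and the loss is the negative log-likelihood of the observed targets under those parameters. This is an illustration of the principle, not pytorch-forecasting's implementation.

```python
import math

def gaussian_nll(y: float, mu: float, sigma: float) -> float:
    """Negative log-likelihood of observation y under Normal(mu, sigma)."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)

def distribution_loss(targets, predicted_params):
    """Mean NLL over a batch; predicted_params is a sequence of (mu, sigma) pairs."""
    return sum(
        gaussian_nll(y, mu, sigma)
        for y, (mu, sigma) in zip(targets, predicted_params)
    ) / len(targets)
```

Minimizing this loss pushes μ toward the observed values while σ absorbs the residual spread, which is what yields calibrated uncertainty estimates.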
Usage
Use NormalDistributionLoss as the default loss for DeepAR when forecasting continuous-valued targets. Choose alternative distribution losses based on data characteristics: NegativeBinomialDistributionLoss for count data, LogNormalDistributionLoss for strictly positive data with heavy right tails, BetaDistributionLoss for bounded [0,1] data.
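The selection rule above can be written as a small helper. The returned class names are the losses provided by pytorch-forecasting, but the helper itself is a hypothetical illustration, not library API:

```python
def pick_distribution_loss(is_count: bool = False,
                           is_positive_heavy_tailed: bool = False,
                           is_bounded_unit_interval: bool = False) -> str:
    """Map coarse data characteristics to a pytorch-forecasting loss class name.
    Hypothetical helper for illustration; not part of the library."""
    if is_count:
        return "NegativeBinomialDistributionLoss"   # non-negative integer counts
    if is_positive_heavy_tailed:
        return "LogNormalDistributionLoss"          # strictly positive, right-skewed
    if is_bounded_unit_interval:
        return "BetaDistributionLoss"               # targets bounded in [0, 1]
    return "NormalDistributionLoss"                 # default for continuous targets
```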
Theoretical Basis
Negative log-likelihood loss: given predicted distribution parameters θ, the loss for an observation y is L(θ; y) = −log p(y | θ).
For the Normal distribution with parameters μ and σ, this becomes −log N(y | μ, σ) = ½ log(2πσ²) + (y − μ)² / (2σ²).
The model outputs raw parameters which are transformed:
- loc (μ) — used directly
- scale (σ) — passed through softplus, σ = log(1 + exp(σ̃)), to ensure positivity
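A plain-Python sketch of the softplus transform referenced above: the raw network output can be any real number, and softplus maps it to a strictly positive scale. The numerically stable branching shown here is a common implementation choice, not a claim about the library's internals.

```python
import math

def softplus(x: float) -> float:
    """softplus(x) = log(1 + exp(x)); smooth and strictly positive for all real x."""
    # Stable form: for large x, log(1 + exp(x)) = x + log(1 + exp(-x)),
    # which avoids overflow in exp(x).
    if x > 0:
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))
```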
Rescaling: The predicted distribution is an affine transformation of the base distribution, Y = center + scale · X with X ~ Normal(μ, σ), so that Y ~ Normal(center + scale · μ, scale · σ).
Where center and scale come from the target normalizer (e.g., GroupNormalizer).
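Because an affine map of a Normal random variable is again Normal, the rescaling reduces to a closed-form parameter update. A sketch under that assumption (not the library's actual code path, which applies the transform to the distribution object):

```python
def rescale_normal(mu: float, sigma: float, center: float, scale: float):
    """Affine transform Y = center + scale * X with X ~ Normal(mu, sigma):
    Y ~ Normal(center + scale * mu, scale * sigma).
    mu, sigma are parameters in normalized space; center, scale come from
    the target normalizer."""
    return center + scale * mu, scale * sigma
```

For example, a normalized prediction of Normal(0, 1) for a group with center 10 and scale 2 corresponds to Normal(10, 2) in the original units.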