Implementation:Sktime Pytorch forecasting DecoderMLP

Knowledge Sources	Sktime_Pytorch_forecasting
Domains	Time_Series, Forecasting, Deep_Learning
Last Updated	2026-02-08 08:00 GMT

Overview

DecoderMLP is an MLP-based decoder model that predicts output solely based on information available in the decoder time steps.

Description

DecoderMLP extends BaseModelWithCovariates and implements a simple multi-layer perceptron architecture that operates exclusively on decoder (future known) variables. Unlike encoder-decoder models, it does not use historical target values, making it suitable for scenarios where only future-known covariates are available for prediction. The model uses a FullyConnectedModule with configurable hidden layers, dropout, normalization, and activation functions, and supports both single-target and multi-target forecasting with categorical embeddings.
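As a rough illustration of the architecture described above, the following sketch builds a small fully connected network in plain PyTorch and applies it to a tensor of decoder features. `build_mlp` and the tensor sizes are hypothetical names and values chosen for this example. The layer ordering is an assumption and may differ from the library's FullyConnectedModule.

```python
import torch
from torch import nn


def build_mlp(input_size, output_size, hidden_size=300,
              n_hidden_layers=3, dropout=0.1, norm=True):
    """Hypothetical helper: stacked Linear/ReLU/Dropout(/LayerNorm) blocks plus a linear head."""
    layers = []
    width = input_size
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(width, hidden_size), nn.ReLU(), nn.Dropout(dropout)]
        if norm:
            layers.append(nn.LayerNorm(hidden_size))
        width = hidden_size
    # Final projection to the requested number of outputs (e.g. one per quantile).
    layers.append(nn.Linear(width, output_size))
    return nn.Sequential(*layers)


torch.manual_seed(0)
batch_size, prediction_length, n_features = 4, 6, 5  # illustrative sizes only

# Only decoder-step features are fed in; no encoder history is used.
decoder_features = torch.randn(batch_size, prediction_length, n_features)

mlp = build_mlp(n_features, output_size=7, hidden_size=16, n_hidden_layers=2)
mlp.eval()  # disable dropout for a deterministic forward pass
with torch.no_grad():
    prediction = mlp(decoder_features)

print(tuple(prediction.shape))  # (4, 6, 7)
```

Because `nn.Linear` and `nn.LayerNorm` act on the last dimension only, each decoder time step is mapped to its outputs independently of the other time steps.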

Usage

Use DecoderMLP when you want a simple baseline model that relies only on known future information (decoder variables) for forecasting. It is useful when future covariates carry strong predictive signal and encoder history is less important. The default loss is QuantileLoss, making it suitable for probabilistic forecasting out of the box.
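To make the probabilistic output concrete, here is a minimal, dependency-free sketch of the standard quantile (pinball) loss for a single observation. `pinball_loss` is a hypothetical helper written for illustration. The library's QuantileLoss operates on tensors and may differ in scaling and reduction.

```python
def pinball_loss(y_true: float, y_pred: list[float], quantiles: list[float]) -> float:
    """Sum of pinball losses over all predicted quantiles for one observation."""
    total = 0.0
    for q, pred in zip(quantiles, y_pred):
        error = y_true - pred
        # Under-prediction (error > 0) is weighted by q, over-prediction by (1 - q).
        total += max(q * error, (q - 1) * error)
    return total


quantiles = [0.1, 0.5, 0.9]
# One predicted value per quantile for a true value of 10.0.
loss = pinball_loss(10.0, [8.0, 10.0, 13.0], quantiles)
print(round(loss, 6))  # 0.5
```

The asymmetric weighting is what pushes each output toward its own quantile: a low quantile is penalised more for predicting too high, a high quantile for predicting too low.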

Code Reference

Source Location

Repository: Sktime_Pytorch_forecasting
File: pytorch_forecasting/models/mlp/_decodermlp.py
Lines: 1-200

Signature

class DecoderMLP(BaseModelWithCovariates):
    def __init__(
        self,
        activation_class: str = "ReLU",
        hidden_size: int = 300,
        n_hidden_layers: int = 3,
        dropout: float = 0.1,
        norm: bool = True,
        static_categoricals: list[str] | None = None,
        static_reals: list[str] | None = None,
        time_varying_categoricals_encoder: list[str] | None = None,
        time_varying_categoricals_decoder: list[str] | None = None,
        categorical_groups: dict[str, list[str]] | None = None,
        time_varying_reals_encoder: list[str] | None = None,
        time_varying_reals_decoder: list[str] | None = None,
        embedding_sizes: dict[str, tuple[int, int]] | None = None,
        embedding_paddings: list[str] | None = None,
        embedding_labels: dict[str, np.ndarray] | None = None,
        x_reals: list[str] | None = None,
        x_categoricals: list[str] | None = None,
        output_size: int | list[int] = 1,
        target: str | list[str] = None,
        loss: MultiHorizonMetric = None,
        logging_metrics: nn.ModuleList = None,
        **kwargs,
    ):

from_dataset Signature

@classmethod
def from_dataset(cls, dataset: TimeSeriesDataSet, **kwargs):

Import

from pytorch_forecasting.models.mlp import DecoderMLP

I/O Contract

Inputs

Name	Type	Required	Description
activation_class	str	No	PyTorch activation class name. Defaults to "ReLU".
hidden_size	int	No	Width of hidden layers. Defaults to 300.
n_hidden_layers	int	No	Number of hidden layers. Defaults to 3.
dropout	float	No	Dropout rate. Defaults to 0.1.
norm	bool	No	Whether to apply normalization in the MLP. Defaults to True.
output_size	int | list[int]	No	Number of outputs per target (for example, one per quantile). Defaults to 1.
target	str | list[str]	No	Target variable name(s). Defaults to None.
loss	MultiHorizonMetric	No	Loss function. Defaults to None, in which case QuantileLoss() is used.
logging_metrics	nn.ModuleList	No	Metrics for logging. Defaults to None, in which case [SMAPE, MAE, RMSE, MAPE, MASE] is used.
embedding_sizes	dict[str, tuple[int, int]] | None	No	Embedding sizes for categorical variables, mapping each variable name to (number of categories, embedding dimension).
static_categoricals	list[str] | None	No	Names of static categorical variables.
static_reals	list[str] | None	No	Names of static continuous variables.

Outputs

Name	Type	Description
prediction	torch.Tensor	Forecast output transformed to target space, shape (batch_size, prediction_length, output_size).

Usage Examples

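from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.models.mlp import DecoderMLP

# `dataset` is assumed to be an existing TimeSeriesDataSet built from your data.

# Create model from dataset (preferred): variable lists, embeddings, target
# and output size are inferred from the dataset.
model = DecoderMLP.from_dataset(
    dataset,
    hidden_size=300,
    n_hidden_layers=3,
    dropout=0.1,
    activation_class="ReLU",
)

# Direct instantiation: nothing is inferred, so in real use the covariate
# lists (e.g. time_varying_reals_decoder, x_reals) must be passed explicitly.
model = DecoderMLP(
    activation_class="ReLU",
    hidden_size=256,
    n_hidden_layers=4,
    dropout=0.2,
    norm=True,
    target="demand",
    output_size=7,  # e.g. 7 quantiles
)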

Related Pages

Principle:Sktime_Pytorch_forecasting_MLP_Decoder
Sktime_Pytorch_forecasting_RecurrentNetwork - Alternative model using recurrent layers instead of MLP
