Principle:Sktime Pytorch forecasting Recurrent Architecture
| Knowledge Sources | |
|---|---|
| Domains | Time_Series, Forecasting, Deep_Learning, Recurrent_Neural_Networks |
| Last Updated | 2026-02-08 09:00 GMT |
Overview
Autoregressive recurrent neural network architecture for time series forecasting that uses LSTM or GRU cells with covariate support, encoding the lookback window into a hidden state and decoding predictions step-by-step.
Description
The Recurrent Network model implements a standard encoder-decoder forecasting pattern built on recurrent cells (LSTM or GRU). During the encoding phase, the model processes the historical (encoder) time steps through a multi-layer RNN to produce a hidden state that summarizes the lookback period. During the decoding phase, the hidden state is used to generate predictions either in teacher-forcing mode (training) or in autoregressive mode (inference).
The input at each time step is constructed by concatenating continuous features with embedded categorical features. A critical design choice is the target shifting mechanism: the target variable is shifted by one time step (via roll), so that at each decoding step the model receives the previous time step's target rather than the current one, preserving the causal structure required for autoregressive generation.
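The target shift can be illustrated with a small, library-free sketch. The helper `shift_targets` and its `first_target` argument are hypothetical names introduced here. Seeding the first decoder step with the last encoder target is an assumption about how the wrapped-around value is handled, not a statement of the model's exact code.

```python
def shift_targets(rows, target_pos, first_target=None):
    """Shift the target column of a (time, feature) sequence by one step.

    rows: list of feature lists, one per time step.
    target_pos: index of the target within each feature list.
    first_target: value to place at step 0 (assumed here to be the last
        encoder target when building decoder inputs). If None, step 0 is
        dropped because it has no previous observation.
    """
    targets = [row[target_pos] for row in rows]
    rolled = targets[-1:] + targets[:-1]  # same effect as a roll by +1 along time
    shifted = [list(row) for row in rows]  # copy so the input stays untouched
    for row, value in zip(shifted, rolled):
        row[target_pos] = value
    if first_target is None:
        return shifted[1:]  # the wrapped-around value at step 0 is discarded
    shifted[0][target_pos] = first_target
    return shifted


# Each row is [covariate, target]
encoder = [[0.1, 10.0], [0.2, 11.0], [0.3, 12.0], [0.4, 13.0]]
decoder = [[0.5, 14.0], [0.6, 15.0]]

encoder_inputs = shift_targets(encoder, target_pos=1)
decoder_inputs = shift_targets(decoder, target_pos=1, first_target=encoder[-1][1])
```

In this sketch the encoder loses one step (four rows become three), and every remaining row pairs the current covariate with the previous target, so no step sees the value it is meant to predict.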
In training mode, the decoder processes the entire decoder sequence at once using the ground-truth targets (teacher forcing). In evaluation mode, the model runs autoregressively: it predicts one step at a time, feeds each prediction back as the next input, and iterates for the full prediction horizon. This autoregressive decoding is handled by the decode_autoregressive mechanism inherited from the base class.
The model supports both single-target and multi-target forecasting. For single targets, a single linear projection maps the RNN hidden output to the forecast. For multiple targets, separate linear projectors are used for each target. Target lags can be specified to inject known seasonal patterns into the input vector.
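The single-target and multi-target output heads can be sketched as follows. This is a toy illustration in plain Python. The function names and the `(weights, bias)` representation are hypothetical and stand in for the linear layers of the actual model.

```python
def linear(weights, bias, hidden):
    """Scalar-output linear projection: dot(weights, hidden) + bias."""
    return sum(w * h for w, h in zip(weights, hidden)) + bias


def project(hidden, heads):
    """Map one RNN output vector to forecasts.

    heads: a single (weights, bias) tuple for one target, or a list of such
        tuples (one per target) for multi-target forecasting.
    """
    if isinstance(heads, tuple):  # single target: one projection
        return linear(*heads, hidden)
    return [linear(w, b, hidden) for w, b in heads]  # one projection per target


hidden = [0.5, -1.0, 2.0]  # hypothetical RNN output at one time step
single = project(hidden, ([1.0, 0.0, 0.5], 0.1))
multi = project(hidden, [([1.0, 0.0, 0.5], 0.1), ([0.0, 2.0, 0.0], -0.5)])
```

All heads read the same hidden output, so the targets share the recurrent representation and differ only in their final projection.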
Usage
Use the Recurrent Network for time series forecasting when (1) the data has sequential dependencies suited to recurrent processing, (2) both static and time-varying covariates (categorical and continuous) are available, and (3) the prediction horizon is moderate, since autoregressive decoding runs one step at a time and becomes slow for very long horizons. QuantileLoss is not supported with this architecture. The encoder and decoder must share the same covariate set, apart from the target variable and its lags.
Theoretical Basis
Encoding: The encoder RNN processes the historical input sequence and produces a hidden state:

$$h_T = \mathrm{RNN}(x_2, \dots, x_T)$$

where $x_t$ is the concatenation of continuous features and categorical embeddings at time $t$, and $T$ is the number of encoder time steps. The first time step is consumed by the target shift, so the effective encoder length is $T - 1$.
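Target shifting (causal input construction):

$$x_t^{\text{target}} = y_{t-1}$$

where $y_{t-1}$ is the target observed at the previous time step. The target features in the input vector are rolled by one position so that each step receives the previous observation, preventing information leakage.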
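Teacher-forced decoding (training):

$$\hat{y}_{T+1:T+H} = W_{\text{out}}\big(\mathrm{RNN}(x_{T+1}, \dots, x_{T+H};\, h_T)\big)$$

where $W_{\text{out}}$ is the output projection, $h_T$ is the encoder hidden state, $H$ is the prediction horizon, the decoder inputs $x_{T+1}, \dots, x_{T+H}$ carry the ground-truth targets shifted by one step, and the entire decoder sequence is processed at once.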
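Autoregressive decoding (inference): For each step $k = 1, \dots, H$ in the forecast horizon:

$$\hat{y}_{T+k} = W_{\text{out}}\big(\mathrm{RNN}(x_{T+k},\, h_{T+k-1})\big), \qquad x_{T+k}^{\text{target}} = \hat{y}_{T+k-1}, \qquad \hat{y}_T := y_T$$

Predictions are generated one at a time and fed back as inputs, including any lagged target positions.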
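The difference between the two decoding modes can be seen in a toy scalar sketch. The cell below is a plain tanh recurrence standing in for an LSTM or GRU. The weights and all function names are made up for illustration, and lagged targets are omitted. It is not the `decode_autoregressive` method of the base class.

```python
import math


def rnn_step(x, h, w_x=0.5, w_h=0.8):
    """One step of a toy scalar recurrent cell (stand-in for LSTM/GRU)."""
    return math.tanh(w_x * x + w_h * h)


def project(h, w_out=2.0):
    """Toy output projection from hidden state to forecast."""
    return w_out * h


def encode(targets, h=0.0):
    """Run the cell over shifted encoder targets; return the final hidden state."""
    for prev_target in targets[:-1]:  # last target is held back to seed the decoder
        h = rnn_step(prev_target, h)
    return h


def decode_teacher_forced(h, first_input, true_targets):
    """Training mode: inputs are the ground-truth targets shifted by one step."""
    inputs = [first_input] + true_targets[:-1]
    preds = []
    for x in inputs:
        h = rnn_step(x, h)
        preds.append(project(h))
    return preds


def decode_autoregressive_toy(h, first_input, n_steps):
    """Inference mode: each prediction becomes the next step's input."""
    x, preds = first_input, []
    for _ in range(n_steps):
        h = rnn_step(x, h)
        x = project(h)  # feed the prediction back as the next input
        preds.append(x)
    return preds


history = [0.2, 0.4, 0.3, 0.5]
future = [0.6, 0.4, 0.7]

h_enc = encode(history)
tf_preds = decode_teacher_forced(h_enc, history[-1], future)
ar_preds = decode_autoregressive_toy(h_enc, history[-1], len(future))
```

Both modes agree on the first forecast, because they start from the same hidden state and the same last observed target. They diverge from the second step onward, where teacher forcing sees the true value and the autoregressive loop sees its own prediction.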
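Multi-layer RNN: The architecture stacks multiple RNN layers with inter-layer dropout (applied when rnn_layers > 1):

$$h_t^{(1)} = \mathrm{Cell}^{(1)}\big(x_t,\, h_{t-1}^{(1)}\big), \qquad h_t^{(l)} = \mathrm{Cell}^{(l)}\big(\mathrm{Dropout}(h_t^{(l-1)}),\, h_{t-1}^{(l)}\big), \quad l = 2, \dots, L$$

where $L$ is the number of layers (rnn_layers) and $\mathrm{Cell}^{(l)}$ is the LSTM or GRU cell of layer $l$.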
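A toy sketch makes it clear why inter-layer dropout only matters when more than one layer is stacked. The scalar tanh layers, fixed weights, and function names below are illustrative assumptions. Where the dropout is placed (on the output passed to the next layer, never on the last layer's output) follows the documented behaviour of the `dropout` argument of PyTorch's `nn.LSTM` and `nn.GRU`.

```python
import math
import random


def dropout(value, p, rng, training):
    """Inverted dropout on a scalar: zero with probability p, else rescale."""
    if not training or p == 0.0:
        return value
    return 0.0 if rng.random() < p else value / (1.0 - p)


def stacked_rnn(inputs, n_layers, p=0.0, training=False, seed=0):
    """Run a stack of toy scalar recurrent layers over a sequence.

    Dropout is applied to the output of every layer except the last, so it
    has no effect when n_layers == 1.
    """
    rng = random.Random(seed)
    hidden = [0.0] * n_layers  # one hidden state per layer
    outputs = []
    for x in inputs:
        layer_input = x
        for layer in range(n_layers):
            hidden[layer] = math.tanh(0.5 * layer_input + 0.8 * hidden[layer])
            layer_input = hidden[layer]
            if layer < n_layers - 1:  # inter-layer connections only
                layer_input = dropout(layer_input, p, rng, training)
        outputs.append(hidden[-1])
    return outputs


sequence = [0.3, 0.1, 0.4]
one_layer_plain = stacked_rnn(sequence, n_layers=1)
one_layer_dropout = stacked_rnn(sequence, n_layers=1, p=0.5, training=True)
two_layer_plain = stacked_rnn(sequence, n_layers=2)
two_layer_dropout = stacked_rnn(sequence, n_layers=2, p=0.5, training=True)
two_layer_eval = stacked_rnn(sequence, n_layers=2, p=0.5, training=False)
```

With a single layer there is no inter-layer connection to drop, so the dropout rate is irrelevant. With two layers it changes the training-time outputs but not the evaluation-time outputs.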