Principle:Sktime Pytorch forecasting Recurrent Architecture

From Leeroopedia


Knowledge Sources
Domains Time_Series, Forecasting, Deep_Learning, Recurrent_Neural_Networks
Last Updated 2026-02-08 09:00 GMT

Overview

Autoregressive recurrent neural network architecture for time series forecasting that uses LSTM or GRU cells with covariate support, encoding the lookback window into a hidden state and decoding predictions step-by-step.

Description

The Recurrent Network model implements a standard encoder-decoder forecasting pattern built on recurrent cells (LSTM or GRU). During the encoding phase, the model processes the historical (encoder) time steps through a multi-layer RNN to produce a hidden state that summarizes the lookback period. During the decoding phase, the hidden state is used to generate predictions either in teacher-forcing mode (training) or in autoregressive mode (inference).

The input at each time step is constructed by concatenating continuous features with embedded categorical features. A critical design choice is the target shifting mechanism: the target variable is shifted by one time step (via roll), so that at each decoding step the model receives the previous time step's target rather than the current one, preserving the causal structure required for autoregressive generation.

In training mode, the decoder processes the entire decoder sequence at once using the ground-truth targets (teacher forcing). In evaluation mode, the model runs autoregressively: it predicts one step at a time, feeds each prediction back as the next input, and iterates for the full prediction horizon. This autoregressive decoding is handled by the decode_autoregressive mechanism inherited from the base class.

The model supports both single-target and multi-target forecasting. For single targets, a single linear projection maps the RNN hidden output to the forecast. For multiple targets, separate linear projectors are used for each target. Target lags can be specified to inject known seasonal patterns into the input vector.
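The single-target and multi-target output heads can be sketched in plain Python. This is only an illustration under stated assumptions: the hidden vector and weights are hand-picked, and the helper names linear and project_targets are hypothetical, not part of the library.

```python
def linear(weights, bias, hidden):
    """Affine map from a hidden vector to one scalar forecast."""
    return sum(w * h for w, h in zip(weights, hidden)) + bias


def project_targets(hidden, heads):
    """Apply one independent (weights, bias) head per target."""
    return [linear(weights, bias, hidden) for weights, bias in heads]


hidden = [0.5, -1.0, 2.0]  # hypothetical RNN output for one time step

# Single target: one projection head.
single = project_targets(hidden, [([1.0, 0.0, 0.5], 0.0)])

# Two targets: two separate heads reading the same hidden vector.
multi = project_targets(
    hidden,
    [([1.0, 0.0, 0.5], 0.0), ([0.0, 2.0, 0.0], 1.0)],
)
```

Every head reads the same hidden vector, so in this sketch the targets share the recurrent representation and differ only in their final projection.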

Usage

Use the Recurrent Network for time series forecasting when: (1) the data has sequential dependencies suited to recurrent processing, (2) both static and time-varying covariates (categorical and continuous) are available, (3) the prediction horizon is moderate (autoregressive decoding becomes slow for very long horizons). Note that QuantileLoss is not supported with this architecture. The model requires that encoder and decoder share the same covariate set (apart from the target variable and its lags).

Theoretical Basis

Encoding: The encoder RNN processes the historical input sequence and produces a hidden state:

$$h_t, c_t = \mathrm{RNN}(x_t, h_{t-1}, c_{t-1}), \quad t = 1, \dots, T_{\mathrm{enc}} - 1$$

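where $x_t$ is the concatenation of continuous features and categorical embeddings at time $t$. The cell state $c_t$ exists only for LSTM cells; a GRU carries $h_t$ alone. The first time step is consumed by the target shift, so the effective encoder length is $T_{\mathrm{enc}} - 1$.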

Target shifting (causal input construction):

$$x_t[\text{target}] = y_{t-1}$$

The target features in the input vector are rolled by one position so that each step receives the previous observation, preventing information leakage.
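A minimal sketch of the shift, using plain Python lists rather than tensors. The function names shift_target and build_inputs are hypothetical. The sketch assumes that the roll wraps the last value around to the first position, as array roll operations typically do, which is why the first step is dropped.

```python
def shift_target(target):
    """Roll the series right by one position; the last value wraps to index 0."""
    if not target:
        return []
    return [target[-1]] + target[:-1]


def build_inputs(target, covariate):
    """Pair each step's covariate with the previous step's target.

    Step 0 is dropped because its rolled target value wrapped around
    from the end of the series and is not a real past observation.
    """
    shifted = shift_target(target)
    return [(shifted[t], covariate[t]) for t in range(1, len(target))]


y = [10.0, 11.0, 12.0, 13.0]   # observed target
c = [0.1, 0.2, 0.3, 0.4]       # one time-varying covariate
inputs = build_inputs(y, c)
# inputs == [(10.0, 0.2), (11.0, 0.3), (12.0, 0.4)]
```

Four observed steps yield three usable input rows, which mirrors the loss of one step from the effective encoder length.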

Teacher-forced decoding (training):

$$\hat{y}_{1:H} = W_o \cdot \mathrm{RNN}(x^{\mathrm{dec}}_{1:H}, h_{\mathrm{enc}})$$

where $W_o$ is the output projection, $H$ is the prediction horizon, $h_{\mathrm{enc}}$ is the hidden state produced by the encoder, and the entire decoder sequence is processed at once.

Autoregressive decoding (inference): For each step t in the forecast horizon:

$$x_t[\text{target}] = \hat{y}_{t-1}, \qquad \hat{y}_t = W_o \cdot \mathrm{RNN}(x_t, h_{t-1})$$

Predictions are generated one at a time and fed back as inputs, including any lagged target positions.
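The difference between the two decoding modes can be illustrated with a toy scalar recurrent cell. This is a sketch under stated assumptions, not the library's implementation: the cell, its weights, and the names toy_rnn_step, toy_teacher_forced_decode, and toy_autoregressive_decode are all hypothetical, target lags are omitted, and the teacher-forced pass is written as a loop for readability even though a real RNN would process the sequence in one call.

```python
import math


def toy_rnn_step(prev_target, cov, h, w_y=0.5, w_c=0.3, w_h=0.8):
    """Toy cell: h_t = tanh(w_y * y_{t-1} + w_c * c_t + w_h * h_{t-1})."""
    return math.tanh(w_y * prev_target + w_c * cov + w_h * h)


def project(h, w_o=2.0):
    """Output projection W_o applied to the hidden state."""
    return w_o * h


def toy_teacher_forced_decode(last_obs, true_targets, covs, h):
    """Each step receives the ground-truth previous target."""
    preds, prev = [], last_obs
    for y_true, cov in zip(true_targets, covs):
        h = toy_rnn_step(prev, cov, h)
        preds.append(project(h))
        prev = y_true  # feed the observed value forward
    return preds


def toy_autoregressive_decode(last_obs, covs, h):
    """Each step receives the model's own previous prediction."""
    preds, prev = [], last_obs
    for cov in covs:
        h = toy_rnn_step(prev, cov, h)
        y_hat = project(h)
        preds.append(y_hat)
        prev = y_hat  # feed the prediction back as the next input
    return preds


covs = [0.1, 0.2, 0.3]   # known future covariates
h_enc = 0.0              # hidden state handed over by the encoder
last_obs = 1.0           # last observed target of the encoder window

preds_ar = toy_autoregressive_decode(last_obs, covs, h_enc)
preds_tf = toy_teacher_forced_decode(last_obs, [5.0, 6.0, 7.0], covs, h_enc)
```

Both modes agree on the first step, since each starts from the last observed target. They diverge afterwards because only the autoregressive loop is exposed to its own earlier errors.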

Multi-layer RNN: The architecture stacks multiple RNN layers with inter-layer dropout (applied when rnn_layers > 1):

$$h_t^{(l)} = \mathrm{RNNCell}^{(l)}\left(h_t^{(l-1)}, h_{t-1}^{(l)}\right), \quad l = 1, \dots, N_{\mathrm{layers}}$$
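The layer recursion can be sketched with toy scalar cells. The sketch assumes the bottom layer reads the input $x_t$ in place of $h_t^{(0)}$, and it omits inter-layer dropout, which is inactive at inference time. The name stacked_step and the weights are hypothetical.

```python
import math


def stacked_step(x, hidden, weights):
    """Advance every layer by one time step.

    hidden[l] holds the layer's previous state h_{t-1}^{(l)};
    weights[l] is an (input_weight, recurrent_weight) pair.
    Returns the new states h_t^{(l)}, bottom layer first.
    """
    new_hidden = []
    layer_input = x  # the bottom layer reads the raw input
    for (w_in, w_rec), h_prev in zip(weights, hidden):
        h_new = math.tanh(w_in * layer_input + w_rec * h_prev)
        new_hidden.append(h_new)
        layer_input = h_new  # the next layer reads this layer's output
    return new_hidden


weights = [(0.5, 0.1), (1.0, -0.2)]  # two layers
first = stacked_step(1.0, [0.0, 0.0], weights)

hidden = [0.0, 0.0]
for x in [1.0, 2.0]:
    hidden = stacked_step(x, hidden, weights)
```

Each layer combines two sources of information: the output of the layer below at the current step and its own state from the previous step.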

Related Pages

Implemented By
