Implementation:Sktime Pytorch forecasting mLSTMLayer

From Leeroopedia
Revision as of 16:43, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Sktime_Pytorch_forecasting_mLSTMLayer.md)


Knowledge Sources
Domains: Time_Series, Forecasting, Deep_Learning
Last Updated: 2026-02-08 08:00 GMT

Overview

mLSTMLayer stacks multiple mLSTM cells to form a deep recurrent layer with support for residual connections, layer normalization, and dropout.

Description

mLSTMLayer extends nn.Module and wraps multiple mLSTMCell instances into a multi-layer recurrent architecture. It processes the input sequence one time step at a time, passing each step through the stacked cells in order so that the output of one cell becomes the input of the next. Residual connections can be enabled between layers (the first layer is skipped), which helps gradient flow in deeper configurations. The layer manages the hidden, cell, and normalization states of all stacked cells.
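
The stacking loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's implementation: ToyCell is a hypothetical stand-in for mLSTMCell that only mimics its call shape, and StackedSketch and its internals are invented for this example.

```python
import torch
from torch import nn


class ToyCell(nn.Module):
    """Hypothetical stand-in for mLSTMCell; it does NOT implement the mLSTM update."""

    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.proj = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, h, c, n):
        # Mix the current input with the previous hidden state; pass c and n through.
        h_new = torch.tanh(self.proj(torch.cat([x, h], dim=-1)))
        return h_new, c, n


class StackedSketch(nn.Module):
    """Illustrative stacking loop; class name and details are assumptions."""

    def __init__(self, input_size, hidden_size, num_layers, residual_conn=True):
        super().__init__()
        self.hidden_size = hidden_size
        self.residual_conn = residual_conn
        # Only the first cell sees input_size features; later cells see hidden_size.
        self.cells = nn.ModuleList(
            [
                ToyCell(input_size if i == 0 else hidden_size, hidden_size)
                for i in range(num_layers)
            ]
        )

    def forward(self, x):
        seq_len, batch_size, _ = x.shape  # time-first input
        shape = (len(self.cells), batch_size, self.hidden_size)
        h = torch.zeros(shape, device=x.device)
        c = torch.zeros(shape, device=x.device)
        n = torch.zeros(shape, device=x.device)
        outputs = []
        for t in range(seq_len):
            layer_input = x[t]
            new_h, new_c, new_n = [], [], []
            for i, cell in enumerate(self.cells):
                h_i, c_i, n_i = cell(layer_input, h[i], c[i], n[i])
                # Residual add skips layer 0, where the input width may not
                # match hidden_size (a plausible reason, not confirmed here).
                if self.residual_conn and i > 0:
                    h_i = h_i + layer_input
                layer_input = h_i
                new_h.append(h_i)
                new_c.append(c_i)
                new_n.append(n_i)
            h, c, n = torch.stack(new_h), torch.stack(new_c), torch.stack(new_n)
            outputs.append(layer_input)  # top-layer output for this time step
        return torch.stack(outputs), (h, c, n)


sketch = StackedSketch(input_size=32, hidden_size=64, num_layers=3)
out, (h, c, n) = sketch(torch.randn(10, 16, 32))
print(out.shape, h.shape)
```

In this sketch the last entry of the output sequence coincides with the final hidden state of the top layer, because both come from the same tensor at the last time step.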

Usage

Use mLSTMLayer when you need a multi-layer mLSTM recurrent block as part of a forecasting or sequence modeling network. It is the intermediate building block between individual mLSTMCell instances and the complete mLSTMNetwork.

Code Reference

Source Location

Signature

class mLSTMLayer(nn.Module):
    def __init__(
        self,
        input_size,
        hidden_size,
        num_layers,
        dropout=0.2,
        layer_norm=True,
        residual_conn=True,
    ):
    def forward(self, x, h=None, c=None, n=None):
    def init_hidden(self, batch_size, device=None):

Import

from pytorch_forecasting.layers._recurrent._mlstm.layer import mLSTMLayer

I/O Contract

Inputs

__init__

| Name | Type | Required | Description |
|---|---|---|---|
| input_size | int | Yes | The number of features in the input. |
| hidden_size | int | Yes | The number of features in the hidden state. |
| num_layers | int | Yes | The number of mLSTM layers to stack. |
| dropout | float | No | Dropout probability applied to inputs and intermediate layers. Defaults to 0.2. |
| layer_norm | bool | No | Whether to use layer normalization in each mLSTM cell. Defaults to True. |
| residual_conn | bool | No | Whether to enable residual connections between layers. Defaults to True. |

forward

| Name | Type | Required | Description |
|---|---|---|---|
| x | torch.Tensor | Yes | Input tensor of shape (seq_len, batch_size, input_size). Internally transposed to (batch_size, seq_len, input_size). |
| h | torch.Tensor or None | No | Initial hidden states for all layers. If None, initialized to zeros. |
| c | torch.Tensor or None | No | Initial cell states for all layers. If None, initialized to zeros. |
| n | torch.Tensor or None | No | Initial normalization states for all layers. If None, initialized to zeros. |
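
Because forward expects a time-first tensor, batch-first data has to be rearranged before the call. The snippet below is a small sketch of that conversion using only standard PyTorch operations; the assumption that your upstream tensors are batch-first, and the variable names, are illustrative.

```python
import torch

batch_size, seq_len, input_size = 16, 10, 32

# Assumed upstream layout: (batch_size, seq_len, input_size)
batch_first = torch.randn(batch_size, seq_len, input_size)

# Swap the first two axes to get (seq_len, batch_size, input_size).
# transpose returns a view, so no data is copied at this point.
time_first = batch_first.transpose(0, 1)

# The same call maps a time-first result back to batch-first.
round_trip = time_first.transpose(0, 1)

print(time_first.shape, round_trip.shape)
```

Element [t, b] of the time-first tensor is the same feature vector as element [b, t] of the batch-first one, so only the indexing order changes.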

Outputs

forward

| Name | Type | Description |
|---|---|---|
| output | torch.Tensor | Final output tensor from the last layer, of shape (seq_len, batch_size, hidden_size). |
| (h, c, n) | tuple of torch.Tensor | Final hidden, cell, and normalization states for all layers. Each of shape (num_layers, batch_size, hidden_size). |

init_hidden

| Name | Type | Description |
|---|---|---|
| (h, c, n) | tuple of torch.Tensor | Stacked zero-initialized hidden, cell, and normalization states for all layers. |
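
To make the stacked state layout concrete, here is a hedged sketch of how per-layer zero states could be combined into the documented (num_layers, batch_size, hidden_size) tensors. The helper init_states_sketch is hypothetical, and the per-layer shape of (batch_size, hidden_size) is inferred from the I/O contract rather than taken from the library source.

```python
import torch


def init_states_sketch(num_layers, batch_size, hidden_size, device=None):
    """Hypothetical helper that mirrors the documented output of init_hidden."""
    # One zero tensor per layer, assumed to be (batch_size, hidden_size) each.
    per_layer = [
        torch.zeros(batch_size, hidden_size, device=device)
        for _ in range(num_layers)
    ]
    # torch.stack allocates a new tensor on every call, so h, c, and n
    # do not share memory even though they start from the same zeros.
    h = torch.stack(per_layer)
    c = torch.stack(per_layer)
    n = torch.stack(per_layer)
    return h, c, n


h0, c0, n0 = init_states_sketch(num_layers=3, batch_size=16, hidden_size=64)
print(h0.shape)
```

Indexing the first axis, as in h0[i], then yields the state belonging to layer i.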

Usage Examples

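import torch
from pytorch_forecasting.layers._recurrent._mlstm.layer import mLSTMLayer

# Three stacked mLSTM cells mapping 32 input features to 64 hidden features
layer = mLSTMLayer(
    input_size=32,
    hidden_size=64,
    num_layers=3,
    dropout=0.1,
    layer_norm=True,
    residual_conn=True,
)

# Input is time-first: (seq_len, batch_size, input_size)
seq_len, batch_size = 10, 16
x = torch.randn(seq_len, batch_size, 32)

# h, c, and n are omitted, so the initial states default to zeros
output, (h, c, n) = layer(x)
# output shape: (10, 16, 64)
# h, c, n shapes: (3, 16, 64) each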

Related Pages
