Overview
mLSTMLayer stacks multiple mLSTM cells to form a deep recurrent layer with support for residual connections, layer normalization, and dropout.
Description
mLSTMLayer extends nn.Module and wraps multiple mLSTMCell instances into a multi-layer recurrent architecture. It processes the input sequence one time step at a time, passing each step through every stacked cell in order, so that the output of one cell becomes the input of the next. When residual_conn is enabled, residual connections are applied between layers, skipping the first layer (whose input has input_size features rather than hidden_size); this helps gradient flow in deeper configurations. The layer manages the hidden, cell, and normalized states of all stacked cells.
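The stacking loop described above can be sketched in plain Python. This is only an illustration of the control flow, not the library's implementation: `toy_cell` and `stacked_forward` are hypothetical names, the update rule inside `toy_cell` is made up, and the residual rule (add a layer's input to its output for every layer except the first) is an assumption based on the description above.

```python
# Illustrative control-flow sketch only; not the pytorch_forecasting source.
# `toy_cell` is a hypothetical stand-in for mLSTMCell and works on plain lists.

def toy_cell(x, h, c, n):
    """Return new (h, c, n) vectors; the update rule here is invented."""
    mean_x = sum(x) / len(x)
    new_c = [0.5 * c_j + mean_x for c_j in c]
    new_n = [0.5 * n_j + 1.0 for n_j in n]
    new_h = [c_j / n_j for c_j, n_j in zip(new_c, new_n)]
    return new_h, new_c, new_n


def stacked_forward(sequence, hidden_size, num_layers, residual_conn=True):
    """Run every time step through all layers, carrying per-layer state."""
    # One zero-initialized state vector per layer.
    h = [[0.0] * hidden_size for _ in range(num_layers)]
    c = [[0.0] * hidden_size for _ in range(num_layers)]
    n = [[0.0] * hidden_size for _ in range(num_layers)]
    outputs = []
    for x_t in sequence:                      # one time step at a time
        layer_input = x_t
        for i in range(num_layers):           # bottom layer to top layer
            h_i, c_i, n_i = toy_cell(layer_input, h[i], c[i], n[i])
            # The first layer is skipped: its input width may differ
            # from hidden_size, so the two vectors cannot be added.
            if residual_conn and i > 0:
                h_i = [a + b for a, b in zip(h_i, layer_input)]
            h[i], c[i], n[i] = h_i, c_i, n_i
            layer_input = h_i                 # feeds the next layer
        outputs.append(layer_input)           # top-layer output for this step
    return outputs, (h, c, n)


sequence = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # seq_len=2, input_size=3
outputs, (h, c, n) = stacked_forward(sequence, hidden_size=4, num_layers=3)
print(len(outputs), len(outputs[0]), len(h))
```

The sketch keeps one state triple per layer and overwrites it at every time step, which is why the returned states have a leading dimension of num_layers while the output has a leading dimension of seq_len.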
Usage
Use mLSTMLayer when you need a multi-layer mLSTM recurrent block as part of a forecasting or sequence modeling network. It is the intermediate building block between individual mLSTMCell instances and the complete mLSTMNetwork.
Code Reference
Source Location
Signature
class mLSTMLayer(nn.Module):
    def __init__(
        self,
        input_size,
        hidden_size,
        num_layers,
        dropout=0.2,
        layer_norm=True,
        residual_conn=True,
    ):
        ...

    def forward(self, x, h=None, c=None, n=None):
        ...

    def init_hidden(self, batch_size, device=None):
        ...
Import
from pytorch_forecasting.layers._recurrent._mlstm.layer import mLSTMLayer
I/O Contract
Inputs
__init__
| Name | Type | Required | Description |
|---|---|---|---|
| input_size | int | Yes | The number of features in the input. |
| hidden_size | int | Yes | The number of features in the hidden state. |
| num_layers | int | Yes | The number of mLSTM layers to stack. |
| dropout | float | No | Dropout probability applied to inputs and intermediate layers. Defaults to 0.2. |
| layer_norm | bool | No | Whether to use layer normalization in each mLSTM cell. Defaults to True. |
| residual_conn | bool | No | Whether to enable residual connections between layers. Defaults to True. |
forward
| Name | Type | Required | Description |
|---|---|---|---|
| x | torch.Tensor | Yes | Input tensor of shape (seq_len, batch_size, input_size). Internally transposed to (batch_size, seq_len, input_size). |
| h | torch.Tensor or None | No | Initial hidden states for all layers. If None, initialized to zeros. |
| c | torch.Tensor or None | No | Initial cell states for all layers. If None, initialized to zeros. |
| n | torch.Tensor or None | No | Initial normalized states for all layers. If None, initialized to zeros. |
Outputs
forward
| Name | Type | Description |
|---|---|---|
| output | torch.Tensor | Final output tensor from the last layer, of shape (seq_len, batch_size, hidden_size). |
| (h, c, n) | tuple of torch.Tensor | Final hidden, cell, and normalized states for all layers. Each of shape (num_layers, batch_size, hidden_size). |
init_hidden
| Name | Type | Description |
|---|---|---|
| (h, c, n) | tuple of torch.Tensor | Stacked zero-initialized hidden, cell, and normalized states for all layers. |
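The shapes listed in the tables above can be collected in a small helper, which is handy for writing shape assertions in tests. This is a sketch only: `mlstm_layer_shapes` is a hypothetical function, not part of pytorch_forecasting, and it simply restates the documented shapes without running the layer.

```python
def mlstm_layer_shapes(seq_len, batch_size, input_size, hidden_size, num_layers):
    """Return the tensor shapes documented in the I/O contract.

    Hypothetical helper for illustration; it does not call the library.
    """
    state_shape = (num_layers, batch_size, hidden_size)
    return {
        "x": (seq_len, batch_size, input_size),        # forward input
        "output": (seq_len, batch_size, hidden_size),  # last-layer output
        "h": state_shape,                              # hidden states
        "c": state_shape,                              # cell states
        "n": state_shape,                              # normalized states
    }


shapes = mlstm_layer_shapes(
    seq_len=10, batch_size=16, input_size=32, hidden_size=64, num_layers=3
)
for name, shape in shapes.items():
    print(name, shape)
```

With a real layer, the same dictionary could be compared against `tuple(tensor.shape)` for each tensor returned by forward.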
Usage Examples
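import torch
from pytorch_forecasting.layers._recurrent._mlstm.layer import mLSTMLayer

# Three stacked mLSTM cells mapping 32 input features to 64 hidden features.
layer = mLSTMLayer(
    input_size=32,
    hidden_size=64,
    num_layers=3,
    dropout=0.1,
    layer_norm=True,
    residual_conn=True,
)

# Input is sequence-first: (seq_len, batch_size, input_size).
seq_len, batch_size = 10, 16
x = torch.randn(seq_len, batch_size, 32)

# h, c, and n are omitted, so they are initialized to zeros.
output, (h, c, n) = layer(x)
# output shape: (10, 16, 64)
# h shape: (3, 16, 64)
# c and n have the same shape as h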