Implementation:Sktime Pytorch forecasting ResidualBlock
| Knowledge Sources | |
|---|---|
| Domains | Time_Series, Forecasting, Deep_Learning |
| Last Updated | 2026-02-08 08:00 GMT |
Overview
A residual block that implements an MLP with one hidden layer, an activation function, a skip connection, and optional layer normalization.
Description
ResidualBlock is an nn.Module that serves as a basic building block of the DSIPTs architecture. It has two paths: a direct path (the skip connection), which is a linear projection without bias, and a main path that applies an activation, a linear transformation, and dropout. The outputs of the two paths are summed to form the residual connection, and LayerNorm can optionally be applied to the summed output.
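The two-path structure described above can be sketched as follows. This is a minimal illustration written from the description, not the library source: the class name SketchResidualBlock and its attribute names are assumptions, the activation is fixed to ReLU, and the main path is assumed to run in the order listed (activation, then linear, then dropout).

```python
import torch
from torch import nn


class SketchResidualBlock(nn.Module):
    """Illustrative stand-in for ResidualBlock (assumed layout, not the library source)."""

    def __init__(self, in_size: int, out_size: int, dropout_rate: float):
        super().__init__()
        # Skip path: bias-free linear projection, so in_size may differ from out_size.
        self.direct_linear = nn.Linear(in_size, out_size, bias=False)
        # Main path: activation -> linear -> dropout (order assumed from the description).
        self.act = nn.ReLU()
        self.lin = nn.Linear(in_size, out_size)
        self.dropout = nn.Dropout(dropout_rate)
        # Normalization over the last (feature) dimension of the summed output.
        self.final_norm = nn.LayerNorm(out_size)

    def forward(self, x: torch.Tensor, apply_final_norm: bool = True) -> torch.Tensor:
        direct = self.direct_linear(x)
        main = self.dropout(self.lin(self.act(x)))
        out = main + direct  # residual sum of the two paths
        return self.final_norm(out) if apply_final_norm else out


torch.manual_seed(0)
# eval() disables dropout so the two calls below are deterministic.
block = SketchResidualBlock(in_size=8, out_size=4, dropout_rate=0.1).eval()
x = torch.randn(2, 5, 8)
y_norm = block(x)                         # with LayerNorm
y_raw = block(x, apply_final_norm=False)  # residual sum only
```

Because the skip path is itself a linear layer, the block does not require in_size and out_size to match; only the last dimension of the input changes.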
The block takes separate input and output dimensions so that it can be used at different stages of the network. The activation function is specified as a string (for example "nn.GELU") that is evaluated via ast.literal_eval; an empty string selects nn.ReLU.
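To make the string-to-activation step concrete, here is one possible way to map such a string to a module instance using attribute lookup on torch.nn. This is a sketch for illustration only: the helper resolve_activation is hypothetical, is not part of pytorch_forecasting, and does not claim to mirror how the library resolves the string.

```python
from torch import nn


def resolve_activation(activation_fun: str = "") -> nn.Module:
    """Hypothetical helper: turn a string such as "nn.GELU" into a module instance."""
    if activation_fun == "":
        return nn.ReLU()  # documented default for an empty string
    # Keep only the final component, so "nn.GELU" and "GELU" both resolve.
    name = activation_fun.split(".")[-1]
    cls = getattr(nn, name, None)
    # This only checks that the name is an nn.Module subclass in torch.nn;
    # it does not verify that the class is actually an activation function.
    if not (isinstance(cls, type) and issubclass(cls, nn.Module)):
        raise ValueError(f"Unknown activation: {activation_fun!r}")
    return cls()


default_act = resolve_activation()
gelu_act = resolve_activation("nn.GELU")
```

An unknown name raises ValueError at construction time instead of failing later inside the forward pass.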
Usage
Use this block as a basic layer within deep forecasting architectures, particularly the DSIPTs model. The skip connection gives gradients a direct path around the nonlinearity, which helps keep training stable when several blocks are stacked in the feed-forward parts of transformer or MLP-based forecasters.
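Stacking can be sketched as below. To keep the snippet runnable with only PyTorch installed, it uses a compact stand-in class (StandInBlock, an assumption based on the description above) instead of importing ResidualBlock; the layer widths in sizes are likewise hypothetical.

```python
import torch
from torch import nn


class StandInBlock(nn.Module):
    """Compact stand-in for ResidualBlock (assumed structure, not the library source)."""

    def __init__(self, in_size: int, out_size: int, dropout_rate: float):
        super().__init__()
        self.direct_linear = nn.Linear(in_size, out_size, bias=False)
        self.main = nn.Sequential(
            nn.ReLU(), nn.Linear(in_size, out_size), nn.Dropout(dropout_rate)
        )
        self.final_norm = nn.LayerNorm(out_size)

    def forward(self, x: torch.Tensor, apply_final_norm: bool = True) -> torch.Tensor:
        out = self.main(x) + self.direct_linear(x)
        return self.final_norm(out) if apply_final_norm else out


# Hypothetical widths: project 64 -> 32, refine at 32, then reduce to 16.
sizes = [64, 32, 32, 16]
stack = nn.Sequential(
    *[StandInBlock(i, o, dropout_rate=0.1) for i, o in zip(sizes[:-1], sizes[1:])]
)

torch.manual_seed(0)
x = torch.randn(4, 24, 64, requires_grad=True)
y = stack(x)

# Use a randomly weighted loss: a plain sum over LayerNorm output is constant
# along the feature dimension and would give near-zero gradients.
loss = (y * torch.randn_like(y)).sum()
loss.backward()
```

Each block's out_size must equal the next block's in_size; nn.Sequential calls forward with the default apply_final_norm=True, so every block in the stack normalizes its output.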
Code Reference
Source Location
- Repository: Sktime_Pytorch_forecasting
- File: pytorch_forecasting/layers/_blocks/_residual_block_dsipts.py
- Lines: 1-50
Signature
class ResidualBlock(nn.Module):
    def __init__(
        self, in_size: int, out_size: int, dropout_rate: float, activation_fun: str = ""
    )
    def forward(self, x, apply_final_norm=True)
Import
from pytorch_forecasting.layers._blocks._residual_block_dsipts import ResidualBlock
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| in_size | int | Yes | Input feature dimension |
| out_size | int | Yes | Output feature dimension |
| dropout_rate | float | Yes | Dropout probability applied after the linear transformation |
| activation_fun | str | No | Activation function given as a string to be evaluated (e.g., "nn.GELU"). Defaults to "", which selects nn.ReLU |
| x | torch.Tensor | Yes | Input tensor for the forward pass, of shape (..., in_size) |
| apply_final_norm | bool | No | Whether to apply LayerNorm to the output (default: True) |
Outputs
| Name | Type | Description |
|---|---|---|
| output | torch.Tensor | Output tensor of shape (..., out_size) after residual connection and optional layer normalization |
Usage Examples
import torch
from pytorch_forecasting.layers._blocks._residual_block_dsipts import ResidualBlock
# Create a residual block with default ReLU activation
block = ResidualBlock(in_size=256, out_size=256, dropout_rate=0.1)
x = torch.randn(32, 96, 256)
output = block(x)
# output shape: (32, 96, 256)
# Create a residual block with GELU activation and dimension change
block_gelu = ResidualBlock(
    in_size=512, out_size=256, dropout_rate=0.2, activation_fun="nn.GELU"
)
x = torch.randn(32, 96, 512)
output = block_gelu(x)
# output shape: (32, 96, 256)
# Forward without final layer normalization
output_no_norm = block(torch.randn(32, 96, 256), apply_final_norm=False)