Implementation:Sktime Pytorch forecasting ResidualBlock

Knowledge Sources	Sktime_Pytorch_forecasting
Domains	Time_Series, Forecasting, Deep_Learning
Last Updated	2026-02-08 08:00 GMT

Overview

A residual block that combines a small MLP path (activation, linear layer, and dropout) with a linear skip connection and optional layer normalization.

Description

ResidualBlock is an nn.Module that serves as a basic building block in the DSIPTs architecture. It has two paths that both read the same input: a direct path, which is a bias-free linear projection acting as the skip connection, and a main path that applies an activation, a linear transformation, and dropout. The two outputs are summed to form the residual connection, and LayerNorm is optionally applied to the sum.
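Written as a formula, the computation described above could read as follows. This is a sketch that assumes the main path applies its operations in the order listed (activation, then linear layer, then dropout) and that the main linear layer keeps a bias term. The symbols $W_d$, $W$, $b$, $\sigma$ and $h$ are notation introduced here for illustration, with $x \in \mathbb{R}^{\text{in\_size}}$ and $W, W_d \in \mathbb{R}^{\text{out\_size} \times \text{in\_size}}$.

```latex
\begin{aligned}
h &= \mathrm{Dropout}\bigl(W\,\sigma(x) + b\bigr) + W_d\,x, \\
\mathrm{out} &=
\begin{cases}
\mathrm{LayerNorm}(h) & \text{if } \texttt{apply\_final\_norm} \text{ is true}, \\
h & \text{otherwise}.
\end{cases}
\end{aligned}
```

Because the skip term $W_d\,x$ is itself a projection, the sum is well defined even when in_size and out_size differ.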

The input and output dimensions are configurable, so the block can be placed at stages of the network that use different feature widths. The activation function is specified as a string (for example "nn.GELU") that is evaluated via ast.literal_eval; an empty string selects nn.ReLU.
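As an illustration of how a string such as "nn.GELU" could be turned into an activation module, the sketch below looks the name up on torch.nn with getattr rather than evaluating the string. The helper name resolve_activation is hypothetical, and the lookup strategy is an assumption made for this example; it is not necessarily how ResidualBlock resolves the string internally.

```python
import torch.nn as nn


def resolve_activation(activation_fun: str = "") -> nn.Module:
    """Hypothetical helper: map a string like "nn.GELU" to an activation instance."""
    if activation_fun == "":
        # An empty string falls back to ReLU, matching the documented default.
        return nn.ReLU()
    # Accept both "nn.GELU" and "GELU" by dropping an optional "nn." prefix.
    name = activation_fun[3:] if activation_fun.startswith("nn.") else activation_fun
    activation_cls = getattr(nn, name, None)
    # Any nn.Module subclass found on torch.nn passes this check; the sketch
    # does not verify that the class is actually an activation function.
    if not (isinstance(activation_cls, type) and issubclass(activation_cls, nn.Module)):
        raise ValueError(f"Unknown activation: {activation_fun!r}")
    return activation_cls()


act_default = resolve_activation()         # ReLU instance
act_gelu = resolve_activation("nn.GELU")   # GELU instance
```

A name lookup of this kind only accepts names that exist on torch.nn, which keeps arbitrary strings from being executed.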

Usage

Use this block as a fundamental layer within deep forecasting architectures, particularly the DSIPTs model. The skip connection gives gradients a direct path around the nonlinearity, which helps keep training stable when several blocks are stacked in the feed-forward parts of transformer or MLP-based forecasters.
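Stacking can be sketched as follows. To keep the example self-contained, it uses a hypothetical stand-in class, ResidualBlockSketch, written from the description on this page rather than copied from the library; the attribute names, the fixed ReLU activation, and the layer sizes are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class ResidualBlockSketch(nn.Module):
    """Hypothetical stand-in that mirrors the block as described on this page."""

    def __init__(self, in_size: int, out_size: int, dropout_rate: float):
        super().__init__()
        # Skip path: bias-free projection, so in_size may differ from out_size.
        self.direct_linear = nn.Linear(in_size, out_size, bias=False)
        # Main path: activation -> linear -> dropout (ReLU assumed, the stated default).
        self.act = nn.ReLU()
        self.lin = nn.Linear(in_size, out_size)
        self.dropout = nn.Dropout(dropout_rate)
        self.final_norm = nn.LayerNorm(out_size)

    def forward(self, x: torch.Tensor, apply_final_norm: bool = True) -> torch.Tensor:
        out = self.dropout(self.lin(self.act(x))) + self.direct_linear(x)
        return self.final_norm(out) if apply_final_norm else out


# Stack three blocks: 64 -> 128 -> 128 -> 32 features.
stack = nn.Sequential(
    ResidualBlockSketch(64, 128, dropout_rate=0.1),
    ResidualBlockSketch(128, 128, dropout_rate=0.1),
    ResidualBlockSketch(128, 32, dropout_rate=0.1),
)
stack.eval()  # disable dropout so the forward pass is deterministic

x = torch.randn(8, 24, 64)  # (batch, time steps, features)
with torch.no_grad():
    y = stack(x)  # only the last dimension changes: (8, 24, 32)
```

Since nn.Sequential calls each block with its default arguments, every block in this stack applies its final LayerNorm.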

Code Reference

Source Location

Repository: Sktime_Pytorch_forecasting
File: pytorch_forecasting/layers/_blocks/_residual_block_dsipts.py
Lines: 1-50

Signature

class ResidualBlock(nn.Module):
    def __init__(
        self, in_size: int, out_size: int, dropout_rate: float, activation_fun: str = ""
    ): ...
    def forward(self, x, apply_final_norm=True): ...

Import

from pytorch_forecasting.layers._blocks._residual_block_dsipts import ResidualBlock

I/O Contract

Inputs

Name	Type	Required	Description
in_size	int	Yes	Input feature dimension
out_size	int	Yes	Output feature dimension
dropout_rate	float	Yes	Dropout probability applied after the linear transformation
activation_fun	str	No	Activation function as a string to be evaluated (e.g., "nn.GELU"). Defaults to "" which uses nn.ReLU
x	torch.Tensor	Yes	Input tensor of shape (..., in_size) passed to forward
apply_final_norm	bool	No	Whether to apply LayerNorm to the output (default: True)

Outputs

Name	Type	Description
output	torch.Tensor	Output tensor of shape (..., out_size) after residual connection and optional layer normalization

Usage Examples

import torch
from pytorch_forecasting.layers._blocks._residual_block_dsipts import ResidualBlock

# Create a residual block with default ReLU activation
block = ResidualBlock(in_size=256, out_size=256, dropout_rate=0.1)

x = torch.randn(32, 96, 256)
output = block(x)
# output shape: (32, 96, 256)

# Create a residual block with GELU activation and dimension change
block_gelu = ResidualBlock(
    in_size=512, out_size=256, dropout_rate=0.2, activation_fun="nn.GELU"
)

x = torch.randn(32, 96, 512)
output = block_gelu(x)
# output shape: (32, 96, 256)

# Forward without final layer normalization
output_no_norm = block(torch.randn(32, 96, 256), apply_final_norm=False)

Related Pages

Principle:Sktime_Pytorch_forecasting_Residual_Connection
