
Implementation:Sktime Pytorch forecasting ResidualBlock

From Leeroopedia


Knowledge Sources
Domains: Time_Series, Forecasting, Deep_Learning
Last Updated: 2026-02-08 08:00 GMT

Overview

A residual connection block implementing an MLP with one hidden layer, activation function, skip connection, and optional layer normalization.

Description

ResidualBlock is an nn.Module that serves as a basic building block in the DSIPTs architecture. It consists of a direct linear path (skip connection without bias) and a main path with activation, linear transformation, and dropout. The outputs of both paths are summed to form the residual connection, and an optional LayerNorm is applied to the final output.
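The computation described above can be written compactly as follows. This is a sketch that assumes the main-path operations are applied in the order listed (activation, then linear, then dropout); the symbols W, b, W_d, and σ are notation introduced here for illustration and do not come from the source code.

```latex
\begin{aligned}
h &= \mathrm{Dropout}\bigl(W\,\sigma(x) + b\bigr) && \text{main path} \\
d &= W_{d}\,x && \text{direct path, no bias} \\
y &= \begin{cases}
\mathrm{LayerNorm}(h + d) & \text{if apply\_final\_norm} \\
h + d & \text{otherwise}
\end{cases}
\end{aligned}
```

Here W and W_d both map from in_size to out_size features, which is why the two paths can be summed even when the input and output dimensions differ.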

The input and output dimensions are configured independently, so the block can also change the feature width between stages of a network. The activation function is specified as a string (for example "nn.GELU") that is evaluated via ast.literal_eval; an empty string, which is the default, selects nn.ReLU.
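As an illustration of turning such a string into an activation module, the sketch below looks the name up on torch.nn instead of evaluating the string as code. The helper resolve_activation is hypothetical and is not the library's implementation; it assumes only the "nn.<Name>" string convention and the empty-string default described above.

```python
from torch import nn


def resolve_activation(activation_fun: str = "") -> nn.Module:
    """Hypothetical helper: map a string such as "nn.GELU" to a module instance."""
    if activation_fun == "":
        return nn.ReLU()  # documented default for an empty string
    # Accept both "nn.GELU" and "GELU" by stripping the optional "nn." prefix.
    name = activation_fun.removeprefix("nn.")
    cls = getattr(nn, name, None)
    # Reject anything that is not an nn.Module subclass (e.g. submodules, typos).
    if not (isinstance(cls, type) and issubclass(cls, nn.Module)):
        raise ValueError(f"unknown activation: {activation_fun!r}")
    return cls()


default_act = resolve_activation()        # nn.ReLU instance
gelu_act = resolve_activation("nn.GELU")  # nn.GELU instance
```

A lookup of this kind limits the accepted strings to classes that actually exist in torch.nn, which may be preferable when the activation name comes from a configuration file.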

Usage

Use this block as a fundamental layer within deep forecasting architectures, particularly the DSIPTs model. It provides stable gradient flow through skip connections and is suitable for stacking multiple layers in feed-forward networks within transformer or MLP-based forecasters.
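A minimal sketch of stacking several blocks is shown below. To keep it runnable without the library installed, it uses ResidualBlockSketch, a stand-in class written from the description on this page (it is an assumption, not the library source, and it fixes the activation to ReLU). The layer sizes and the random regression target are arbitrary illustration values.

```python
import torch
from torch import nn


class ResidualBlockSketch(nn.Module):
    """Stand-in written from the description on this page, not the library class."""

    def __init__(self, in_size: int, out_size: int, dropout_rate: float):
        super().__init__()
        self.direct_linear = nn.Linear(in_size, out_size, bias=False)  # skip path
        self.act = nn.ReLU()
        self.lin = nn.Linear(in_size, out_size)
        self.dropout = nn.Dropout(dropout_rate)
        self.final_norm = nn.LayerNorm(out_size)

    def forward(self, x, apply_final_norm=True):
        # Main path (activation -> linear -> dropout) plus the bias-free direct path.
        out = self.dropout(self.lin(self.act(x))) + self.direct_linear(x)
        return self.final_norm(out) if apply_final_norm else out


# Stack three blocks, narrowing the feature width from 512 to 128.
stack = nn.Sequential(
    ResidualBlockSketch(512, 256, dropout_rate=0.1),
    ResidualBlockSketch(256, 256, dropout_rate=0.1),
    ResidualBlockSketch(256, 128, dropout_rate=0.1),
)

x = torch.randn(4, 12, 512, requires_grad=True)
y = stack(x)  # nn.Sequential calls each block with the default apply_final_norm=True

# Backpropagate a simple loss against a random target to populate x.grad.
target = torch.randn(4, 12, 128)
loss = ((y - target) ** 2).mean()
loss.backward()
```

Because nn.Sequential passes only the tensor to each block, every block in the stack applies its final LayerNorm; skipping the normalization for a particular block would require calling it directly with apply_final_norm=False.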

Code Reference

Source Location

Signature

class ResidualBlock(nn.Module):
    def __init__(
        self, in_size: int, out_size: int, dropout_rate: float, activation_fun: str = ""
    ): ...
    def forward(self, x, apply_final_norm=True): ...

Import

from pytorch_forecasting.layers._blocks._residual_block_dsipts import ResidualBlock

I/O Contract

Inputs

Name | Type | Required | Description
in_size | int | Yes | Constructor argument. Input feature dimension
out_size | int | Yes | Constructor argument. Output feature dimension
dropout_rate | float | Yes | Constructor argument. Dropout probability applied after the linear transformation
activation_fun | str | No | Constructor argument. Activation function as a string to be evaluated (e.g., "nn.GELU"). Defaults to "", which uses nn.ReLU
x | torch.Tensor | Yes | forward argument. Input tensor whose last dimension is in_size
apply_final_norm | bool | No | forward argument. Whether to apply LayerNorm to the output (default: True)

Outputs

Name | Type | Description
output | torch.Tensor | Output tensor of shape (..., out_size) after residual connection and optional layer normalization

Usage Examples

import torch
from pytorch_forecasting.layers._blocks._residual_block_dsipts import ResidualBlock

# Create a residual block with default ReLU activation
block = ResidualBlock(in_size=256, out_size=256, dropout_rate=0.1)

x = torch.randn(32, 96, 256)
output = block(x)
# output shape: (32, 96, 256)

# Create a residual block with GELU activation and dimension change
block_gelu = ResidualBlock(
    in_size=512, out_size=256, dropout_rate=0.2, activation_fun="nn.GELU"
)

x = torch.randn(32, 96, 512)
output = block_gelu(x)
# output shape: (32, 96, 256)

# Forward without final layer normalization
output_no_norm = block(torch.randn(32, 96, 256), apply_final_norm=False)
