Implementation:Sktime Pytorch forecasting TslibDataModule

Knowledge Sources	Sktime_Pytorch_forecasting
Domains	Time_Series, Forecasting, Deep_Learning
Last Updated	2026-02-08 08:00 GMT

Overview

TslibDataModule is an experimental Lightning DataModule that bridges the v2 TimeSeries dataset to tslib-style deep learning models through sliding-window batching.

Description

TslibDataModule extends LightningDataModule and serves as the D2 (data processing) layer for tslib-derived models such as Informer, AutoFormer, and TimeXer. It takes a TimeSeries dataset (D1 layer) and produces train, validation, test, and prediction DataLoaders with sliding-window batches. The module computes metadata describing feature names, indices, types (categorical/continuous), known/unknown status, and forecast horizons. An internal _TslibDataset class handles individual window retrieval, splitting features into history/future and continuous/categorical components, applying known-feature masking for future windows, and producing (x, y) tuples for the model. The module supports configurable context/prediction lengths, window stride, target normalization, and train/val/test splitting.
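The sliding-window batching described above can be illustrated with a minimal, dependency-free sketch. The helper name `make_windows` is hypothetical, and the meaning assigned to the four integers in each tuple (series index, start offset, context length, prediction length) is an assumption suggested by the `windows: list[tuple[int, int, int, int]]` parameter, not a confirmed description of the library's internals.

```python
def make_windows(series_lengths, context_length, prediction_length, window_stride=1):
    """Enumerate sliding windows as (series_idx, start, context_length, prediction_length).

    Hypothetical helper: the tuple layout is an assumption, not the library's confirmed format.
    """
    window_size = context_length + prediction_length
    windows = []
    for series_idx, length in enumerate(series_lengths):
        # A series shorter than one full window contributes no samples.
        if length < window_size:
            continue
        last_start = length - window_size
        # Step through valid start offsets using the configured stride.
        for start in range(0, last_start + 1, window_stride):
            windows.append((series_idx, start, context_length, prediction_length))
    return windows


# Two series of lengths 10 and 3; only the first is long enough for a 4 + 2 window.
windows = make_windows([10, 3], context_length=4, prediction_length=2, window_stride=2)
```

A larger `window_stride` reduces overlap between consecutive samples, which lowers the number of training windows per series.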

Usage

Use TslibDataModule when training v2 pytorch-forecasting models (those extending TslibBaseModel) with the new TimeSeries data pipeline. Pass a TimeSeries dataset instance along with context_length and prediction_length to create a fully configured data module.

Code Reference

Source Location

Repository: Sktime_Pytorch_forecasting
File: pytorch_forecasting/data/_tslib_data_module.py
Lines: 1-892

Signature: _TslibDataset

class _TslibDataset(Dataset):
    def __init__(
        self,
        dataset: TimeSeries,
        data_module: "TslibDataModule",
        windows: list[tuple[int, int, int, int]],
        add_relative_time_idx: bool = False,
    ):

Signature: TslibDataModule

class TslibDataModule(LightningDataModule):
    def __init__(
        self,
        time_series_dataset: TimeSeries,
        context_length: int,
        prediction_length: int,
        freq: str = "h",
        add_relative_time_idx: bool = False,
        add_target_scales: bool = False,
        target_normalizer: NORMALIZER
        | str
        | list[NORMALIZER]
        | tuple[NORMALIZER]
        | None = "auto",
        scalers: dict[
            str, StandardScaler | RobustScaler | TorchNormalizer | EncoderNormalizer
        ]
        | None = None,
        shuffle: bool = True,
        window_stride: int = 1,
        batch_size: int = 32,
        num_workers: int = 0,
        train_val_test_split: tuple[float, float, float] = (0.7, 0.15, 0.15),
        collate_fn: Callable | None = None,
        **kwargs,
    ) -> None:

Signature: setup

def setup(self, stage: str | None = None) -> None:

Import

from pytorch_forecasting.data._tslib_data_module import TslibDataModule

I/O Contract

Constructor Inputs

Name	Type	Required	Description
time_series_dataset	TimeSeries	Yes	The v2 TimeSeries dataset (D1 layer)
context_length	int	Yes	Number of historical time steps used as model input
prediction_length	int	Yes	Number of future time steps to predict
freq	str	No	Frequency of the time series data (default 'h')
add_relative_time_idx	bool	No	Whether to add relative time indices (default False)
add_target_scales	bool	No	Whether to add target scaling info (default False)
target_normalizer	NORMALIZER, str, list[NORMALIZER], tuple[NORMALIZER], or None	No	Target normalizer; 'auto' uses RobustScaler (default 'auto')
scalers	dict or None	No	Dictionary of feature scalers (default None)
shuffle	bool	No	Whether to shuffle training data (default True)
window_stride	int	No	Stride for the sliding window (default 1)
batch_size	int	No	Batch size for dataloaders (default 32)
num_workers	int	No	Number of dataloader workers (default 0)
train_val_test_split	tuple[float, float, float]	No	Proportions for train/val/test splits (default (0.7, 0.15, 0.15))
collate_fn	Callable or None	No	Custom collate function for the dataloader (default None)
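To make the effect of `train_val_test_split` concrete, the following sketch partitions series indices by the given proportions. It is an illustration under stated assumptions: that the split operates on whole series rather than individual windows, that the order is shuffled first, and that sizes are truncated with `int()` so the test split absorbs any remainder. The function name `split_series_indices` and the `seed` parameter are hypothetical.

```python
import random


def split_series_indices(n_series, fractions=(0.7, 0.15, 0.15), seed=0):
    """Shuffle series indices and cut them into train/val/test lists.

    Hypothetical helper; the real module may split differently.
    """
    order = list(range(n_series))
    random.Random(seed).shuffle(order)

    # Truncate the train and validation sizes; the test split takes what is left.
    train_size = int(fractions[0] * n_series)
    val_size = int(fractions[1] * n_series)

    train = order[:train_size]
    val = order[train_size:train_size + val_size]
    test = order[train_size + val_size:]
    return train, val, test


train, val, test = split_series_indices(8, fractions=(0.5, 0.25, 0.25))
```

Under these assumptions, a dataset with very few series can end up with an empty validation split, so it is worth checking the split sizes before training.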

Batch Output (x dict)

Name	Type	Description
history_cont	torch.Tensor	Continuous features for the encoder, shape (batch, context_length, n_cont)
history_cat	torch.Tensor	Categorical features for the encoder, shape (batch, context_length, n_cat)
future_cont	torch.Tensor	Known continuous features for decoder, shape (batch, prediction_length, n_known_cont)
future_cat	torch.Tensor	Known categorical features for decoder, shape (batch, prediction_length, n_known_cat)
history_target	torch.Tensor	Historical target values, shape (batch, context_length, n_targets)
future_target	torch.Tensor	Future target values, shape (batch, prediction_length, n_targets)
history_mask	torch.Tensor	Boolean mask for valid encoder time points
future_mask	torch.Tensor	Boolean mask for valid decoder time points
groups	torch.Tensor	Group identifiers
history_time_idx	torch.Tensor	Time indices for encoder
future_time_idx	torch.Tensor	Time indices for decoder
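The shapes listed in the table can be turned into a quick sanity check. The sketch below only encodes the six entries whose shapes are documented above; both helper names are hypothetical, and the feature counts in the example are made-up values chosen for illustration.

```python
def expected_batch_shapes(batch_size, context_length, prediction_length,
                          n_cont, n_cat, n_known_cont, n_known_cat, n_targets):
    """Return the documented shape for each 3-D entry of the x dict."""
    return {
        "history_cont": (batch_size, context_length, n_cont),
        "history_cat": (batch_size, context_length, n_cat),
        "future_cont": (batch_size, prediction_length, n_known_cont),
        "future_cat": (batch_size, prediction_length, n_known_cat),
        "history_target": (batch_size, context_length, n_targets),
        "future_target": (batch_size, prediction_length, n_targets),
    }


def find_shape_mismatches(actual, expected):
    """List the keys whose actual shape is missing or differs from the expected one."""
    return [key for key, shape in expected.items() if actual.get(key) != shape]


expected = expected_batch_shapes(
    batch_size=64, context_length=96, prediction_length=24,
    n_cont=5, n_cat=2, n_known_cont=3, n_known_cat=1, n_targets=1,
)

# With a real batch, `actual` could be built as {k: tuple(v.shape) for k, v in x.items()}.
actual = dict(expected)
actual["future_cont"] = (64, 24, 5)  # deliberately wrong: all 5 features instead of the 3 known ones
mismatches = find_shape_mismatches(actual, expected)
```

The deliberate mismatch mirrors the most likely mistake when consuming these batches: the decoder inputs carry only known features, so their last dimension is generally smaller than that of the encoder inputs.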

metadata Property Output

Name	Type	Description
feature_names	dict[str, list[str]]	Feature names grouped by type (categorical, continuous, static, known, unknown, target, all)
feature_indices	dict[str, list[int]]	Feature indices grouped by type
n_features	dict[str, int]	Feature counts by type
context_length	int	Context window length
prediction_length	int	Prediction horizon length
freq	str	Time series frequency
features	str	Feature mode (S, MS, or M)
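The `features` value follows the naming used by tslib-style models, where S means univariate input and output, MS means multivariate input with a single target, and M means multivariate input and output. The sketch below shows one plausible way to derive the mode from feature counts. It is an assumption based on that convention, not the module's confirmed logic, and `infer_feature_mode` is a hypothetical name.

```python
def infer_feature_mode(n_targets: int, n_covariates: int) -> str:
    """Map feature counts to a tslib-style mode string (assumed rule, not the library's)."""
    if n_targets < 1:
        raise ValueError("at least one target is required")
    if n_targets > 1:
        return "M"   # several targets: multivariate predicts multivariate
    if n_covariates > 0:
        return "MS"  # one target plus covariates: multivariate predicts univariate
    return "S"       # one target and nothing else: univariate predicts univariate


modes = [
    infer_feature_mode(n_targets=1, n_covariates=0),
    infer_feature_mode(n_targets=1, n_covariates=4),
    infer_feature_mode(n_targets=3, n_covariates=4),
]
```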

Usage Examples

from pytorch_forecasting.data._tslib_data_module import TslibDataModule
from pytorch_forecasting.data.timeseries import TimeSeries

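# Create TimeSeries dataset (D1 layer).
# df is assumed to be an existing pandas DataFrame with "time_idx", "value", and "series_id" columns.
ts = TimeSeries(data=df, time="time_idx", target="value", group=["series_id"])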

# Create TslibDataModule (D2 layer)
dm = TslibDataModule(
    time_series_dataset=ts,
    context_length=96,
    prediction_length=24,
    batch_size=64,
    num_workers=4,
    train_val_test_split=(0.7, 0.15, 0.15),
)

dm.setup(stage="fit")
train_loader = dm.train_dataloader()
val_loader = dm.val_dataloader()

# Access metadata for model initialization
metadata = dm.metadata

Related Pages

Principle:Sktime_Pytorch_forecasting_V2_Data_Pipeline
