Implementation:Sktime Pytorch forecasting TslibDataModule

From Leeroopedia


Knowledge Sources
Domains: Time_Series, Forecasting, Deep_Learning
Last Updated: 2026-02-08 08:00 GMT

Overview

TslibDataModule is an experimental Lightning DataModule that bridges the v2 TimeSeries dataset to tslib-style deep learning models through sliding-window batching.

Description

TslibDataModule extends LightningDataModule and serves as the D2 (data processing) layer for tslib-derived models such as Informer, Autoformer, and TimeXer. It takes a TimeSeries dataset (the D1 layer) and produces train, validation, test, and prediction DataLoaders that yield sliding-window batches. It also computes metadata describing feature names, feature indices, feature types (categorical or continuous), known/unknown status, and forecast horizons.

An internal _TslibDataset class retrieves individual windows: it splits features into history/future and continuous/categorical components, applies known-feature masking to the future window, and returns (x, y) tuples for the model. The module supports configurable context and prediction lengths, window stride, target normalization, and train/val/test split proportions.
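The sliding-window idea described above can be sketched in plain Python. This is an illustration only, not the library's code: the helper name `make_windows` is hypothetical, and the tuple layout `(series_idx, start, context_length, prediction_length)` is an assumption about what the four integers in the `windows` argument of `_TslibDataset` represent.

```python
def make_windows(series_lengths, context_length, prediction_length, stride=1):
    """Enumerate sliding windows over several series (hypothetical helper).

    Each window is assumed to be described by the tuple
    (series_idx, start, context_length, prediction_length).
    """
    window_size = context_length + prediction_length
    windows = []
    for series_idx, length in enumerate(series_lengths):
        # The last valid start must leave room for a full context + prediction span.
        # Series shorter than one window produce an empty range and are skipped.
        for start in range(0, length - window_size + 1, stride):
            windows.append((series_idx, start, context_length, prediction_length))
    return windows


# Two series of lengths 10 and 6, windows of 4 + 2 steps, stride 2.
windows = make_windows([10, 6], context_length=4, prediction_length=2, stride=2)
```

With a stride of 1 every possible window is produced; a larger stride trades training samples for less overlap between consecutive windows.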

Usage

Use TslibDataModule when training v2 pytorch-forecasting models (those extending TslibBaseModel) with the new TimeSeries data pipeline. Pass a TimeSeries dataset instance along with context_length and prediction_length to create a fully configured data module.

Code Reference

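Source Location

pytorch_forecasting/data/_tslib_data_module.py (module path inferred from the import statement in this section)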

Signature: _TslibDataset

class _TslibDataset(Dataset):
    def __init__(
        self,
        dataset: TimeSeries,
        data_module: "TslibDataModule",
        windows: list[tuple[int, int, int, int]],
        add_relative_time_idx: bool = False,
    ):

Signature: TslibDataModule

class TslibDataModule(LightningDataModule):
    def __init__(
        self,
        time_series_dataset: TimeSeries,
        context_length: int,
        prediction_length: int,
        freq: str = "h",
        add_relative_time_idx: bool = False,
        add_target_scales: bool = False,
        target_normalizer: NORMALIZER
        | str
        | list[NORMALIZER]
        | tuple[NORMALIZER]
        | None = "auto",
        scalers: dict[
            str, StandardScaler | RobustScaler | TorchNormalizer | EncoderNormalizer
        ]
        | None = None,
        shuffle: bool = True,
        window_stride: int = 1,
        batch_size: int = 32,
        num_workers: int = 0,
        train_val_test_split: tuple[float, float, float] = (0.7, 0.15, 0.15),
        collate_fn: Callable | None = None,
        **kwargs,
    ) -> None:

Signature: TslibDataModule.setup

def setup(self, stage: str | None = None) -> None:

Import

from pytorch_forecasting.data._tslib_data_module import TslibDataModule

I/O Contract

Constructor Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| time_series_dataset | TimeSeries | Yes | The v2 TimeSeries dataset (D1 layer) |
| context_length | int | Yes | Number of historical time steps used as model input |
| prediction_length | int | Yes | Number of future time steps to predict |
| freq | str | No | Frequency of the time series data (default 'h') |
| add_relative_time_idx | bool | No | Whether to add relative time indices (default False) |
| add_target_scales | bool | No | Whether to add target scaling info (default False) |
| target_normalizer | NORMALIZER, str, list[NORMALIZER], tuple[NORMALIZER], or None | No | Target normalizer; 'auto' uses RobustScaler (default 'auto') |
| scalers | dict or None | No | Dictionary of feature scalers (default None) |
| shuffle | bool | No | Whether to shuffle training data (default True) |
| window_stride | int | No | Stride for the sliding window (default 1) |
| batch_size | int | No | Batch size for dataloaders (default 32) |
| num_workers | int | No | Number of dataloader workers (default 0) |
| train_val_test_split | tuple[float, float, float] | No | Proportions for train/val/test splits (default (0.7, 0.15, 0.15)) |
| collate_fn | Callable or None | No | Custom collate function for the dataloader (default None) |
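To make the `train_val_test_split` proportions concrete, here is a minimal sketch of how three fractions could be turned into disjoint index sets. It assumes the split is taken over whole series after a seeded shuffle and that any rounding remainder goes to the test set; this page does not state how the library performs the split, and `split_indices` is a hypothetical helper.

```python
import random


def split_indices(n_series, proportions=(0.7, 0.15, 0.15), seed=0):
    """Partition series indices into train/val/test lists (hypothetical helper)."""
    train_frac, val_frac, _ = proportions
    indices = list(range(n_series))
    # Seeded shuffle so the partition is reproducible.
    random.Random(seed).shuffle(indices)
    n_train = int(train_frac * n_series)
    n_val = int(val_frac * n_series)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    # Assumption: whatever is left after train and val becomes the test set.
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(100)
```

Keeping the three sets disjoint at the series level avoids leaking windows from one series into both training and evaluation.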

Batch Output (x dict)

| Name | Type | Description |
| --- | --- | --- |
| history_cont | torch.Tensor | Continuous features for the encoder, shape (batch, context_length, n_cont) |
| history_cat | torch.Tensor | Categorical features for the encoder, shape (batch, context_length, n_cat) |
| future_cont | torch.Tensor | Known continuous features for decoder, shape (batch, prediction_length, n_known_cont) |
| future_cat | torch.Tensor | Known categorical features for decoder, shape (batch, prediction_length, n_known_cat) |
| history_target | torch.Tensor | Historical target values, shape (batch, context_length, n_targets) |
| future_target | torch.Tensor | Future target values, shape (batch, prediction_length, n_targets) |
| history_mask | torch.Tensor | Boolean mask for valid encoder time points |
| future_mask | torch.Tensor | Boolean mask for valid decoder time points |
| groups | torch.Tensor | Group identifiers |
| history_time_idx | torch.Tensor | Time indices for encoder |
| future_time_idx | torch.Tensor | Time indices for decoder |
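The shapes above imply that the history part of a window carries all features while the future part carries only the known ones. The sketch below illustrates that split for a single window using nested lists instead of tensors. It is an assumption-laden illustration: `split_window` and the feature names are hypothetical, and the sketch drops unknown features from the future window to match the `n_known_cont` shape, which may differ from how the library applies its mask.

```python
def split_window(rows, feature_names, known_features, context_length):
    """Split one window into history and future parts (hypothetical helper).

    rows: list of per-time-step feature lists, covering context + prediction steps.
    """
    # Column positions of features whose future values are known in advance.
    known_idx = [i for i, name in enumerate(feature_names) if name in known_features]
    # History keeps every feature for the first context_length steps.
    history = rows[:context_length]
    # Future keeps only the known columns for the remaining steps.
    future = [[row[i] for i in known_idx] for row in rows[context_length:]]
    return history, future


feature_names = ["price", "temperature", "holiday_flag"]  # illustrative names
known = {"holiday_flag"}
rows = [[float(t), 20.0 + t, float(t % 2)] for t in range(6)]
history, future = split_window(rows, feature_names, known, context_length=4)
```

Here `history` has 4 steps with 3 features each and `future` has 2 steps with 1 known feature each, mirroring the (context_length, n_cont) and (prediction_length, n_known_cont) layout per sample.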

metadata Property Output

| Name | Type | Description |
| --- | --- | --- |
| feature_names | dict[str, list[str]] | Feature names grouped by type (categorical, continuous, static, known, unknown, target, all) |
| feature_indices | dict[str, list[int]] | Feature indices grouped by type |
| n_features | dict[str, int] | Feature counts by type |
| context_length | int | Context window length |
| prediction_length | int | Prediction horizon length |
| freq | str | Time series frequency |
| features | str | Feature mode (S, MS, or M) |
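The `features` value follows the convention used by Informer-style tslib code: S means univariate input predicting a univariate target, MS means multivariate input predicting a univariate target, and M means multivariate input predicting multivariate targets. The function below is a minimal sketch of how such a mode could be derived from feature counts; the rule and the name `infer_feature_mode` are assumptions, not the library's confirmed logic.

```python
def infer_feature_mode(n_covariates: int, n_targets: int) -> str:
    """Map feature counts to a tslib-style mode string (hypothetical helper).

    n_covariates: number of input features other than the target(s).
    n_targets: number of target variables.
    """
    if n_targets > 1:
        # Several targets are predicted jointly.
        return "M"
    if n_covariates > 0:
        # One target, with extra input features alongside it.
        return "MS"
    # One target and nothing else.
    return "S"


mode = infer_feature_mode(n_covariates=3, n_targets=1)
```

A model can read this string from the metadata to decide, for example, how many output channels its projection layer needs.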

Usage Examples

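import numpy as np
import pandas as pd

from pytorch_forecasting.data._tslib_data_module import TslibDataModule
from pytorch_forecasting.data.timeseries import TimeSeries

# Illustrative long-format data (an assumption; substitute your own frame):
# 10 series of 200 steps each, one row per (series_id, time_idx)
df = pd.DataFrame(
    {
        "series_id": np.repeat(np.arange(10), 200),
        "time_idx": np.tile(np.arange(200), 10),
        "value": np.random.default_rng(0).normal(size=2000),
    }
)

# Create TimeSeries dataset (D1 layer)
ts = TimeSeries(data=df, time="time_idx", target="value", group=["series_id"])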

# Create TslibDataModule (D2 layer)
dm = TslibDataModule(
    time_series_dataset=ts,
    context_length=96,
    prediction_length=24,
    batch_size=64,
    num_workers=4,
    train_val_test_split=(0.7, 0.15, 0.15),
)

dm.setup(stage="fit")
train_loader = dm.train_dataloader()
val_loader = dm.val_dataloader()

# Access metadata for model initialization
metadata = dm.metadata
