Principle: Sktime PyTorch Forecasting DataLoader Creation
| Knowledge Sources | |
|---|---|
| Domains | Time_Series, Data_Engineering, Deep_Learning |
| Last Updated | 2026-02-08 07:00 GMT |
Overview
Technique for converting a TimeSeriesDataSet into batched, shuffled DataLoader iterators suitable for neural network training and evaluation.
Description
DataLoader Creation wraps a PyTorch Dataset into a DataLoader that handles batching, shuffling, and parallel data loading. For time series forecasting, the DataLoader must handle the specific tensor structure produced by TimeSeriesDataSet — dictionaries of encoder/decoder tensors rather than simple (input, label) pairs. The library provides a convenience method that configures appropriate defaults: shuffling and dropping the last incomplete batch for training, sequential sampling for validation, and optional time-synchronized batching for models that require temporally aligned samples within a batch.
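The dictionary batch structure can be illustrated with a minimal stdlib sketch. The key names (`encoder_cont`, `decoder_target`) and the `collate_timeseries` helper are illustrative assumptions, not the library's exact API:

```python
# Sketch of the batch structure a time-series DataLoader must handle:
# each sample is a dict of sequences, and collation stacks values key-by-key
# instead of zipping simple (input, label) pairs. Key names are illustrative.

def collate_timeseries(samples):
    """Stack a list of per-sample dicts into one batch dict."""
    keys = samples[0].keys()
    return {k: [s[k] for s in samples] for k in keys}

# Two toy samples: 4-step encoder history, 2-step decoder target.
samples = [
    {"encoder_cont": [1.0, 2.0, 3.0, 4.0], "decoder_target": [5.0, 6.0]},
    {"encoder_cont": [2.0, 3.0, 4.0, 5.0], "decoder_target": [6.0, 7.0]},
]

batch = collate_timeseries(samples)
# batch["encoder_cont"] has shape (batch=2, encoder_length=4)
```

The model then indexes the batch by key rather than by position, which is why a generic `(input, label)` collate function is not sufficient here.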
Usage
Use this principle after constructing both training and validation TimeSeriesDataSets. Every model training workflow requires DataLoaders as input to Trainer.fit(). The batch size is a critical hyperparameter affecting memory usage and training dynamics.
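Because batch size scales input and activation memory roughly linearly, a back-of-envelope estimate helps choose a value before training. The sizes below are illustrative assumptions, not library defaults:

```python
# Rough memory estimate for one batch of raw encoder inputs (float32 = 4 bytes).
# All sizes here are illustrative assumptions.
def batch_memory_bytes(batch_size, encoder_length, n_features, bytes_per_value=4):
    return batch_size * encoder_length * n_features * bytes_per_value

mb = batch_memory_bytes(batch_size=64, encoder_length=168, n_features=20) / 1e6
# ~0.86 MB for the raw encoder tensor alone; model activations multiply this
# several-fold, so doubling batch_size roughly doubles peak memory.
```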
Theoretical Basis
Mini-batch stochastic gradient descent requires iterating over the dataset in randomly shuffled batches:
$$\theta_{t+1} = \theta_t - \eta \cdot \frac{1}{|B_t|} \sum_{(x, y) \in B_t} \nabla_\theta \mathcal{L}(f_\theta(x), y)$$
where $B_t$ is a mini-batch sampled from the dataset.
Time-synchronized batching is a special mode where all samples in a batch are aligned in time (same decoder time indices). This is useful for models that exploit cross-series information within a batch, like hierarchical models.
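Time-synchronized batching can be sketched as a sampler that groups sample indices by decoder start time, so every batch is temporally aligned. The grouping function here is an illustrative stand-in for the library's synchronized sampler, not its implementation:

```python
from collections import defaultdict
import random

def time_synchronized_batches(decoder_start_times, batch_size, seed=0):
    """Yield batches of sample indices that share the same decoder start time."""
    groups = defaultdict(list)
    for idx, t in enumerate(decoder_start_times):
        groups[t].append(idx)
    rng = random.Random(seed)
    batches = []
    for t in sorted(groups):
        idxs = groups[t]
        rng.shuffle(idxs)  # shuffle within a time group only
        for i in range(0, len(idxs), batch_size):
            batches.append(idxs[i:i + batch_size])
    rng.shuffle(batches)  # randomize batch order, preserving alignment
    return batches

# Samples 0-2 start at t=10, samples 3-4 at t=11: no batch mixes the two times.
batches = time_synchronized_batches([10, 10, 10, 11, 11], batch_size=2)
```

Shuffling within and across groups keeps training stochastic while guaranteeing that cross-series information inside a batch comes from the same time step.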
Pseudo-code:
```python
# Abstract DataLoader creation
train_loader = create_dataloader(
    dataset=training_dataset,
    batch_size=64,
    shuffle=True,    # randomize sample order each epoch
    drop_last=True,  # drop the incomplete final batch for stable gradients
)
val_loader = create_dataloader(
    dataset=val_dataset,
    batch_size=64,
    shuffle=False,   # sequential sampling for reproducible evaluation
    drop_last=False, # evaluate every sample
)
```