Principle: sktime / PyTorch Forecasting Time Series Data Loading
| Knowledge Sources | |
|---|---|
| Domains | Time_Series, Data_Engineering |
| Last Updated | 2026-02-08 07:00 GMT |
Overview
Technique for loading and preparing real-world tabular time series data with covariates for consumption by forecasting models.
Description
Time Series Data Loading is the first step in any forecasting pipeline. It involves reading raw data from storage (CSV, Parquet, databases), ensuring the data has the correct schema (time index, group identifiers, target variables, covariates), and performing initial feature engineering such as creating time-based features, log transforms, and rolling aggregates. In the context of demand forecasting, this typically means loading historical sales or volume data alongside static metadata (product IDs, store locations) and dynamic covariates (promotions, holidays, price changes). The quality and structure of loaded data directly determines the success of downstream model training.
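The feature engineering steps named above (time-based features, log transforms, rolling aggregates) can be sketched with pandas. This is a minimal illustration, not the pipeline from any specific library; the column names ("date", "sku", "volume") are assumptions chosen to match the demand-forecasting example.

```python
import numpy as np
import pandas as pd

# Toy demand data: two SKUs, three months each (illustrative values)
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"] * 2),
    "sku": ["A"] * 3 + ["B"] * 3,
    "volume": [10.0, 12.0, 9.0, 100.0, 90.0, 110.0],
})

# Time-based feature derived from the date column
df["month"] = df["date"].dt.month

# Log transform stabilizes variance across series of very different scales
df["log_volume"] = np.log1p(df["volume"])

# Rolling aggregate computed per series; grouping by "sku" prevents
# the window from leaking values across different series
df["volume_roll2"] = (
    df.groupby("sku")["volume"]
      .transform(lambda s: s.rolling(2, min_periods=1).mean())
)
```

Note that the rolling statistic is computed inside `groupby(...).transform(...)`: a plain `df["volume"].rolling(...)` would mix the tail of one SKU's history with the head of the next.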
Usage
Use this principle at the beginning of any forecasting workflow when working with real-world tabular datasets that contain multiple time series identified by group columns (e.g., agency + SKU combinations). This is the appropriate starting point when the data comes with mixed covariates: static categoricals, time-varying known reals, and time-varying unknown targets. It is not needed when generating synthetic data for experimentation.
Theoretical Basis
Data loading for time series forecasting follows a specific schema requirement:
Required columns:
- Time index — monotonically increasing integer identifying time position
- Group identifiers — one or more columns that uniquely identify each individual series
- Target variable — the value to forecast (e.g., sales volume)
Optional covariates:
- Static categoricals — time-invariant categorical features (e.g., product type)
- Static reals — time-invariant continuous features (e.g., store size)
- Time-varying known — future-known features (e.g., holidays, promotions)
- Time-varying unknown — features known only in the past (e.g., lagged target)
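The required columns above can be assembled from raw dated data as follows. This is a hedged sketch: the group columns ("agency", "sku") and target ("volume") are assumptions following the demand-forecasting example, and the monotonically increasing integer time index is derived here from a monthly date column.

```python
import pandas as pd

# Raw data: two series (agency x sku), two months each (illustrative values)
df = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-01-01", "2024-02-01"]),
    "agency": ["A1", "A1", "A2", "A2"],   # group identifiers
    "sku": ["S1", "S1", "S1", "S1"],
    "volume": [5.0, 6.0, 7.0, 8.0],       # target variable
})

# Time index: consecutive integer months, shared across all series,
# normalized so the earliest observation is 0
months = df["date"].dt.year * 12 + df["date"].dt.month
df["time_idx"] = months - months.min()
```

Each row is now uniquely identified by (group identifiers, time index), which is the contract downstream dataset constructors typically rely on.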
Pseudo-code logic:
# Abstract data loading pipeline
raw_data = load_from_storage(path)
data = add_time_index(raw_data, date_column)
data = add_group_identifiers(data, group_columns)
data = engineer_features(data) # log transforms, rolling stats, etc.
# Result: DataFrame ready for TimeSeriesDataSet construction
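A runnable version of the abstract pipeline above, sketched with pandas under assumed column names ("date", "agency", "sku", "volume"). In practice the first step would be something like `pd.read_csv(path, parse_dates=["date"])` or `pd.read_parquet(path)`; an inline DataFrame stands in for storage here.

```python
import numpy as np
import pandas as pd

def add_time_index(df: pd.DataFrame, date_column: str) -> pd.DataFrame:
    # Consecutive integer index over months, starting at 0
    months = df[date_column].dt.year * 12 + df[date_column].dt.month
    df["time_idx"] = months - months.min()
    return df

def engineer_features(df: pd.DataFrame, group_columns: list[str]) -> pd.DataFrame:
    # Log transform of the target plus a per-series aggregate feature
    df["log_volume"] = np.log1p(df["volume"])
    df["avg_volume_by_group"] = df.groupby(group_columns)["volume"].transform("mean")
    return df

# Stand-in for load_from_storage(path)
raw = pd.DataFrame({
    "date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-01-01", "2024-02-01"]),
    "agency": ["A1", "A1", "A2", "A2"],
    "sku": ["S1", "S1", "S2", "S2"],
    "volume": [10.0, 12.0, 100.0, 110.0],
})

data = add_time_index(raw, "date")
data = engineer_features(data, ["agency", "sku"])
# data now has time_idx, group identifiers, target, and engineered
# covariates, ready for TimeSeriesDataSet construction
```

The group identifiers need no separate construction step here because they already exist as columns; `add_group_identifiers` in the pseudocode would only be needed if series membership had to be derived (e.g., parsed out of a composite key).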