Principle:Sktime Pytorch forecasting Categorical Encoding

Knowledge Sources	pytorch-forecasting PyTorch Forecasting Docs
Domains	Time_Series, Data_Engineering, Preprocessing
Last Updated	2026-02-08 07:00 GMT

Overview

Technique for converting categorical string or mixed-type variables into integer indices suitable for neural network embedding layers.

Description

Categorical Encoding maps each unique category in a variable to an integer index. Neural networks cannot process string values directly; they require integer indices that are then mapped to dense vector representations via embedding layers. The encoding must handle edge cases: NaN values (common in real-world time series), unknown categories at inference time (categories not seen during training), and mixed numeric/string types. A robust label encoder provides a consistent mapping that is fitted on training data and safely applied to validation/test data, mapping unknown categories to a special index (typically 0).

Usage

Use categorical encoding when preparing time series data with categorical features (e.g., product IDs, store names, day-of-week). In pytorch-forecasting, categorical encoders are either auto-fitted during TimeSeriesDataSet construction or pre-fitted and passed via the categorical_encoders parameter. Pre-fitting is required when the group ID column needs consistent encoding across training and validation datasets.
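Why pre-fitting matters for group IDs can be illustrated with a minimal plain-Python sketch. It does not use the pytorch-forecasting API; `fit_mapping`, `transform`, and the store names are hypothetical and exist only for this illustration. Fitting a fresh mapping on the validation data renumbers the groups, while reusing the training mapping keeps each group on the index it had during training.

```python
def fit_mapping(categories):
    """Map each unique category to an index, reserving 0 for unknown/NaN."""
    return {c: i + 1 for i, c in enumerate(sorted(set(categories)))}


def transform(mapping, categories):
    """Encode categories with a fitted mapping; unseen values fall back to 0."""
    return [mapping.get(c, 0) for c in categories]


train_groups = ["store_a", "store_b", "store_c"]
val_groups = ["store_b", "store_c"]  # validation covers a subset of groups

# Fitting separately gives each dataset its own numbering.
train_mapping = fit_mapping(train_groups)
refit_mapping = fit_mapping(val_groups)
inconsistent = transform(refit_mapping, val_groups)

# Reusing the training mapping keeps indices aligned with the trained embeddings.
consistent = transform(train_mapping, val_groups)

print(inconsistent)  # [1, 2]
print(consistent)    # [2, 3]
```

In the refitted case, `store_b` moves from index 2 to index 1, so a model trained on the first numbering would look up the wrong embedding row.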

Theoretical Basis

Label encoding:

$\text{encode}(c) = \begin{cases} \text{mapping}[c] & \text{if } c \in \text{known\_classes} \\ 0 & \text{if } c \notin \text{known\_classes} \text{ or } c = \text{NaN} \end{cases}$

With NaN handling:

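# Abstract encoding pipeline
# Fit: build the mapping from training data only
mapping = {}
if add_nan:
    mapping[NaN] = 0  # reserve index 0 for NaN and unknown categories
for idx, category in enumerate(sorted(unique_categories)):
    mapping[category] = idx + (1 if add_nan else 0)

# Transform: map categories to integers
if add_nan:
    encoded = [mapping.get(c, 0) for c in data]  # NaN or unknown -> 0
else:
    # Without a reserved index, 0 belongs to a real category,
    # so an unknown category raises KeyError instead of mapping to 0
    encoded = [mapping[c] for c in data]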
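The abstract pipeline above can be made concrete as a small runnable sketch. This is an illustration under stated assumptions, not the library's implementation: the helper names (`is_missing`, `fit`, `transform`) are hypothetical, missing values are detected explicitly rather than stored as a dictionary key (NaN does not compare equal to itself, which makes it an unreliable key), and mixed numeric/string categories are ordered by their string form because Python cannot sort integers against strings directly.

```python
import math


def is_missing(value):
    """Return True for None and float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fit(values, add_nan=True):
    """Build a category -> index mapping from training values."""
    # Sort by string form so mixed int/str categories can be ordered.
    unique = sorted({v for v in values if not is_missing(v)}, key=str)
    offset = 1 if add_nan else 0  # index 0 is reserved when add_nan is set
    return {category: idx + offset for idx, category in enumerate(unique)}


def transform(mapping, values, add_nan=True):
    """Encode values; with add_nan, missing and unseen values map to 0."""
    if add_nan:
        return [0 if is_missing(v) else mapping.get(v, 0) for v in values]
    return [mapping[v] for v in values]  # raises KeyError on unseen values


train = ["mon", 7, "tue", float("nan"), "mon"]
mapping = fit(train)
encoded = transform(mapping, ["tue", 7, "sun", float("nan")])

print(mapping)  # {7: 1, 'mon': 2, 'tue': 3}
print(encoded)  # [3, 1, 0, 0]
```

The unseen category `"sun"` and the NaN entry both land on the reserved index 0, so the downstream embedding layer sees a valid index for every input.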

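Embedding lookup: The integer indices feed into an embedding layer: $e_c = W_{\text{embed}}[\text{encode}(c)] \in \mathbb{R}^{d}$

Where $W_{\text{embed}} \in \mathbb{R}^{|C| \times d}$ is the learned embedding matrix, $d$ is the embedding dimension, and $|C|$ is the number of distinct indices the encoder can produce (including the reserved index 0 when NaN handling is enabled).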
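The lookup itself is row selection, which the following plain-Python sketch shows. The sizes, the seed, and the random weights are assumptions made for illustration; in a real model the matrix is learned during training, and in PyTorch this role is played by `torch.nn.Embedding`.

```python
import random

random.seed(0)  # fixed seed so the illustrative weights are reproducible

num_indices = 4    # reserved index 0 plus three known categories
embedding_dim = 3  # d, chosen arbitrarily for illustration

# W_embed: one row of d weights per index (random here, learned in training).
w_embed = [
    [random.uniform(-1.0, 1.0) for _ in range(embedding_dim)]
    for _ in range(num_indices)
]

encoded = [3, 1, 0, 0]  # example output of a label encoder

# The lookup is plain row selection: e_c = W_embed[encode(c)].
embedded = [w_embed[index] for index in encoded]
```

Because unknown and NaN inputs share index 0, they also share one embedding row, which the model can learn as a generic "missing or unseen" representation.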

Related Pages

Implemented By

Implementation:Sktime_Pytorch_forecasting_NaNLabelEncoder
