Principle:Sktime Pytorch forecasting Categorical Variable Embedding

From Leeroopedia
Revision as of 17:29, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/Sktime_Pytorch_forecasting_Categorical_Variable_Embedding.md)


Knowledge Sources
Domains: Time_Series, Forecasting, Deep_Learning, Embedding, Feature_Engineering
Last Updated: 2026-02-08 09:00 GMT

Overview

Embedding network for categorical variables that maps each discrete variable through a separate learned embedding table and augments the input with positional and temporal indicator variables, producing a dense feature tensor for downstream forecasting models.

Description

Categorical Variable Embedding transforms discrete categorical inputs into dense, continuous vector representations suitable for neural network processing. Each categorical variable is assigned its own nn.Embedding layer with a vocabulary size matching the cardinality of that variable and a shared output dimension (d_model).
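
The per-variable tables described above can be sketched as follows. This is a minimal illustration rather than the module's actual code: the cardinalities in `emb_dims`, the value of `d_model`, and the random input `x` are all made-up assumptions.

```python
import torch
from torch import nn

# Hypothetical cardinalities for three categorical variables (illustrative only).
emb_dims = [7, 12, 4]
d_model = 8

# One independent embedding table per variable, all sharing the output width d_model.
embedding_layers = nn.ModuleList([nn.Embedding(vocab, d_model) for vocab in emb_dims])

# Integer codes of shape (batch, seq_len, num_vars); codes of variable i lie in [0, emb_dims[i]).
batch, seq_len = 2, 5
x = torch.stack(
    [torch.randint(0, vocab, (batch, seq_len)) for vocab in emb_dims], dim=-1
)

# Each variable is looked up in its own table, giving (batch, seq_len, d_model) per variable.
per_var = [layer(x[:, :, i]) for i, layer in enumerate(embedding_layers)]
print([tuple(e.shape) for e in per_var])
```

Because every table shares the same output width, the per-variable results have identical shapes and can later be stacked along a new variable dimension.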

A distinctive feature of this module is automatic positional augmentation. Three additional positional categorical variables are generated on the fly and concatenated with the user-provided categorical variables before embedding:

1. pos_seq (Sequence Position): An integer index from 0 to seq_len - 1 assigned to each time step, encoding absolute position within the full sequence (past + future).

2. pos_fut (Future Position): A counter that is 0 for all past time steps and counts from 1 to lag for future steps, encoding relative position within the forecast horizon.

3. is_fut (Future Indicator): A binary flag (0 for past, 1 for future) that explicitly marks whether each time step belongs to the historical context or the prediction horizon.
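
To make the three definitions concrete, the sketch below builds them with plain Python lists for a short sequence. The helper name `positional_variables` is hypothetical and chosen only for this illustration.

```python
def positional_variables(seq_len: int, lag: int):
    """Return (pos_seq, pos_fut, is_fut) as lists of ints, one entry per time step."""
    if not 0 <= lag <= seq_len:
        raise ValueError("lag must lie between 0 and seq_len")
    n_past = seq_len - lag                             # number of historical steps
    pos_seq = list(range(seq_len))                     # 0 .. seq_len - 1
    pos_fut = [0] * n_past + list(range(1, lag + 1))   # 0 in the past, 1 .. lag in the future
    is_fut = [0] * n_past + [1] * lag                  # binary past/future flag
    return pos_seq, pos_fut, is_fut


pos_seq, pos_fut, is_fut = positional_variables(seq_len=6, lag=2)
print(pos_seq)  # [0, 1, 2, 3, 4, 5]
print(pos_fut)  # [0, 0, 0, 0, 1, 2]
print(is_fut)   # [0, 0, 0, 0, 1, 1]
```

Note that is_fut can be recovered from pos_fut (it is 1 exactly where pos_fut is non-zero), so it adds an explicit signal rather than new information.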

Each variable (both user-provided and positional) is independently embedded through its own embedding layer, then the embeddings are stacked along a new dimension, yielding a 4D tensor of shape (batch, seq_len, num_vars + 3, d_model). This structure allows downstream layers to attend separately over the variable dimension and the time dimension.

The module also handles the edge case in which no external categorical variables are provided. The input is then an integer batch size rather than a tensor, and only the three positional variables are embedded.
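
A minimal sketch of this edge case, assuming PyTorch: only the three positional variables are built, broadcast across the batch, embedded, and stacked. The table sizes used here (`seq_len`, `lag + 1`, and 2) are the smallest that cover the value ranges described above; whether the actual module uses exactly these sizes is an assumption.

```python
import torch
from torch import nn

seq_len, lag, d_model = 6, 2, 4
batch_size = 3  # stands in for the integer batch size passed instead of a tensor

past = torch.zeros(seq_len - lag, dtype=torch.long)
pos_seq = torch.arange(seq_len)
pos_fut = torch.cat([past, torch.arange(1, lag + 1)])
is_fut = torch.cat([past, torch.ones(lag, dtype=torch.long)])

# (seq_len, 3) -> (batch, seq_len, 3): every sample shares the same positional codes.
pos = torch.stack([pos_seq, pos_fut, is_fut], dim=-1).expand(batch_size, -1, -1)

# One table per positional variable (assumed minimal vocabulary sizes).
layers = nn.ModuleList(
    [nn.Embedding(seq_len, d_model), nn.Embedding(lag + 1, d_model), nn.Embedding(2, d_model)]
)

out = torch.stack([layer(pos[:, :, i]) for i, layer in enumerate(layers)], dim=2)
print(out.shape)  # torch.Size([3, 6, 3, 4])
```

Since the positional codes do not depend on the sample, every batch entry of `out` is identical in this case.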

Usage

Use embedding_cat_variables in architectures like DSIPTs that require rich categorical feature representations. Provide emb_dims as a list of vocabulary sizes for each user-defined categorical variable. The seq_len and lag parameters define the total sequence length and forecast horizon, which determine the vocabulary sizes of the three auto-generated positional variables. The output tensor can be summed, concatenated, or attended over in subsequent layers.
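
The three downstream options mentioned above can be sketched on a stand-in tensor with the output layout (batch, seq_len, num_vars + 3, d_model). The random tensor `emb` and the fixed `query` vector are assumptions for illustration; a real model would use the module's output and a learned query or a full attention layer.

```python
import torch

batch, seq_len, num_vars, d_model = 2, 6, 5, 8   # num_vars here already includes the 3 positional variables
emb = torch.randn(batch, seq_len, num_vars, d_model)  # stand-in for the embedding output

# Option 1: sum over the variable dimension.
summed = emb.sum(dim=2)                      # (batch, seq_len, d_model)

# Option 2: concatenate the variable embeddings into one feature vector per time step.
concatenated = emb.flatten(start_dim=2)      # (batch, seq_len, num_vars * d_model)

# Option 3: attend over variables with scaled dot-product scores against a query vector.
query = torch.randn(d_model)
weights = torch.softmax(emb @ query / d_model ** 0.5, dim=2)   # (batch, seq_len, num_vars)
attended = (weights.unsqueeze(-1) * emb).sum(dim=2)            # (batch, seq_len, d_model)

print(summed.shape, concatenated.shape, attended.shape)
```

Summation keeps the feature width at d_model, concatenation preserves per-variable information at the cost of a wider input, and attention lets the model weight variables differently at each time step.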

Theoretical Basis

Embedding Lookup:

For a categorical variable $c$ with vocabulary size $V_c$:

$$e_c = \text{Embedding}(c) \in \mathbb{R}^{d_{\text{model}}}$$

where the embedding table $W_c \in \mathbb{R}^{V_c \times d_{\text{model}}}$ is learned during training.
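
As a short check of what the lookup means, the sketch below shows that an `nn.Embedding` call selects rows of its weight table, which is equivalent to multiplying a one-hot encoding of c by the table W_c. The sizes and the example codes are arbitrary values chosen for illustration.

```python
import torch
import torch.nn.functional as F
from torch import nn

vocab_size, d_model = 5, 3
embedding = nn.Embedding(vocab_size, d_model)   # weight has shape (vocab_size, d_model)
c = torch.tensor([0, 3, 3, 4])                  # integer category codes

looked_up = embedding(c)                        # standard lookup
row_select = embedding.weight[c]                # direct row indexing into the table
one_hot = F.one_hot(c, num_classes=vocab_size).float()
via_matmul = one_hot @ embedding.weight         # one-hot vector times the table

print(torch.equal(looked_up, row_select))       # True
print(torch.allclose(looked_up, via_matmul))    # True
```

The lookup form is preferred in practice because it avoids materialising the one-hot matrix.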

Positional Augmentation Variables:

$$\text{pos\_seq}_t = t, \quad t \in \{0, 1, \ldots, T-1\}$$

$$\text{pos\_fut}_t = \begin{cases} 0 & \text{if } t < T - H \\ t - (T - H) + 1 & \text{if } t \geq T - H \end{cases}$$

$$\text{is\_fut}_t = \begin{cases} 0 & \text{if } t < T - H \\ 1 & \text{if } t \geq T - H \end{cases}$$

Where T is the total sequence length and H is the forecast horizon (lag).

Combined Output:

Given M user-defined categorical variables plus 3 positional variables:

$$E = \text{Stack}(e_1, e_2, \ldots, e_M, e_{\text{pos\_seq}}, e_{\text{pos\_fut}}, e_{\text{is\_fut}}) \in \mathbb{R}^{B \times T \times (M+3) \times d_{\text{model}}}$$

Pseudo-code:

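# Categorical variable embedding (PyTorch-style sketch of the pseudo-code)
import torch

def embed_categorical(x, seq_len, lag, embedding_layers):
    # x: (batch, seq_len, M) integer codes, or an int batch size when M = 0.
    # embedding_layers: M + 3 nn.Embedding modules, user variables first.
    batch = x if isinstance(x, int) else x.shape[0]
    past = torch.zeros(seq_len - lag, dtype=torch.long)
    pos_seq = torch.arange(seq_len)
    pos_fut = torch.cat([past, torch.arange(1, lag + 1)])
    is_fut = torch.cat([past, torch.ones(lag, dtype=torch.long)])

    # Broadcast the positional variables to (batch, seq_len, 3).
    pos = torch.stack([pos_seq, pos_fut, is_fut], dim=-1).expand(batch, -1, -1)
    cat_vars = pos if isinstance(x, int) else torch.cat([x.long(), pos.to(x.device)], dim=-1)

    embeddings = []
    for i, embed_layer in enumerate(embedding_layers):
        embeddings.append(embed_layer(cat_vars[:, :, i]))
    return torch.stack(embeddings, dim=2)  # (batch, seq_len, M + 3, d_model)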

Related Pages

Implemented By
