Principle:Sktime Pytorch forecasting Tensor Utilities

Knowledge Sources	pytorch-forecasting
Domains	Time_Series, Forecasting, Deep_Learning, Utilities
Last Updated	2026-02-08 09:00 GMT

Overview

A collection of core utility functions for tensor manipulation, including boolean mask creation, variable-length sequence padding, device management, masked aggregation, RNN sequence unpacking, autocorrelation computation, and structured output formatting. These utilities underpin the data processing and model output handling across the entire pytorch-forecasting library.

Description

The tensor utilities module provides foundational operations that are used throughout pytorch-forecasting's data pipeline, model internals, and metric computations. The key utilities are:

Mask Creation (create_mask): Generates boolean masks of shape (batch_size, size) from a tensor of sequence lengths. Entry (i, j) is True if position j is beyond the valid length for sample i. An inverse flag flips the semantics so that True indicates valid positions. This is essential for handling variable-length sequences in batched training.

Padded Stacking (padded_stack): Stacks a list of tensors with potentially different sizes along the last dimension by padding shorter tensors to the maximum length. Supports left or right padding with constant, reflect, replicate, or circular modes. This is critical for collating variable-length time series into uniform batches.

Device Management (move_to_device): Recursively moves tensors within nested structures (dicts, lists, tuples, namedtuples) to a target device. Includes special handling for Apple MPS devices, falling back to CPU if MPS is unavailable. This ensures correct device placement across heterogeneous hardware.
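The recursion and the MPS fallback described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: `resolve_device` and `move_nested` are hypothetical names, and the availability check assumes a PyTorch version that ships `torch.backends.mps`.

```python
from collections import namedtuple

import torch


def resolve_device(device):
    """Turn a device spec into a torch.device, using CPU when MPS is unavailable."""
    if isinstance(device, torch.device):
        return device
    if device == "mps":
        # Assumption: fall back to CPU if the MPS backend is missing or unusable.
        mps = getattr(torch.backends, "mps", None)
        if mps is None or not mps.is_available():
            return torch.device("cpu")
    return torch.device(device)


def move_nested(x, device):
    """Recursively move every tensor in x to device, keeping container types."""
    device = resolve_device(device)
    if isinstance(x, torch.Tensor):
        return x.to(device)
    if isinstance(x, dict):
        return {k: move_nested(v, device) for k, v in x.items()}
    if isinstance(x, tuple) and hasattr(x, "_fields"):  # namedtuple
        return type(x)(*(move_nested(v, device) for v in x))
    if isinstance(x, (list, tuple)):
        return type(x)(move_nested(v, device) for v in x)
    return x  # non-tensor leaves are returned unchanged


Batch = namedtuple("Batch", ["x", "y"])
batch = {"inputs": Batch(torch.zeros(2, 3), [torch.ones(2)]), "name": "demo"}
moved = move_nested(batch, "cpu")
```

The namedtuple branch has to come before the generic tuple branch, because a namedtuple is rebuilt from positional fields rather than from a single iterable.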

Detach (detach): Recursively detaches tensors from the computation graph within nested structures, preserving the container types (dict, list, tuple, namedtuple). Used to prevent gradient flow during inference and metric logging.

Masked Operations (masked_op): Computes the mean or sum over a tensor while respecting a boolean mask. If no mask is supplied, NaN entries are masked out. This is used in metrics and normalization where some time steps may be invalid.

Sequence Unpacking (unpack_sequence): Converts RNN PackedSequence objects to regular padded tensors with accompanying length vectors, or creates uniform length vectors for already-padded tensors. This bridges the gap between PyTorch RNN internals and the library's metric and loss computations.
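A rough sketch of both branches, built on PyTorch's own `pad_packed_sequence`, is shown below. The function name `unpack` is hypothetical, and batch-first layout is an assumption of this example.

```python
import torch
from torch.nn.utils import rnn


def unpack(sequence):
    """Return (padded tensor, lengths) for a PackedSequence or a padded tensor."""
    if isinstance(sequence, rnn.PackedSequence):
        padded, lengths = rnn.pad_packed_sequence(sequence, batch_first=True)
        return padded, lengths.to(padded.device)
    # Already padded: every sample is assumed to span the full time dimension.
    lengths = torch.full(
        (sequence.size(0),), sequence.size(1), dtype=torch.long, device=sequence.device
    )
    return sequence, lengths


padded_input = torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 0.0]])
packed = rnn.pack_padded_sequence(
    padded_input, lengths=torch.tensor([3, 2]), batch_first=True
)

unpacked, lengths = unpack(packed)             # true lengths recovered
same, full_lengths = unpack(padded_input)      # uniform lengths assumed
```

Returning a length vector in both cases lets downstream code build masks without checking which input type it received.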

Sequence Concatenation (concat_sequences): Concatenates lists of tensors or PackedSequence objects, handling nested tuple/list structures recursively. Used when combining encoder and decoder sequences.

Autocorrelation: Computes the autocorrelation function of a tensor along a given dimension using the FFT-based Wiener-Khinchin method. The implementation finds the next FFT-efficient length (factors of 2, 3, 5), applies zero-padded FFT, computes the power spectrum, and inverts back to the time domain. Used for time series analysis and diagnostic purposes.

Output Formatting (OutputMixIn, TupleOutputMixIn): Provides dictionary-like access to namedtuple model outputs. OutputMixIn adds __getitem__, get, items, and keys methods. TupleOutputMixIn provides a to_network_output method that converts keyword arguments into immutable namedtuples with dictionary-style access, enabling both traceability (for torch.jit) and convenient attribute access.
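The idea of a namedtuple with dictionary-style access can be illustrated with the standard library alone. The names `DictAccessMixin` and `to_output` below are hypothetical stand-ins for this sketch and are not the library's classes.

```python
from collections import namedtuple


class DictAccessMixin:
    """Hypothetical mixin that adds dict-style access to a namedtuple."""

    def __getitem__(self, key):
        if isinstance(key, str):
            return getattr(self, key)
        return super().__getitem__(key)  # integer and slice access still work

    def get(self, key, default=None):
        return getattr(self, key) if key in self._fields else default

    def keys(self):
        return self._fields

    def items(self):
        return zip(self._fields, self)


def to_output(**results):
    """Build a namedtuple with dict-style access from keyword arguments."""
    base = namedtuple("Output", list(results))
    output_cls = type("Output", (DictAccessMixin, base), {})
    return output_cls(**results)


out = to_output(prediction=[0.1, 0.2], attention=None)
by_key = out["prediction"]     # dictionary-style access
by_attr = out.prediction       # attribute access
by_index = out[0]              # plain tuple access
```

Because the result is still a tuple, it can be unpacked positionally, while named lookups keep calling code readable.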

Additional Helpers: integer_histogram creates histograms of integer data in a specified range; groupby_apply performs groupby mean or sum on tensor data; get_embedding_size uses the fastai heuristic $\min(\lfloor 1.6 \cdot n^{0.56} \rceil, 100)$ to determine embedding dimensions; to_list ensures values are wrapped in lists; unsqueeze_like appends trailing singleton dimensions to a tensor so that it matches another tensor's number of dimensions; apply_to_list maps a function over list elements or applies it directly to a scalar.

Usage

Use create_mask and padded_stack when building custom data collation or processing variable-length sequences. Use move_to_device when transferring complex batch structures between CPU and GPU. Use masked_op when computing statistics over sequences with missing or padded values. Use unpack_sequence within metrics that need to handle both PackedSequence and standard tensor inputs. Use TupleOutputMixIn when defining model output structures that should be both immutable and dictionary-accessible.

Theoretical Basis

Boolean Mask from Lengths:

$mask[i, j] = \begin{cases} \text{True} & \text{if } j \geq L_{i} \\ \text{False} & \text{if } j < L_{i} \end{cases}$

where $L_{i}$ is the valid sequence length for sample $i$. With the inverse flag, the condition is reversed, so that True corresponds to $j < L_{i}$.
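The definition above maps directly onto a broadcast comparison. The following is a minimal sketch; the name `lengths_to_mask` is hypothetical and chosen only for this example.

```python
import torch


def lengths_to_mask(size, lengths, inverse=False):
    """Boolean mask of shape (batch_size, size) built from sequence lengths."""
    positions = torch.arange(size, device=lengths.device).unsqueeze(0)  # (1, size)
    valid = positions < lengths.unsqueeze(-1)  # broadcasts to (batch_size, size)
    # Default: True marks padding (j >= L_i); inverse: True marks valid steps.
    return valid if inverse else ~valid


lengths = torch.tensor([3, 1])
padding_mask = lengths_to_mask(4, lengths)
valid_mask = lengths_to_mask(4, lengths, inverse=True)
```

For lengths 3 and 1 with size 4, the first row of `padding_mask` is True only at the last position, and the second row is True at every position except the first.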

Padded Stack:

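# Sketch: stack variable-length tensors with right padding along the last dimension
import torch
import torch.nn.functional as F

# tensors: list of torch.Tensor that differ only in the size of the last dimension
max_len = max(t.size(-1) for t in tensors)
padded_tensors = [
    F.pad(t, (0, max_len - t.size(-1)), value=0.0)  # pad = (left, right)
    for t in tensors
]
result = torch.stack(padded_tensors, dim=0)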

FFT-Based Autocorrelation (Wiener-Khinchin Theorem):

$R_{xx}[\tau] = \mathcal{F}^{-1}\left\{ \left| \mathcal{F}\{x - \bar{x}\} \right|^{2} \right\}[\tau]$

The implementation zero-pads the mean-centered signal to length $2M$, where $M$ is the next efficient FFT length at or above the signal length, computes the forward FFT, squares the magnitude, applies the inverse FFT, and normalizes by the number of overlapping samples at each lag.
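The steps above can be sketched with `torch.fft`. This is an illustrative version, not the library's code: `next_fast_len` and `autocorr` are names chosen for this example, and the final rescaling so that lag 0 equals 1 is an assumption that goes beyond the description above.

```python
import torch


def next_fast_len(n):
    """Smallest integer >= n whose only prime factors are 2, 3, and 5."""
    while True:
        m = n
        for p in (2, 3, 5):
            while m % p == 0:
                m //= p
        if m == 1:
            return n
        n += 1


def autocorr(x):
    """Autocorrelation along the last dimension via the Wiener-Khinchin theorem."""
    n = x.size(-1)
    m2 = 2 * next_fast_len(n)  # zero-padding avoids circular wrap-around
    centered = x - x.mean(dim=-1, keepdim=True)
    spectrum = torch.fft.rfft(centered, n=m2)
    power = spectrum.real**2 + spectrum.imag**2  # squared magnitude
    acov = torch.fft.irfft(power, n=m2)[..., :n]
    overlap = torch.arange(n, 0, -1, dtype=x.dtype, device=x.device)
    acov = acov / overlap  # number of overlapping samples at each lag
    return acov / acov[..., :1]  # assumption: rescale so that lag 0 equals 1


signal = torch.tensor([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
acf = autocorr(signal)
```

For this alternating zero-mean signal, the autocorrelation alternates in sign with the lag, which makes it a convenient sanity check.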

Embedding Size Heuristic (fastai):

$d_{embed} = \min(\lfloor 1.6 \cdot n^{0.56} \rceil, 100)$

where $n$ is the number of unique categories and the brackets denote rounding to the nearest integer. For $n \leq 2$, the embedding size is set to 1.
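As a worked example, the heuristic can be written in a few lines of plain Python. The function name `embedding_size` and the `max_size` parameter are chosen for this sketch; the cap of 100 and the special case for $n \leq 2$ follow the description above.

```python
def embedding_size(n, max_size=100):
    """Embedding dimension for a categorical variable with n unique categories."""
    if n > 2:
        return min(round(1.6 * n**0.56), max_size)
    return 1  # binary or constant categories need a single dimension


sizes = {n: embedding_size(n) for n in (2, 10, 100, 10000)}
```

The sizes grow sublinearly: 10 categories map to 6 dimensions, 100 categories to 21, and very large vocabularies are capped at 100.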

Masked Mean:

${\bar{x}}_{masked} = \frac{\sum_{i} x_{i} \cdot m_{i}}{\sum_{i} m_{i}}$

where $m_{i} \in \{0, 1\}$ is the mask indicator.
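A minimal sketch of the masked mean follows. The name `masked_mean` is hypothetical, and deriving the mask from NaN entries when none is supplied is an assumption of this example.

```python
import torch


def masked_mean(x, mask=None, dim=0):
    """Mean over dim that counts only entries where mask is True."""
    if mask is None:
        mask = ~torch.isnan(x)  # assumption: NaN marks invalid entries
    # Fill instead of multiplying by the mask, because NaN * 0 is still NaN.
    filled = x.masked_fill(~mask, 0.0)
    return filled.sum(dim) / mask.sum(dim)


x = torch.tensor([[1.0, float("nan")], [3.0, 4.0]])
col_mean = masked_mean(x)
```

The first column averages 1 and 3, while the second column has only one valid entry, so its mean is that entry. A slice with no valid entries yields NaN from the 0/0 division.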

Related Pages

Implemented By

Implementation:Sktime_Pytorch_forecasting_Utils
