Implementation:Sktime Pytorch forecasting NaNLabelEncoder


Knowledge Sources
Domains: Time_Series, Data_Engineering, Preprocessing
Last Updated: 2026-02-08 07:00 GMT

Overview

NaNLabelEncoder is a concrete tool from the pytorch-forecasting library that encodes categorical variables as integers and handles NaN values and unknown classes.

Description

The NaNLabelEncoder class is a scikit-learn-compatible label encoder that handles NaN values and unknown categories. When add_nan=True, index 0 is reserved for NaN, and categories encountered during transform that were not seen during fit are also mapped to index 0; a warning is emitted unless warn=False. The encoder accepts both string and numeric pandas Series, provides fit, transform, fit_transform, and inverse_transform methods, and stores the category-to-index mapping in its classes_ attribute.
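The mapping rule described above can be sketched in plain Python. This is an illustrative stand-in for the add_nan=True behavior, not the library's implementation: the helpers fit_mapping and encode, the NAN_KEY placeholder, and the warning text are all hypothetical.

```python
import warnings

NAN_KEY = "nan"  # hypothetical placeholder standing in for the NaN class


def fit_mapping(values):
    """Build a category -> index mapping with index 0 reserved for NaN."""
    mapping = {NAN_KEY: 0}
    for value in sorted(set(values)):
        if value not in mapping:
            mapping[value] = len(mapping)
    return mapping


def encode(values, mapping, warn=True):
    """Encode values; anything missing from the mapping falls back to index 0."""
    unknown = {v for v in values if v not in mapping}
    if unknown and warn:
        warnings.warn(f"{len(unknown)} unknown categories mapped to index 0")
    return [mapping.get(v, 0) for v in values]


mapping = fit_mapping(["B", "A", "C", "A"])

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # keep the demo output quiet
    encoded = encode(["A", "C", "Z"], mapping)  # "Z" was never seen during fit
```

Here "Z" shares index 0 with the NaN class, so downstream code cannot distinguish a missing value from an unseen category.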

Usage

NaNLabelEncoder comes into play in two situations: (1) you pre-fit encodings for group IDs before constructing a TimeSeriesDataSet, so that the training and validation datasets share the same mapping, or (2) TimeSeriesDataSet fits encoders for categorical columns automatically. In the first case, the encoder is passed via the categorical_encoders parameter, a dict that maps column names to encoder instances.
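The consistency concern behind pre-fitting can be illustrated with a small plain-Python sketch. It does not use the library; fit_mapping is a hypothetical helper that assigns indices in sorted order, and the store names are made-up sample data.

```python
def fit_mapping(values):
    """Assign each distinct category an integer index in sorted order."""
    return {value: idx for idx, value in enumerate(sorted(set(values)))}


train_groups = ["store_a", "store_b", "store_c"]
valid_groups = ["store_b", "store_c"]  # store_a is absent from this split

# Fitting separately: the same group receives a different index in each split
train_map = fit_mapping(train_groups)
valid_map = fit_mapping(valid_groups)
mismatch = train_map["store_b"] != valid_map["store_b"]

# Fitting once on all data and reusing the mapping keeps indices aligned
shared_map = fit_mapping(train_groups + valid_groups)
train_ids = [shared_map[g] for g in train_groups]
valid_ids = [shared_map[g] for g in valid_groups]
```

With separate fits, "store_b" is index 1 in training but index 0 in validation, so a model would look up the wrong embedding. With the shared mapping it is index 1 in both.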

Code Reference

Source Location

Signature

class NaNLabelEncoder(BaseEstimator, TransformerMixin):
    def __init__(self, add_nan: bool = False, warn: bool = True):
        """
        Label encoder with NaN handling.

        Parameters
        ----------
        add_nan : bool, optional, default=False
            If to force encoding of NaN at index 0.
        warn : bool, optional, default=True
            If to warn when unknown items are encoded as NaN.
        """

    def fit(self, y: pd.Series, overwrite: bool = False) -> "NaNLabelEncoder":
        """Fit encoder to data."""

    def transform(self, y: pd.Series) -> np.ndarray:
        """Transform categories to integer indices."""

    def fit_transform(self, y: pd.Series, overwrite: bool = False) -> np.ndarray:
        """Fit and transform in one step."""

    def inverse_transform(self, y: np.ndarray) -> np.ndarray:
        """Convert integer indices back to original categories."""

Import

from pytorch_forecasting.data.encoders import NaNLabelEncoder

I/O Contract

Inputs

Name | Type | Required | Description
add_nan | bool | No | Force NaN encoding at index 0 (default: False)
warn | bool | No | Warn on unknown categories (default: True)
y | pd.Series | Yes (to fit/transform) | Categorical data series to encode

Outputs

Name | Type | Description
transform() | np.ndarray | Integer-encoded array
inverse_transform() | np.ndarray | Original category values
classes_ | dict | Mapping from category to integer index

Usage Examples

Pre-fit Encoder for DeepAR

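from pytorch_forecasting import TimeSeriesDataSet
from pytorch_forecasting.data.encoders import NaNLabelEncoder

# Assumes `data` is a pandas DataFrame with "series", "time_idx", and "value"
# columns, and `training_cutoff` is the last time_idx used for training.

# Pre-fit the encoder on the full data so training and validation share one mapping
encoder = NaNLabelEncoder().fit(data["series"])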

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="value",
    group_ids=["series"],
    categorical_encoders={"series": encoder},
    # ... other params
)

# Validation dataset will use the same encoder via from_dataset
validation = TimeSeriesDataSet.from_dataset(training, data)

Inspect Encoder Mapping

encoder = NaNLabelEncoder(add_nan=True).fit(data["category"])
print(f"Classes: {encoder.classes_}")
# {nan: 0, 'A': 1, 'B': 2, 'C': 3}

encoded = encoder.transform(data["category"])
decoded = encoder.inverse_transform(encoded)
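One consequence worth keeping in mind: if unknown categories are collapsed to index 0 during encoding, decoding cannot recover them. The plain-Python sketch below illustrates this under the assumption that index 0 decodes back to the NaN placeholder. The mapping and the encode and decode helpers are hypothetical and do not call the library.

```python
# Assumed fitted mapping with index 0 reserved for the NaN class
mapping = {"nan": 0, "A": 1, "B": 2, "C": 3}
inverse = {idx: category for category, idx in mapping.items()}


def encode(values):
    """Map categories to indices; unseen categories fall back to index 0."""
    return [mapping.get(v, 0) for v in values]


def decode(indices):
    """Map indices back to categories using the inverted mapping."""
    return [inverse[i] for i in indices]


original = ["A", "Z", "C"]  # "Z" is not part of the fitted mapping
round_trip = decode(encode(original))
```

The round trip yields the NaN placeholder in place of "Z", so the transformation is only invertible for categories seen during fit.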

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
