Implementation:Sktime Pytorch forecasting TimeSeriesDataSet From Dataset

From Leeroopedia


Knowledge Sources
Domains: Time_Series, Data_Engineering, Model_Evaluation
Last Updated: 2026-02-08 07:00 GMT

Overview

Concrete tool for creating validation or test TimeSeriesDataSets by cloning configuration from an existing training dataset.

Description

The TimeSeriesDataSet.from_dataset class method constructs a new TimeSeriesDataSet from different data while reusing the variable encoders, scalers, normalizers, and configuration of a source dataset. It calls get_parameters() on the source dataset to extract all constructor arguments (including fitted encoders and scalers), then passes them to the constructor together with the new data. Because the fitted transformations are reused rather than refit, training and evaluation data are processed consistently. Individual constructor arguments can be overridden through **update_kwargs.
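The clone-configuration pattern described above can be illustrated with a small standard-library sketch. ToyScaler and ToyDataSet are hypothetical stand-ins written for this page, not classes from pytorch-forecasting; they only mimic the idea of extracting parameters (including a fitted scaler) and feeding them back into the constructor with new data.

```python
from copy import deepcopy
from statistics import mean, pstdev


class ToyScaler:
    """Hypothetical stand-in for a fitted normalizer: fit once, then reuse."""

    def __init__(self):
        self.center = None
        self.scale = None

    def fit(self, values):
        self.center = mean(values)
        self.scale = pstdev(values) or 1.0  # avoid division by zero
        return self

    def transform(self, values):
        return [(v - self.center) / self.scale for v in values]


class ToyDataSet:
    """Hypothetical miniature of the pattern, not the real TimeSeriesDataSet."""

    def __init__(self, data, max_encoder_length=3, scaler=None):
        self.data = list(data)
        self.max_encoder_length = max_encoder_length
        # Fit a scaler only when no fitted one is supplied
        if scaler is None:
            scaler = ToyScaler().fit(self.data)
        self.scaler = scaler
        self.scaled = self.scaler.transform(self.data)

    def get_parameters(self):
        # Everything needed to rebuild the dataset, fitted state included
        return {"max_encoder_length": self.max_encoder_length, "scaler": self.scaler}

    @classmethod
    def from_dataset(cls, dataset, data, **update_kwargs):
        parameters = deepcopy(dataset.get_parameters())
        parameters.update(update_kwargs)  # explicit overrides win
        return cls(data, **parameters)


training = ToyDataSet([1.0, 2.0, 3.0, 4.0, 5.0], max_encoder_length=4)
validation = ToyDataSet.from_dataset(training, [6.0, 7.0])

# The validation set is scaled with the training statistics (center 3.0),
# not with statistics refit on its own values (which would give 6.5).
print(validation.scaler.center, validation.max_encoder_length)
```

In this toy, refitting on the validation values alone would centre them at 6.5 and hide the fact that they lie well above the training range; reusing the training scaler keeps that information.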

Usage

Use this method after constructing a training TimeSeriesDataSet to create a matching validation dataset. It is typically called with stop_randomization=True for validation sets, and optionally with predict=True when a single prediction per group is wanted for evaluation. It is used in all four workflows.
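One way to picture the two flags is as overrides applied to the copied parameters before the constructor runs. The helper below, resolve_parameters, is hypothetical and written only for illustration. The dictionary keys are assumed to mirror TimeSeriesDataSet constructor arguments, and the exact handling inside the library may differ between versions.

```python
from copy import deepcopy


def resolve_parameters(parameters, stop_randomization=False, predict=False, **update_kwargs):
    """Hypothetical helper: apply the flags to a copy of the source parameters."""
    resolved = deepcopy(parameters)
    if predict:
        # Assumption: prediction mode implies fixed lengths and full-horizon samples
        stop_randomization = True
        resolved["min_prediction_length"] = resolved["max_prediction_length"]
        resolved["predict_mode"] = True
    if stop_randomization:
        # Assumption: randomization is switched off by clearing this argument
        resolved["randomize_length"] = None
    # Explicit overrides take precedence over anything copied from the source
    resolved.update(update_kwargs)
    return resolved


# Illustrative values only; a real dataset carries many more parameters
training_parameters = {
    "max_encoder_length": 24,
    "min_prediction_length": 1,
    "max_prediction_length": 6,
    "randomize_length": (0.2, 0.05),
    "predict_mode": False,
    "min_prediction_idx": None,
}

validation_parameters = resolve_parameters(training_parameters, stop_randomization=True)
prediction_parameters = resolve_parameters(
    training_parameters, predict=True, min_prediction_idx=100
)
```

The source parameters are copied rather than mutated, so the training configuration stays unchanged however many evaluation datasets are derived from it.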

Code Reference

Source Location

  • Repository: pytorch-forecasting
  • File: pytorch_forecasting/data/timeseries/_timeseries.py
  • Lines: L1641-1682

Signature

@classmethod
def from_dataset(
    cls: type[TimeSeriesDataType],
    dataset: TimeSeriesDataType,
    data: pd.DataFrame,
    stop_randomization: bool = False,
    predict: bool = False,
    **update_kwargs,
) -> TimeSeriesDataType:
    """
    Construct dataset with different data, same variable encoders, scalers, etc.

    Parameters
    ----------
    dataset : TimeSeriesDataSet
        dataset from which to copy parameters
    data : pd.DataFrame
        data from which new dataset will be generated
    stop_randomization : bool, optional, default=False
        Whether to stop randomizing encoder and decoder lengths
    predict : bool, optional, default=False
        Whether to predict on last entries in time index only
    **update_kwargs
        keyword arguments overrides passed to constructor

    Returns
    -------
    TimeSeriesDataSet
    """

Import

from pytorch_forecasting import TimeSeriesDataSet
# Then call: TimeSeriesDataSet.from_dataset(training, data, ...)

I/O Contract

Inputs

Name | Type | Required | Description
dataset | TimeSeriesDataSet | Yes | Source dataset to copy configuration from
data | pd.DataFrame | Yes | New data for the dataset (e.g., full DataFrame including validation period)
stop_randomization | bool | No | Disable length randomization for validation (default: False)
predict | bool | No | Use only last entries per group for prediction (default: False)
**update_kwargs | dict | No | Override any constructor parameter

Outputs

Name | Type | Description
return | TimeSeriesDataSet | New dataset with same encoders/scalers but different data

Usage Examples

Create Validation Dataset

from pytorch_forecasting import TimeSeriesDataSet

# Assume 'training' is an existing TimeSeriesDataSet and 'data' is the full DataFrame
validation = TimeSeriesDataSet.from_dataset(
    training,
    data,
    stop_randomization=True,
    predict=False,
)

print(f"Training samples: {len(training)}")
print(f"Validation samples: {len(validation)}")

N-BEATS Validation with Min Prediction Index

# For N-BEATS, restrict validation to predictions after the training cutoff.
# Assumes 'training_cutoff' holds the last time index included in the training data.
validation = TimeSeriesDataSet.from_dataset(
    training,
    data,
    min_prediction_idx=training_cutoff + 1,
)
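The example above leaves training_cutoff undefined. A minimal sketch of one common convention follows: hold out the final forecast horizon, train on everything up to the cutoff, and let validation forecasts start at the next index. The series length, the horizon, and the restriction to full-horizon windows are all assumptions made for illustration.

```python
# Hypothetical toy series: one row per time step, indexed 0..29
time_idx = list(range(30))
max_prediction_length = 6  # assumed forecast horizon

# Hold out the final horizon: training sees everything up to the cutoff
training_cutoff = max(time_idx) - max_prediction_length
training_rows = [t for t in time_idx if t <= training_cutoff]

# First time index at which a validation forecast may start
min_prediction_idx = training_cutoff + 1

# Start indices that still leave room for a full-horizon forecast
allowed_starts = [
    t
    for t in time_idx
    if t >= min_prediction_idx and t + max_prediction_length - 1 <= max(time_idx)
]

print(training_cutoff, min_prediction_idx, allowed_starts)
```

With these numbers the cutoff is 23 and exactly one full-horizon window, starting at index 24, remains for validation. No validation target falls inside the training period, even though the full data is passed in.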
