Implementation:Sktime Pytorch forecasting TimeSeriesDataSet From Dataset
| Knowledge Sources | |
|---|---|
| Domains | Time_Series, Data_Engineering, Model_Evaluation |
| Last Updated | 2026-02-08 07:00 GMT |
Overview
Concrete tool for creating validation or test TimeSeriesDataSets by cloning configuration from an existing training dataset.
Description
The TimeSeriesDataSet.from_dataset class method constructs a new TimeSeriesDataSet using the same variable encoders, scalers, normalizers, and configuration as a source dataset but with different data. It calls get_parameters() on the source dataset to extract all constructor arguments (including fitted encoders and scalers), then passes them along with the new data to the constructor. This ensures consistency between training and evaluation data processing. Optional overrides can be passed via keyword arguments.
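The pattern described above (extract the constructor arguments, apply overrides, construct again with new data) can be sketched with a toy class. This is a minimal illustration using only the standard library, not the pytorch-forecasting implementation: `ToyScaler`, `ToyDataSet`, and the `randomize_length` flag are hypothetical stand-ins. Only the names `get_parameters`, `from_dataset`, `stop_randomization`, and `update_kwargs` mirror the text.

```python
from copy import deepcopy


class ToyScaler:
    """Hypothetical stand-in for a fitted scaler: remembers its fit statistics."""

    def __init__(self):
        self.mean = None
        self.std = None

    def fit(self, values):
        n = len(values)
        self.mean = sum(values) / n
        var = sum((v - self.mean) ** 2 for v in values) / n
        self.std = var ** 0.5 or 1.0  # fall back to 1.0 for constant data
        return self

    def transform(self, values):
        return [(v - self.mean) / self.std for v in values]


class ToyDataSet:
    """Hypothetical miniature dataset; NOT the pytorch-forecasting class."""

    def __init__(self, data, scaler=None, randomize_length=True):
        self.data = list(data)
        # Fit a scaler only when none is supplied (the training case).
        self.scaler = scaler if scaler is not None else ToyScaler().fit(self.data)
        self.randomize_length = randomize_length

    def get_parameters(self):
        # Everything the constructor needs except the data itself.
        return {"scaler": self.scaler, "randomize_length": self.randomize_length}

    @classmethod
    def from_dataset(cls, dataset, data, stop_randomization=False, **update_kwargs):
        params = deepcopy(dataset.get_parameters())
        if stop_randomization:
            params["randomize_length"] = False
        params.update(update_kwargs)  # explicit overrides win
        return cls(data, **params)


training = ToyDataSet([1.0, 2.0, 3.0])
validation = ToyDataSet.from_dataset(training, [4.0, 5.0], stop_randomization=True)

# The validation data is scaled with the statistics fitted on the training data.
scaled_validation = validation.scaler.transform(validation.data)
```

Because the scaler travels with the parameters, the validation values are standardized against the training mean rather than their own, which is the consistency property the description refers to.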
Usage
Use this method after constructing a training TimeSeriesDataSet to create a matching validation or test dataset. It is typically called with stop_randomization=True for validation sets and, optionally, with predict=True to evaluate a single prediction per group. It is used in all four workflows.
Code Reference
Source Location
- Repository: pytorch-forecasting
- File: pytorch_forecasting/data/timeseries/_timeseries.py
- Lines: L1641-1682
Signature
@classmethod
def from_dataset(
    cls: type[TimeSeriesDataType],
    dataset: TimeSeriesDataType,
    data: pd.DataFrame,
    stop_randomization: bool = False,
    predict: bool = False,
    **update_kwargs,
) -> TimeSeriesDataType:
    """
    Construct dataset with different data, same variable encoders, scalers, etc.

    Parameters
    ----------
    dataset : TimeSeriesDataSet
        dataset from which to copy parameters
    data : pd.DataFrame
        data from which new dataset will be generated
    stop_randomization : bool, optional, default=False
        Whether to stop randomizing encoder and decoder lengths
    predict : bool, optional, default=False
        Whether to predict on last entries in time index only
    **update_kwargs
        keyword arguments overrides passed to constructor

    Returns
    -------
    TimeSeriesDataSet
    """
Import
from pytorch_forecasting import TimeSeriesDataSet
# Then call: TimeSeriesDataSet.from_dataset(training, data, ...)
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| dataset | TimeSeriesDataSet | Yes | Source dataset to copy configuration from |
| data | pd.DataFrame | Yes | New data for the dataset (e.g., full DataFrame including validation period) |
| stop_randomization | bool | No | Stop randomizing encoder and decoder lengths, typically for validation (default: False) |
| predict | bool | No | Predict only on the last entries in the time index of each group (default: False) |
| **update_kwargs | dict | No | Keyword arguments that override the constructor parameters copied from the source dataset |
Outputs
| Name | Type | Description |
|---|---|---|
| return | TimeSeriesDataSet | New dataset with same encoders/scalers but different data |
Usage Examples
Create Validation Dataset
from pytorch_forecasting import TimeSeriesDataSet
# Assume 'training' is an existing TimeSeriesDataSet and 'data' is the full DataFrame
validation = TimeSeriesDataSet.from_dataset(
    training,
    data,
    stop_randomization=True,
    predict=False,
)
print(f"Training samples: {len(training)}")
print(f"Validation samples: {len(validation)}")
N-BEATS Validation with Min Prediction Index
# For N-BEATS, restrict validation to predictions that start after the training cutoff.
# Assume 'training_cutoff' is the last time index included in the training data.
validation = TimeSeriesDataSet.from_dataset(
    training,
    data,
    min_prediction_idx=training_cutoff + 1,
)
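The effect of `min_prediction_idx=training_cutoff + 1` can be illustrated with plain index arithmetic. The sketch below is an assumption-laden toy, not library code: the numbers are invented, the cutoff is assumed to be computed as the last time index minus `max_prediction_length`, and every prediction window is assumed to span exactly `max_prediction_length` steps.

```python
# Hypothetical values, chosen only for illustration.
max_prediction_length = 6
time_idx = list(range(30))  # time indices 0..29 for a single series

# Assumed convention: hold out the final max_prediction_length steps.
training_cutoff = max(time_idx) - max_prediction_length
min_prediction_idx = training_cutoff + 1

# Time steps available to the training dataset.
training_steps = [t for t in time_idx if t <= training_cutoff]

# First-prediction positions that respect min_prediction_idx and still
# leave room for a full prediction horizon inside the data.
valid_starts = [
    t
    for t in time_idx
    if t >= min_prediction_idx and t + max_prediction_length - 1 <= max(time_idx)
]

print(f"training_cutoff={training_cutoff}, valid prediction starts={valid_starts}")
```

Under these assumptions only one window per series qualifies, so the validation targets lie entirely after the training cutoff.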