Principle:Sktime Pytorch forecasting Baseline Forecasting

Knowledge Sources	pytorch-forecasting
Domains	Time_Series, Forecasting, Deep_Learning
Last Updated	2026-02-08 09:00 GMT

Overview

A naive last-value-repeat forecasting strategy that serves as a zero-learnable-parameter performance benchmark, repeating the most recent observed target value across the entire prediction horizon.

Description

The Baseline model implements the simplest possible forecasting heuristic: it takes the last known value from the encoder (history) window and repeats it unchanged for every step in the decoder (prediction) window. This is sometimes called a naive persistence forecast or random walk forecast without drift.

The model contains no trainable parameters and requires no fitting. Its forward pass indexes the encoder target tensor, for each sample, at the position one less than that sample's encoder length (i.e., the final observed value) and expands that value across the maximum prediction length. It handles both single-target and multi-target settings by iterating over each target independently.

Because it has no parameters to train, the Baseline model is used exclusively for evaluation: it provides a lower bound on acceptable forecast accuracy. Any learned model that cannot outperform the Baseline on a given dataset is adding no value beyond the trivial persistence heuristic.

The model inherits from BaseModel and overrides to_prediction and to_quantiles to return the raw repeated value (with an extra trailing dimension for the quantile interface).
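
The relationship between the two overrides can be illustrated with a minimal sketch that uses nested Python lists in place of tensors. The function names `to_prediction_sketch` and `to_quantiles_sketch` are hypothetical stand-ins chosen for this example, not the library's methods; the only behaviour taken from the description above is that the point forecast is returned as-is and the quantile view adds one trailing dimension of size 1.

```python
from typing import List


def to_prediction_sketch(prediction: List[List[float]]) -> List[List[float]]:
    """Return the repeated values unchanged: shape (batch, horizon)."""
    return prediction


def to_quantiles_sketch(prediction: List[List[float]]) -> List[List[List[float]]]:
    """Wrap each value in a length-1 list: shape (batch, horizon, 1)."""
    return [[[value] for value in sample] for sample in prediction]


# Two samples, horizon of 2, each row already holds the repeated last value.
repeated = [[3.0, 3.0], [7.5, 7.5]]
quantiles = to_quantiles_sketch(repeated)
```

Because the single "quantile" is just the point forecast, any interval derived from this output has zero width.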

Usage

Use the Baseline model as a sanity check and performance floor when evaluating any forecasting model on a new dataset. It is appropriate for: (1) establishing a benchmark MAE, SMAPE, or other metric before training complex models, (2) quickly verifying that the data pipeline produces sensible outputs, and (3) comparing learned models against the simplest possible alternative. If a trained model cannot beat the Baseline, this indicates a problem with model configuration, data quality, or the forecasting task itself.
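
The benchmarking workflow described above can be sketched without any library code. The example below is an illustration under stated assumptions: the data values, the `candidate_forecast` list (standing in for the output of some trained model), and the helper `mean_absolute_error` are all hypothetical and written only for this example.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


history = [10.0, 12.0, 11.0]          # observed encoder window
actual_future = [12.0, 13.0, 11.0]    # values that later materialised

# Baseline: repeat the last observed value over the whole horizon.
baseline_forecast = [history[-1]] * len(actual_future)

# Hypothetical output of a trained model, for comparison only.
candidate_forecast = [11.5, 12.5, 11.5]

baseline_mae = mean_absolute_error(actual_future, baseline_forecast)
candidate_mae = mean_absolute_error(actual_future, candidate_forecast)
beats_baseline = candidate_mae < baseline_mae
```

In this toy case the candidate halves the baseline error; when `beats_baseline` comes out false on real data, the checks listed above (configuration, data quality, the task itself) are the natural next step.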

Theoretical Basis

Last-value-repeat forecast:

Given an encoder target sequence $x_{1}, x_{2}, \dots, x_{T}$ where $T$ is the encoder length for a given sample, the prediction for all future horizons $h = 1, \dots, H$ is:

$\hat{y}_{T+h} = x_{T}, \quad \forall h \in \{1, \dots, H\}$

Multi-target extension:

When multiple targets exist, the procedure is applied independently to each target variable:

$\hat{y}_{T+h}^{(k)} = x_{T}^{(k)}, \quad \forall k, \ \forall h \in \{1, \dots, H\}$
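
As a rough illustration of the multi-target formula, the sketch below applies the single-target rule to each target in turn. It uses a plain dictionary keyed by target name, which is an assumption made for readability; the helper names `repeat_last_value` and `multi_target_baseline` and the example targets are hypothetical.

```python
from typing import Dict, List, Sequence


def repeat_last_value(history: Sequence[float], horizon: int) -> List[float]:
    """Single-target rule: y_hat[T+h] = x[T] for h = 1..horizon."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


def multi_target_baseline(
    histories: Dict[str, Sequence[float]], horizon: int
) -> Dict[str, List[float]]:
    """Apply the single-target rule to every target independently."""
    return {name: repeat_last_value(series, horizon) for name, series in histories.items()}


forecasts = multi_target_baseline(
    {"sales": [5.0, 6.0, 8.0], "price": [1.2, 1.1]},
    horizon=3,
)
```

No information is shared between targets: each forecast depends only on the last value of its own series.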

Variable encoder lengths:

In a batch, each sample may have a different encoder length. The implementation uses per-sample indexing to select the correct last value:

for each sample i in batch:
    last_value = encoder_target[i, encoder_length[i] - 1]
    prediction[i, :] = last_value   # repeat for all H steps
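
The pseudocode above can be turned into a small runnable sketch. It uses nested Python lists instead of tensors and a hypothetical function name `baseline_forward`, so it should be read as an illustration of the indexing logic rather than as the library's code. The assumption that shorter samples are padded at the end of the encoder window is also made only for this example.

```python
from typing import List, Sequence


def baseline_forward(
    encoder_target: Sequence[Sequence[float]],
    encoder_lengths: Sequence[int],
    horizon: int,
) -> List[List[float]]:
    """Repeat each sample's last observed value across the horizon."""
    predictions = []
    for series, length in zip(encoder_target, encoder_lengths):
        if length < 1:
            raise ValueError("each sample needs at least one observed value")
        # Index by the sample's own length so trailing padding is ignored.
        last_value = series[length - 1]
        predictions.append([last_value] * horizon)
    return predictions


# Two samples padded to length 4; the second has only 2 real observations.
encoder_target = [[1.0, 2.0, 3.0, 4.0], [9.0, 8.0, 0.0, 0.0]]
encoder_lengths = [4, 2]
prediction = baseline_forward(encoder_target, encoder_lengths, horizon=3)
```

Indexing with `series[-1]` instead would pick up the padding value `0.0` for the second sample, which is why the per-sample length is needed.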

Properties:

Zero trainable parameters
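Optimal in the mean-squared-error sense under a driftless random walk assumption (i.e., $x_{t+1} = x_{t} + \epsilon_{t+1}$ with zero-mean noise that is independent of the past)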
Quantile output is trivially the point prediction (no uncertainty estimation)
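
The optimality property can be made explicit with a short derivation. It is a sketch under an assumption not spelled out above: the noise term, written here as $\epsilon_{t+1}$, has zero mean conditional on the past, i.e. $E[\epsilon_{t+1} \mid x_{1}, \dots, x_{t}] = 0$.

Unrolling the random walk $x_{t+1} = x_{t} + \epsilon_{t+1}$ from $T$ to $T + h$ gives:

$x_{T+h} = x_{T} + \sum_{j=1}^{h} \epsilon_{T+j}$

Taking the expectation conditional on the observed history, every noise term vanishes:

$E[x_{T+h} \mid x_{1}, \dots, x_{T}] = x_{T} + \sum_{j=1}^{h} E[\epsilon_{T+j} \mid x_{1}, \dots, x_{T}] = x_{T}$

Since the conditional mean minimizes expected squared error, $\hat{y}_{T+h} = x_{T}$ is the mean-squared-error-optimal forecast for every horizon $h$ under this assumption. For series with trend or seasonality the assumption fails, and the repeated last value is only a reference point.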

Related Pages

Implemented By

Implementation:Sktime_Pytorch_forecasting_Baseline
