Implementation:Sktime Pytorch forecasting Optimize Hyperparameters
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Hyperparameter_Tuning, AutoML |
| Last Updated | 2026-02-08 07:00 GMT |
Overview
A concrete tool from the pytorch-forecasting library for automated Temporal Fusion Transformer (TFT) hyperparameter optimization using Optuna.
Description
The optimize_hyperparameters function orchestrates a full Optuna-based hyperparameter search for the Temporal Fusion Transformer. It creates an Optuna study and defines an internal objective function that (1) samples hyperparameters from the configured ranges, (2) creates a TFT model, (3) optionally runs learning-rate finding via Tuner.lr_find, (4) trains with Trainer.fit for up to max_epochs, and (5) returns the best validation loss. The function supports resuming existing studies, custom pruners, and timeout-based stopping. The default search ranges cover gradient_clip_val, hidden_size, hidden_continuous_size, attention_head_size, dropout, and learning_rate.
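The loop this function automates is roughly the following hand-rolled Optuna search. This is a simplified sketch, not the library's actual implementation; the real version also wires up checkpointing, logging, pruning callbacks, and the optional learning-rate finder. The names training, train_dataloader, and val_dataloader are assumed to exist.
import optuna
from lightning.pytorch import Trainer  # pytorch_lightning in older versions
from pytorch_forecasting import TemporalFusionTransformer

def objective(trial: optuna.Trial) -> float:
    # (1) sample hyperparameters from the configured ranges
    params = dict(
        hidden_size=trial.suggest_int("hidden_size", 16, 265),
        attention_head_size=trial.suggest_int("attention_head_size", 1, 4),
        dropout=trial.suggest_float("dropout", 0.1, 0.3),
        learning_rate=trial.suggest_float("learning_rate", 1e-5, 1.0, log=True),
    )
    # (2) create a TFT model from the training dataset definition
    model = TemporalFusionTransformer.from_dataset(training, **params)
    # (4) train for up to max_epochs, then (5) report the validation loss
    trainer = Trainer(
        max_epochs=20,
        gradient_clip_val=trial.suggest_float("gradient_clip_val", 0.01, 100.0, log=True),
        enable_progress_bar=False,
    )
    trainer.fit(model, train_dataloaders=train_dataloader, val_dataloaders=val_dataloader)
    return trainer.callback_metrics["val_loss"].item()

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=100, timeout=3600 * 8)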
Usage
Call this function after constructing training and validation DataLoaders; it requires pre-built DataLoaders, not TimeSeriesDataSet objects (see the sketch below for one way to build them). Pass hyperparameter ranges to control the search space. The returned Optuna Study can be queried for the best parameters via study.best_trial.params. This function is specific to the Temporal Fusion Transformer.
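A minimal sketch of preparing the required inputs, assuming training and validation are already-built TimeSeriesDataSet objects (the variable names are illustrative):
# Build the DataLoaders that optimize_hyperparameters requires.
batch_size = 128
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=0)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size * 10, num_workers=0)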
Code Reference
Source Location
- Repository: pytorch-forecasting
- File: pytorch_forecasting/models/temporal_fusion_transformer/tuning.py
- Lines: L46-257
Signature
def optimize_hyperparameters(
train_dataloaders: DataLoader,
val_dataloaders: DataLoader,
model_path: str,
max_epochs: int = 20,
n_trials: int = 100,
timeout: float = 3600 * 8.0,
gradient_clip_val_range: tuple[float, float] = (0.01, 100.0),
hidden_size_range: tuple[int, int] = (16, 265),
hidden_continuous_size_range: tuple[int, int] = (8, 64),
attention_head_size_range: tuple[int, int] = (1, 4),
dropout_range: tuple[float, float] = (0.1, 0.3),
learning_rate_range: tuple[float, float] = (1e-5, 1.0),
use_learning_rate_finder: bool = True,
trainer_kwargs: dict[str, Any] = {},
log_dir: str = "lightning_logs",
study=None,
verbose: int | bool = None,
pruner=None,
**kwargs,
) -> "optuna.Study":
"""
Optimize Temporal Fusion Transformer hyperparameters.
Args:
train_dataloaders: training DataLoader
val_dataloaders: validation DataLoader
model_path: folder for model checkpoints
max_epochs: max epochs per trial (default: 20)
n_trials: number of Optuna trials (default: 100)
timeout: max seconds (default: 28800 = 8h)
use_learning_rate_finder: use LR finder per trial (default: True)
study: existing Optuna study to resume
pruner: Optuna pruner (default: MedianPruner)
**kwargs: passed to TemporalFusionTransformer constructor
Returns:
optuna.Study with all trial results
"""
Import
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import optimize_hyperparameters
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| train_dataloaders | DataLoader | Yes | Training DataLoader |
| val_dataloaders | DataLoader | Yes | Validation DataLoader |
| model_path | str | Yes | Directory for saving trial checkpoints |
| n_trials | int | No | Number of optimization trials (default: 100) |
| timeout | float | No | Maximum search time in seconds (default: 28800) |
| max_epochs | int | No | Maximum epochs per trial (default: 20) |
| hidden_size_range | tuple[int, int] | No | Search range for hidden_size (default: (16, 265)) |
| dropout_range | tuple[float, float] | No | Search range for dropout (default: (0.1, 0.3)) |
| learning_rate_range | tuple[float, float] | No | Search range for LR (default: (1e-5, 1.0)) |
| use_learning_rate_finder | bool | No | Run LR finder per trial (default: True) |
Outputs
| Name | Type | Description |
|---|---|---|
| return | optuna.Study | Completed study with .best_trial.params containing best hyperparameters |
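Beyond best_trial, the returned study carries the full trial history; Optuna's trials_dataframe can be used to inspect it (a sketch, assuming a completed study object):
# Inspect the full search history of a completed study.
df = study.trials_dataframe()
print(df[["number", "value", "state"]].sort_values("value").head())
print(study.best_trial.params)  # best hyperparameter combination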
Usage Examples
Run Hyperparameter Optimization
from pytorch_forecasting import TemporalFusionTransformer
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import (
    optimize_hyperparameters,
)
# Run optimization
study = optimize_hyperparameters(
train_dataloaders=train_dataloader,
val_dataloaders=val_dataloader,
model_path="optuna_test",
n_trials=200,
max_epochs=50,
gradient_clip_val_range=(0.01, 1.0),
hidden_size_range=(8, 128),
hidden_continuous_size_range=(8, 128),
attention_head_size_range=(1, 4),
learning_rate_range=(0.001, 0.1),
dropout_range=(0.1, 0.3),
trainer_kwargs=dict(limit_train_batches=30),
reduce_on_plateau_patience=4,
)
# Get best hyperparameters
print(f"Best trial: {study.best_trial.number}")
print(f"Best val loss: {study.best_trial.value:.4f}")
print(f"Best params: {study.best_trial.params}")
# Create final model with the best parameters.
# gradient_clip_val is a Trainer argument, not a model argument, so remove it
# here and pass it to the Trainer used for final training instead.
best_params = dict(study.best_trial.params)
gradient_clip_val = best_params.pop("gradient_clip_val", None)
best_tft = TemporalFusionTransformer.from_dataset(
    training,
    **best_params,
)
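Because a full search is expensive, it is common to persist the study and resume it later. A sketch using pickle (studies created with Optuna's default in-memory storage are picklable) together with the study argument documented above:
import pickle

# Save the study so the search can be inspected or extended later.
with open("tft_study.pkl", "wb") as fout:
    pickle.dump(study, fout)

# Reload and pass it back in to continue the search where it left off.
with open("tft_study.pkl", "rb") as fin:
    study = pickle.load(fin)
study = optimize_hyperparameters(
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
    model_path="optuna_test",
    study=study,
    n_trials=50,
)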
Related Pages
Implements Principle
Requires Environment
- Environment:Sktime_Pytorch_forecasting_Core_Python_Dependencies
- Environment:Sktime_Pytorch_forecasting_Optuna_Tuning_Dependencies