Principle: Shiyu-coder Kronos Qlib Test Inference
| Field | Value |
|---|---|
| principle_name | Qlib_Test_Inference |
| repository | https://github.com/shiyu-coder/Kronos |
| domains | Inference, Batch_Processing, Financial_Forecasting |
| implemented_by | Implementation:Shiyu_coder_Kronos_Generate_Predictions_Qlib |
| last_updated | 2026-02-09 14:00 GMT |
Summary
Batch inference over an entire test dataset to generate trading signal predictions for backtesting evaluation.
Concept
The Qlib Test Inference principle describes the end-to-end process of converting fine-tuned models into actionable trading signals. This is the bridge between model training and backtesting evaluation, where the model produces quantitative predictions for every symbol on every day in the test period.
Theory
The inference pipeline follows a structured sequence:
Model Loading
Both the fine-tuned tokenizer and predictor are loaded and set to eval mode. The tokenizer converts continuous inputs to discrete tokens, and the predictor generates future token predictions autoregressively.
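This loading step can be sketched with standard PyTorch; the Linear layers below are stand-ins for the real Kronos tokenizer and predictor, and the helper name is illustrative:

```python
import torch.nn as nn

def prepare_for_inference(*models, device="cpu"):
    """Move models to the device, switch to eval mode, and freeze parameters."""
    prepared = []
    for m in models:
        m = m.to(device).eval()          # eval: deterministic dropout/norm
        for p in m.parameters():
            p.requires_grad_(False)      # no gradients needed at test time
        prepared.append(m)
    return prepared

# Stand-ins for the fine-tuned tokenizer and predictor (illustrative only).
tokenizer, predictor = prepare_for_inference(nn.Linear(5, 8), nn.Linear(8, 8))
```

Freezing parameters and switching to eval mode together ensure that inference is deterministic with respect to the model (any remaining randomness comes from the sampling step, not the network).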
Sequential Sliding Window Dataset
Unlike training (which uses random sampling), the test dataset (QlibTestDataset) iterates sequentially through all valid sliding windows. For each window, it yields:
- Context features (lookback window)
- Context time stamps
- Future time stamps (for the prediction horizon)
- Symbol name and timestamp metadata (for mapping predictions back to the calendar)
The metadata is critical because predictions must be associated with specific (datetime, symbol) pairs for portfolio construction.
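The sequential iteration can be sketched as a plain generator. QlibTestDataset's actual field names and window bookkeeping differ; everything here is illustrative:

```python
import pandas as pd

def sliding_windows(frames, lookback, horizon):
    """Yield every valid window, in order, with metadata for mapping back.

    frames: dict mapping symbol -> DataFrame indexed by datetime with
    feature columns (an assumed layout, not the repository's exact one).
    """
    for symbol, df in frames.items():
        n = len(df)
        # Sequential (not randomly sampled) iteration over valid windows.
        for start in range(0, n - lookback - horizon + 1):
            ctx = df.iloc[start:start + lookback]
            fut_times = df.index[start + lookback:start + lookback + horizon]
            yield {
                "features": ctx.to_numpy(),   # context features
                "ctx_times": ctx.index,       # context time stamps
                "fut_times": fut_times,       # prediction-horizon stamps
                "symbol": symbol,             # metadata for (datetime, symbol)
                "timestamp": fut_times[0],    # first predicted date
            }
```

Carrying the symbol and timestamps alongside each window is what later allows predictions to be pivoted into a (datetime x symbol) score matrix.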
Autoregressive Inference
For each batch, the pipeline uses auto_regressive_inference() to generate predictions:
- The tokenizer encodes the context window into discrete tokens
- The predictor generates future tokens autoregressively, one step at a time
- Multiple samples are drawn (controlled by inference_sample_count) and averaged to reduce variance
- Temperature (inference_T) and nucleus sampling (inference_top_p) control generation diversity
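The sampling loop above can be sketched as follows. This is a hedged illustration, not the repository's auto_regressive_inference: the encode/step callables stand in for the tokenizer and predictor, and the real function works on batched tensors rather than token lists:

```python
import numpy as np

def auto_regressive_inference_sketch(encode, step, context, horizon,
                                     sample_count=3, T=1.0, top_p=0.9,
                                     rng=None):
    """Draw sample_count token paths autoregressively and average them."""
    rng = np.random.default_rng() if rng is None else rng
    samples = []
    for _ in range(sample_count):                 # multiple samples per window
        tokens = list(encode(context))            # tokenizer: context -> tokens
        for _ in range(horizon):                  # one future token per step
            logits = np.asarray(step(tokens), dtype=float) / T  # temperature
            probs = np.exp(logits - logits.max())
            probs /= probs.sum()
            # Nucleus (top-p): smallest set of top tokens whose mass >= top_p.
            order = np.argsort(probs)[::-1]
            cutoff = int(np.searchsorted(np.cumsum(probs[order]), top_p)) + 1
            keep = order[:cutoff]
            p = np.zeros_like(probs)
            p[keep] = probs[keep]
            p /= p.sum()
            tokens.append(int(rng.choice(len(p), p=p)))
        samples.append(tokens[-horizon:])
    return np.mean(samples, axis=0)               # average to reduce variance
```

Lower T sharpens the distribution toward the model's top choices; lower top_p truncates the tail more aggressively; both reduce diversity across the drawn samples.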
Signal Extraction
From the raw multi-feature predictions, trading signals are extracted by computing the close price delta (predicted close minus last observed close). Four signal variants are computed:
- mean: Average predicted close across the prediction horizon
- last: Predicted close on the final prediction day
- max: Maximum predicted close across the horizon
- min: Minimum predicted close across the horizon
Each variant captures a different aspect of the price trajectory and may perform differently in backtesting.
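The four variants reduce to a few array reductions over the predicted close path. A minimal helper, illustrative rather than the repository's code:

```python
import numpy as np

def extract_signals(pred_close, last_close):
    """Compute the four close-delta signal variants for one window.

    pred_close: 1-D sequence of predicted closes over the horizon.
    last_close: last observed close in the context window.
    """
    pred_close = np.asarray(pred_close, dtype=float)
    return {
        "mean": pred_close.mean() - last_close,  # average predicted close
        "last": pred_close[-1] - last_close,     # final prediction day
        "max": pred_close.max() - last_close,    # best case over horizon
        "min": pred_close.min() - last_close,    # worst case over horizon
    }
```

For a path that rises then dips, mean and max will disagree with last and min in sign, which is exactly why the variants can rank stocks differently in backtests.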
Pivot to DataFrames
The per-sample (timestamp, symbol, score) records are pivoted into DataFrames with:
- Index: datetime
- Columns: symbol names
- Values: prediction scores
This format is directly consumable by the backtesting framework.
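In pandas this pivot is a single call; the records below are made-up example values:

```python
import pandas as pd

records = [
    ("2024-01-02", "AAPL", 0.8),
    ("2024-01-02", "MSFT", -0.3),
    ("2024-01-03", "AAPL", 0.1),
    ("2024-01-03", "MSFT", 0.5),
]
df = pd.DataFrame(records, columns=["datetime", "symbol", "score"])

# Rows become dates, columns become symbols, cells hold prediction scores.
signal = df.pivot(index="datetime", columns="symbol", values="score")
```

Dates on which a symbol has no prediction simply become NaN cells, which most backtesting frameworks treat as "no position".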
Key Design Decisions
- Custom collate function: Required because each batch contains a mix of tensors (features), strings (symbols), and Timestamp objects that cannot be handled by PyTorch's default collation
- Batch size adjustment: The DataLoader batch size is divided by sample_count because auto_regressive_inference internally expands each sample by the sample count
- Instance normalization: Applied identically to training to ensure consistency
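A sketch of such a collate function, assuming each dataset item is a dict mixing tensors with string and Timestamp metadata (field names are illustrative):

```python
import torch

def mixed_collate(batch):
    """Stack tensor fields; pass non-tensor metadata through as plain lists.

    PyTorch's default collation raises on types it does not recognize
    (such as pandas Timestamps), so symbols and timestamps are kept as
    Python lists rather than converted to tensors.
    """
    out = {}
    for key in batch[0]:
        values = [item[key] for item in batch]
        if isinstance(values[0], torch.Tensor):
            out[key] = torch.stack(values)   # (batch, ...) tensor
        else:
            out[key] = values                # symbols, Timestamps, etc.
    return out
```

Passed as collate_fn to the DataLoader, this keeps the tensor inputs batched for the model while preserving the per-sample metadata needed to map predictions back to (datetime, symbol) pairs.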
Domains
- Inference: Batch model prediction generation
- Batch_Processing: Efficient DataLoader-based processing of large test sets
- Financial_Forecasting: Time series price prediction for trading signals