Workflow: shiyu-coder/Kronos Batch Prediction
| Knowledge Sources | |
|---|---|
| Domains | Financial_Forecasting, Time_Series, LLMs |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
End-to-end process for generating candlestick forecasts on multiple financial time series simultaneously using GPU-parallel batch inference with the Kronos foundation model.
Description
This workflow extends the single-series prediction pipeline to handle multiple time series in a single batched forward pass. It loads the pre-trained KronosTokenizer and Kronos model, wraps them in a KronosPredictor, prepares lists of DataFrames and timestamp Series for multiple instruments, and calls the predict_batch method. The batch method stacks all series into a single tensor, runs autoregressive generation in parallel across the batch dimension, and returns per-series denormalized prediction DataFrames. Each series is independently normalized and denormalized, but tokenization and generation share GPU compute.
Usage
Execute this workflow when you need to forecast multiple financial instruments (or multiple time windows of the same instrument) simultaneously and want to leverage GPU parallelism. All series must have the same historical length (lookback) and the same prediction length (pred_len). This is suitable for portfolio-level forecasting, screening multiple assets, or generating predictions across multiple rolling windows.
Execution Steps
Step 1: Load Tokenizer and Model
Load a pre-trained KronosTokenizer and Kronos model from the HuggingFace Hub. This step is identical to the single-series workflow. The tokenizer and model are loaded once and reused across all series in the batch.
Key considerations:
- For batch prediction, a GPU device is strongly recommended for performance
- The tokenizer and model must be a matched pair
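A minimal loading sketch, assuming the Kronos repository's `model` package is importable and that the checkpoint names below match the published Hub repositories (treat the exact repo IDs as assumptions if your setup differs):

```python
# Sketch: load a matched tokenizer/model pair once; both are reused for every
# series in the batch. Assumes the Kronos repo's `model` package is on the path.
from model import Kronos, KronosTokenizer

tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
model = Kronos.from_pretrained("NeoQuasar/Kronos-small")
```

Loading happens once up front; the per-series cost in later steps is limited to normalization and tensor stacking.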
Step 2: Instantiate Predictor
Create a KronosPredictor with the loaded model, tokenizer, and target device. For batch prediction, explicitly specifying a CUDA device is recommended for efficient parallelism.
Key considerations:
- Set max_context to match the model variant (512 for small/base)
- GPU memory requirements scale linearly with batch size and sample_count
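A sketch of predictor construction, assuming the tokenizer and model from Step 1 are in scope; argument names follow the Kronos repo's single-series examples and should be checked against your version:

```python
# Sketch: wrap the loaded pair in a KronosPredictor pinned to one GPU.
from model import KronosPredictor

predictor = KronosPredictor(
    model,            # Kronos model loaded in Step 1
    tokenizer,        # matched KronosTokenizer loaded in Step 1
    device="cuda:0",  # a CUDA device is recommended for batch parallelism
    max_context=512,  # matches the small/base model variants
)
```

Keeping the whole batch on a single device avoids cross-device transfers during autoregressive generation; remember that memory grows with batch size times sample_count.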
Step 3: Prepare Batch Inputs
Construct three parallel lists: a list of DataFrames (one per series), a list of historical timestamp Series, and a list of future timestamp Series. Each DataFrame must contain at least open, high, low, close columns. All DataFrames must have identical row counts (same lookback), and all future timestamp Series must have length equal to pred_len.
Key considerations:
- All series must share the same historical length and prediction length
- Each series is independently normalized (instance-level mean/std)
- Volume and amount columns are optional; missing values are zero-filled per series
- The method validates input consistency and raises clear errors on mismatches
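The three parallel lists can be illustrated with synthetic data; the symbols, lengths, and column values below are invented for the example, but the layout (one DataFrame plus two timestamp Series per instrument, all with matching lengths) is what the batch method expects:

```python
# Build parallel input lists for three synthetic instruments with a shared
# lookback and pred_len, then apply the same consistency checks the method
# performs.
import numpy as np
import pandas as pd

lookback, pred_len = 400, 120
symbols = ["AAA", "BBB", "CCC"]  # hypothetical instruments

df_list, x_ts_list, y_ts_list = [], [], []
rng = np.random.default_rng(0)
for sym in symbols:
    ts = pd.date_range("2025-01-01", periods=lookback + pred_len, freq="5min")
    close = 100 + rng.standard_normal(lookback).cumsum()
    df = pd.DataFrame({
        "open": close + rng.normal(0, 0.1, lookback),
        "high": close + 0.5,
        "low": close - 0.5,
        "close": close,
        "volume": rng.integers(1_000, 10_000, lookback).astype(float),
        # "amount" omitted: optional columns are zero-filled per series
    })
    df_list.append(df)
    x_ts_list.append(pd.Series(ts[:lookback]))   # historical timestamps
    y_ts_list.append(pd.Series(ts[lookback:]))   # future timestamps

# Consistency checks mirroring the validation described above.
assert len({len(df) for df in df_list}) == 1       # identical lookback
assert all(len(y) == pred_len for y in y_ts_list)  # identical pred_len
```

Because each series is normalized with its own mean/std, instruments on very different price scales can share one batch without interfering with each other.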
Step 4: Generate Batch Forecast
Call predict_batch with the lists of inputs and sampling parameters. Internally, the method stacks all normalized series into a single batch tensor, runs autoregressive generation for the entire batch in parallel, and splits the results back into per-series predictions. Each series is denormalized with its own statistics.
Key considerations:
- GPU memory usage is proportional to batch_size multiplied by sample_count
- The method returns a list of DataFrames in the same order as input
- Temperature, top_p, and sample_count parameters apply uniformly to all series
- Progress bar (verbose=True) tracks autoregressive token generation across the batch
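The call itself can be sketched as follows, assuming the lists from Step 3 and the predictor from Step 2 are in scope; the keyword names (`T`, `top_p`, `sample_count`, `verbose`) mirror the single-series predict API and may differ slightly in your version of the repository:

```python
# Hedged sketch of the batched forecast call: one stacked forward pass over
# all series, returning per-series denormalized DataFrames in input order.
pred_df_list = predictor.predict_batch(
    df_list=df_list,
    x_timestamp_list=x_ts_list,
    y_timestamp_list=y_ts_list,
    pred_len=pred_len,
    T=1.0,           # sampling temperature, applied uniformly to all series
    top_p=0.9,       # nucleus sampling threshold
    sample_count=1,  # >1 draws multiple paths; memory scales with this
    verbose=True,    # progress bar over autoregressive generation steps
)
# pred_df_list[i] is the forecast for df_list[i], denormalized with that
# series' own statistics.
```

If GPU memory is tight, reduce `sample_count` first, then split the instrument list into smaller batches.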
Step 5: Process Results
Iterate through the returned list of prediction DataFrames to analyze, save, or visualize results for each individual series. Each DataFrame contains open, high, low, close, volume, amount columns indexed by the corresponding y_timestamp.
Key considerations:
- Results maintain the same ordering as the input lists
- Per-series post-processing (e.g., price limit clamping) can be applied independently
- For large batches, consider processing results in chunks to manage memory
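A post-processing sketch using synthetic stand-ins for the returned list (the symbol names and summary fields are illustrative, not part of the Kronos API); the key point is that output order matches input order, so zipping with the original symbol list is safe:

```python
# Iterate per-series prediction DataFrames (OHLCV indexed by y_timestamp),
# summarizing each independently; synthetic frames stand in for real output.
import numpy as np
import pandas as pd

pred_len = 120
symbols = ["AAA", "BBB"]  # hypothetical instruments, in input order
y_index = pd.date_range("2025-02-01", periods=pred_len, freq="5min")
rng = np.random.default_rng(1)
pred_df_list = [
    pd.DataFrame(
        {c: 100 + rng.standard_normal(pred_len).cumsum()
         for c in ["open", "high", "low", "close"]}
        | {"volume": rng.integers(1, 100, pred_len).astype(float)},
        index=y_index,
    )
    for _ in symbols
]

# Results keep input ordering, so zip with the symbol list and post-process
# each series independently (clamping, saving, plotting, ...).
summary = {}
for sym, pred_df in zip(symbols, pred_df_list):
    summary[sym] = {
        "last_close": float(pred_df["close"].iloc[-1]),
        "mean_volume": float(pred_df["volume"].mean()),
    }
    # pred_df.to_csv(f"{sym}_forecast.csv")  # or persist per series
```

For very large batches, the same loop can consume results in chunks (e.g. write each DataFrame to disk and drop the reference) rather than holding the whole list in memory.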