Principle:Shiyu coder Kronos Candlestick Data Preparation
| Field | Value |
|---|---|
| Principle Name | Candlestick_Data_Preparation |
| Repository | Shiyu_coder_Kronos |
| Repository URL | https://github.com/shiyu-coder/Kronos |
| Domains | Data_Preparation, Financial_Data, Time_Series |
| Implemented By | Implementation:Shiyu_coder_Kronos_Candlestick_Data_Preparation_Pattern |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
This principle describes how to load and structure OHLCV financial time series data from CSV files into the format required by KronosPredictor, including the construction of separate DataFrames and timestamp Series for the historical input window and the future prediction horizon.
Concept
The KronosPredictor requires three distinct inputs for generating forecasts:
- x_df: A DataFrame of historical OHLCV + amount features (the lookback window)
- x_timestamp: A datetime Series corresponding to the historical data rows
- y_timestamp: A datetime Series for the future prediction horizon
These must be extracted from a CSV file containing candlestick (OHLCV) data with timestamps.
Theory
The data preparation follows a specific sequence:
1. Load CSV and Parse Timestamps
Read the CSV file and convert the timestamps column to pandas datetime objects. Parsed datetimes can be compared and sorted chronologically, and they enable datetime-based indexing. Parsing alone does not reorder the rows.
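As a minimal sketch of this step, the snippet below parses a small inline CSV. The inline data and the explicit sort are illustrative assumptions; a real workflow would pass a file path to `pd.read_csv`, and the file may already be in chronological order.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file on disk.
# The rows are deliberately out of order to show that parsing does not sort.
csv_text = """timestamps,open,high,low,close,volume,amount
2024-01-02 09:35:00,10.0,10.2,9.9,10.1,1000,10100.0
2024-01-02 09:30:00,9.9,10.1,9.8,10.0,1200,12000.0
2024-01-02 09:40:00,10.1,10.3,10.0,10.2,900,9180.0
"""

df = pd.read_csv(io.StringIO(csv_text))

# Convert the string column to datetime64 values.
df["timestamps"] = pd.to_datetime(df["timestamps"])

# Sort chronologically and rebuild a 0..n-1 index so that
# positional slicing later matches temporal order.
df = df.sort_values("timestamps").reset_index(drop=True)
```

Resetting the index after sorting keeps positional and label-based slicing in agreement, which matters for the window slicing in step 3.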
2. Define Window Parameters
Two key parameters control the data slicing:
- lookback: The number of historical time steps to provide as input context (e.g., 400)
- pred_len: The number of future time steps to predict (e.g., 120)
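When the prediction-horizon timestamps are also taken from the CSV (as in a backtest), the file needs at least lookback + pred_len rows. The helper below is a hypothetical guard, not part of Kronos, that makes this requirement explicit before slicing.

```python
def check_window(n_rows: int, lookback: int, pred_len: int) -> None:
    """Raise ValueError if the data cannot supply both windows.

    Hypothetical helper for illustration; the name and behavior are
    assumptions, not part of the Kronos API.
    """
    if lookback <= 0 or pred_len <= 0:
        raise ValueError("lookback and pred_len must be positive")
    needed = lookback + pred_len
    if n_rows < needed:
        raise ValueError(
            f"need at least {needed} rows (lookback={lookback} + "
            f"pred_len={pred_len}), got {n_rows}"
        )


# Example values from the text: 400 steps of context, 120 steps ahead.
check_window(n_rows=520, lookback=400, pred_len=120)  # passes silently
```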
3. Slice Input Data
From the full DataFrame, extract three components:
- x_df: A DataFrame containing the lookback window of OHLCV + amount columns. This is the historical data the model uses as context.
- x_timestamp: The corresponding datetime values for the historical window.
- y_timestamp: The datetime values for the prediction horizon (the future time steps the model will forecast).
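The slicing can be sketched as follows. The synthetic 5-minute data is an assumption used only to make the example self-contained; the sketch also assumes rows are in chronological order, so that positional slices correspond to time windows.

```python
import numpy as np
import pandas as pd

lookback, pred_len = 400, 120
n = lookback + pred_len

# Synthetic candlestick data (illustrative only, not real market data).
rng = np.random.default_rng(0)
close = 100 + rng.normal(0, 1, n).cumsum()
df = pd.DataFrame({
    "timestamps": pd.date_range("2024-01-01", periods=n, freq="5min"),
    "open": close,
    "high": close + 0.5,
    "low": close - 0.5,
    "close": close,
    "volume": rng.integers(100, 1000, n).astype(float),
})
df["amount"] = df["close"] * df["volume"]

feature_cols = ["open", "high", "low", "close", "volume", "amount"]

# Historical context: the first `lookback` rows of the feature columns.
x_df = df.iloc[:lookback][feature_cols]
# Timestamps aligned row-for-row with x_df.
x_timestamp = df.iloc[:lookback]["timestamps"]
# Timestamps of the next `pred_len` rows, i.e. the prediction horizon.
y_timestamp = df.iloc[lookback:lookback + pred_len]["timestamps"]
```

Here `x_df` has 400 rows and 6 columns, and `y_timestamp` starts immediately after the last entry of `x_timestamp`, so the two windows are contiguous and do not overlap.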
4. Column Requirements
The six required feature columns are:
- open -- Opening price
- high -- Highest price
- low -- Lowest price
- close -- Closing price
- volume -- Trading volume
- amount -- Trading amount
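A quick check that a loaded DataFrame carries these columns might look like the sketch below. `missing_columns` is a hypothetical helper introduced for illustration, not a Kronos function.

```python
import pandas as pd

REQUIRED_COLUMNS = ["open", "high", "low", "close", "volume", "amount"]


def missing_columns(df: pd.DataFrame, required=REQUIRED_COLUMNS) -> list:
    """Return the required column names absent from df, in listed order."""
    return [col for col in required if col not in df.columns]


# A frame that lacks the `amount` column.
partial = pd.DataFrame({
    "open": [1.0], "high": [1.2], "low": [0.9],
    "close": [1.1], "volume": [100.0],
})
missing = missing_columns(partial)

# A frame with all six columns present.
complete = partial.assign(amount=[110.0])
nothing_missing = missing_columns(complete)
```

Running the check right after loading surfaces schema problems before the data reaches the predictor.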
5. Temporal Alignment
The y_timestamp Series provides the model with information about the temporal positions of the prediction targets. This enables the model to generate time-aware predictions that respect market calendar patterns (e.g., trading hours, weekdays).
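When the CSV contains no rows for the forecast period, the future timestamps have to be generated. The sketch below assumes daily bars on a Monday-to-Friday calendar and uses `pd.bdate_range`, which skips weekends but knows nothing about exchange holidays; intraday data or a specific exchange calendar would need a different approach.

```python
import pandas as pd

pred_len = 5

# Assumed historical timestamps: 10 consecutive business days.
x_timestamp = pd.Series(pd.bdate_range("2024-01-01", periods=10))

# Start the horizon one business day after the last historical bar.
last = x_timestamp.iloc[-1]
future = pd.bdate_range(start=last + pd.offsets.BDay(1), periods=pred_len)

# Wrap the generated index in a Series, matching the type of x_timestamp.
y_timestamp = pd.Series(future)
```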
Relationship to KronosPredictor
The prepared data is passed directly to the KronosPredictor.predict() method:
    pred_df = predictor.predict(
        df=x_df,
        x_timestamp=x_timestamp,
        y_timestamp=y_timestamp,
        pred_len=pred_len,
        T=1.0,
        top_p=0.9,
        sample_count=1,
        verbose=True
    )
The predictor handles internal normalization, tokenization, and autoregressive generation.
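To give a sense of what normalization means here, the following is a generic per-column z-score sketch with its inverse. It is an illustration of the general technique only; the function names, the epsilon value, and the exact scheme are assumptions and are not taken from the Kronos source.

```python
import numpy as np


def zscore(x: np.ndarray, eps: float = 1e-5):
    """Standardize each column using its own mean and standard deviation."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps), mean, std


def inverse_zscore(z: np.ndarray, mean, std, eps: float = 1e-5):
    """Map standardized values back to the original scale."""
    return z * (std + eps) + mean


# Toy window with two columns on very different scales (price, volume).
window = np.array([[10.0, 100.0], [11.0, 120.0], [12.0, 140.0]])
z, mean, std = zscore(window)
restored = inverse_zscore(z, mean, std)
```

Standardizing each column puts prices and volumes on a comparable scale, and applying the inverse with the same statistics returns values to their original units.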
See Also
- Implementation:Shiyu_coder_Kronos_Candlestick_Data_Preparation_Pattern -- The canonical code pattern implementing this principle
- Principle:Shiyu_coder_Kronos_Prediction_Visualization -- Visualizing predictions after data preparation and inference