Heuristic:Shiyu coder Kronos Instance Normalization Clipping
| Knowledge Sources | |
|---|---|
| Domains | Data_Preprocessing, Time_Series |
| Last Updated | 2026-02-09 13:47 GMT |
Overview
Per-instance z-score normalization followed by clipping at ±5.0, so that outliers do not dominate financial time series inputs.
Description
Kronos applies instance-level (per-sample) normalization to all input time series before tokenization. Each sample is independently z-score normalized using its own column-wise mean and standard deviation, with a small epsilon of 1e-5 to prevent division by zero. The normalized values are then clipped to the range [-5.0, +5.0] to suppress extreme outliers. This normalization is applied consistently across inference (`KronosPredictor.predict`), training datasets (`QlibDataset`), and the autoregressive inference loop.
Usage
This heuristic is applied automatically in all Kronos data pipelines. It is worth understanding when:
- Debugging unexpected prediction behavior (check if input data has extreme values)
- Finetuning on custom data (ensure your data follows OHLCV format for proper normalization)
- Interpreting model outputs (predictions are denormalized using the original mean/std)
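For the debugging case above, one quick check is to measure how much of an input window the clip would actually alter. The following is a minimal sketch, not part of Kronos: the helper name `clipped_fraction` and the synthetic data are hypothetical, and the normalization simply mirrors the formula described on this page.

```python
import numpy as np

def clipped_fraction(x, clip=5.0, eps=1e-5):
    """Per-column fraction of values whose z-score magnitude exceeds `clip`.

    x: array of shape (time_steps, features), e.g. OHLCV columns.
    Hypothetical diagnostic helper; it is not a Kronos API.
    """
    x = np.asarray(x, dtype=np.float64)
    # Same per-sample, column-wise z-score as described on this page.
    z = (x - x.mean(axis=0)) / (x.std(axis=0) + eps)
    return (np.abs(z) > clip).mean(axis=0)

# Synthetic window: 200 steps, 2 columns, with one spike injected in column 1.
rng = np.random.default_rng(0)
window = rng.normal(loc=100.0, scale=1.0, size=(200, 2))
window[50, 1] = 1000.0  # stands in for a bad tick or an extreme move

frac = clipped_fraction(window)
print(frac)
```

In this synthetic window only the injected spike is expected to exceed the clip, so a non-zero entry in `frac` points at the column worth inspecting before blaming the model.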
The Insight (Rule of Thumb)
- Action: Normalize each input sample independently using z-score: `x = (x - mean) / (std + 1e-5)`, then clip to `[-clip, +clip]`.
- Value: `clip=5.0` (default), `epsilon=1e-5`.
- Trade-off: Normalized values more than 5 standard deviations from the sample mean are clipped to ±5. This helps prevent gradient explosion, but the true magnitude of extreme price moves is lost and cannot be recovered by denormalization.
- Denormalization: Predictions are mapped back to original scale: `pred = pred * (std + 1e-5) + mean`.
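The rule above can be sketched as a normalize/denormalize pair. This is an illustrative sketch rather than the Kronos source: the function names `normalize` and `denormalize` and the synthetic arrays are assumptions, while the arithmetic follows the formulas in the list.

```python
import numpy as np

CLIP = 5.0  # default clip value stated on this page
EPS = 1e-5  # epsilon stated on this page

def normalize(x, clip=CLIP, eps=EPS):
    """Column-wise z-score over the time axis, then clip.

    Returns the normalized array plus the statistics needed to undo it.
    """
    mean, std = np.mean(x, axis=0), np.std(x, axis=0)
    z = np.clip((x - mean) / (std + eps), -clip, clip)
    return z, mean, std

def denormalize(z, mean, std, eps=EPS):
    """Map normalized values back to the original scale."""
    return z * (std + eps) + mean

rng = np.random.default_rng(0)
cheap = rng.normal(1.0, 0.05, size=(64, 5))  # stand-in for a low-priced asset
expensive = cheap * 1000.0                   # same series at 1000x the scale

z_cheap, mean, std = normalize(cheap)
z_exp, _, _ = normalize(expensive)
restored = denormalize(z_cheap, mean, std)

# A constant column has zero variance; epsilon keeps the division finite.
flat = np.full((64, 1), 3.0)
z_flat, _, _ = normalize(flat)
```

Both series map to nearly the same normalized values, differing only by the small effect of epsilon, and `restored` matches `cheap` as long as nothing was clipped. Values that were clipped do not round-trip, which is the trade-off noted in the list.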
Reasoning
Financial time series are non-stationary, with vastly different price scales across instruments (e.g., penny stocks vs. blue chips) and time periods. Instance normalization gives the model scale-invariant features, allowing a single model to handle diverse assets. The clip value of 5.0 is generous (under a normal distribution, about 99.99994% of values fall within ±5 standard deviations) but prevents catastrophic outliers from dominating the BSQ tokenizer's codebook, which operates on a fixed range. The 1e-5 epsilon handles zero-variance edge cases, where a column is constant across the whole sample (e.g., prices that do not move during a trading halt).
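How generous a clip of 5.0 is under a normal assumption can be checked directly. The sketch below uses only the Python standard library; the function names are illustrative, and real financial returns are typically heavier-tailed than a normal distribution, so the actual clipped share may be larger.

```python
import math

def normal_coverage(k):
    """Probability that a standard normal value lies within [-k, +k]."""
    return math.erf(k / math.sqrt(2.0))

def normal_tail(k):
    """Probability that a standard normal value lies outside [-k, +k]."""
    return math.erfc(k / math.sqrt(2.0))

# Two-sided tail mass beyond 5 standard deviations (on the order of 6e-7).
tail_5 = normal_tail(5.0)
print(f"two-sided tail beyond 5 sigma: {tail_5:.3e}")
print(f"coverage within 5 sigma: {normal_coverage(5.0):.7%}")
```

Using `math.erfc` for the tail avoids the precision loss of computing `1 - erf(...)` when the result is this close to zero.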
Evidence from `finetune/config.py:46`:
self.clip = 5.0 # Clipping value for normalized data to prevent outliers.
Normalization in inference from `model/kronos.py:544-547`:
x_mean, x_std = np.mean(x, axis=0), np.std(x, axis=0)
x = (x - x_mean) / (x_std + 1e-5)
x = np.clip(x, -self.clip, self.clip)
Denormalization of predictions from `model/kronos.py:556`:
preds = preds * (x_std + 1e-5) + x_mean
Same pattern in training dataset from `finetune/dataset.py:122-124`:
x_mean, x_std = np.mean(x, axis=0), np.std(x, axis=0)
x = (x - x_mean) / (x_std + 1e-5)
x = np.clip(x, -self.config.clip, self.config.clip)
Related Pages
- Implementation:Shiyu_coder_Kronos_KronosPredictor_Predict
- Implementation:Shiyu_coder_Kronos_KronosPredictor_Predict_Batch
- Implementation:Shiyu_coder_Kronos_QlibDataset_Usage
- Implementation:Shiyu_coder_Kronos_CustomKlineDataset_Usage
- Principle:Shiyu_coder_Kronos_Single_Series_Forecasting
- Principle:Shiyu_coder_Kronos_Batch_Forecasting