Heuristic:Shiyu coder Kronos Instance Normalization Clipping

Knowledge Sources	Kronos Internal
Domains	Data_Preprocessing, Time_Series
Last Updated	2026-02-09 13:47 GMT

Overview

Per-instance z-score normalization with a clip value of 5.0, which keeps outliers from dominating financial time series inputs.

Description

Kronos applies instance-level (per-sample) normalization to all input time series before tokenization. Each sample is independently z-score normalized using its own column-wise mean and standard deviation, with a small epsilon of 1e-5 to prevent division by zero. The normalized values are then clipped to the range [-5.0, +5.0] to suppress extreme outliers. This normalization is applied consistently across inference (`KronosPredictor.predict`), training datasets (`QlibDataset`), and the autoregressive inference loop.
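The steps described above can be sketched as a small standalone function. This is a minimal illustration, not the Kronos implementation itself: the function name `instance_normalize`, the `(time_steps, features)` input shape, and the two synthetic price series are assumptions made for the example.

```python
import numpy as np

def instance_normalize(x, clip=5.0, eps=1e-5):
    """Z-score each column of one sample over time, then clip.

    x is assumed to have shape (time_steps, features), e.g. OHLCV columns.
    Returns the normalized array plus the statistics needed to invert it.
    """
    x = np.asarray(x, dtype=np.float64)
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    z = (x - mean) / (std + eps)
    return np.clip(z, -clip, clip), mean, std

# Two hypothetical instruments on very different price scales.
rng = np.random.default_rng(0)
penny = 0.5 + 0.01 * rng.standard_normal((64, 5))
blue_chip = 500.0 + 10.0 * rng.standard_normal((64, 5))

z_penny, _, _ = instance_normalize(penny)
z_blue, _, _ = instance_normalize(blue_chip)
```

After normalization both samples have roughly zero mean and unit standard deviation per column, so the downstream tokenizer sees inputs on a comparable scale regardless of the raw price level.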

Usage

This heuristic is applied automatically in all Kronos data pipelines. Understand it when:

Debugging unexpected prediction behavior (check if input data has extreme values)
Finetuning on custom data (ensure your data follows OHLCV format for proper normalization)
Interpreting model outputs (predictions are denormalized using the original mean/std)
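When debugging, one quick check is to measure how much of an input window would be clipped. The helper below is a hypothetical diagnostic (the name `clipped_fraction` and the synthetic window are assumptions, not part of Kronos); it reuses the same z-score formula and reports the per-column share of points outside the clip range.

```python
import numpy as np

def clipped_fraction(x, clip=5.0, eps=1e-5):
    """Per-column share of points whose z-score falls outside [-clip, clip]."""
    x = np.asarray(x, dtype=np.float64)
    z = (x - x.mean(axis=0)) / (x.std(axis=0) + eps)
    return (np.abs(z) > clip).mean(axis=0)

# Hypothetical window: a flat series with one extreme spike in column 0.
window = np.ones((200, 2))
window[100, 0] = 1000.0
frac = clipped_fraction(window)
```

A non-zero fraction in a column indicates that some of its values will be flattened to the clip boundary before the model sees them, which is worth ruling out before investigating other causes of odd predictions.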

The Insight (Rule of Thumb)

Action: Normalize each input sample independently using z-score: `x = (x - mean) / (std + 1e-5)`, then clip to `[-clip, +clip]`.
Value: `clip=5.0` (default), `epsilon=1e-5`.
Trade-off: Values more than 5 standard deviations from the sample mean are clipped to ±5. This prevents gradient explosion, but the true magnitude of an extreme price move is lost and is not recovered by denormalization.
Denormalization: Predictions are mapped back to original scale: `pred = pred * (std + 1e-5) + mean`.
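The trade-off can be seen in a small round trip. The sketch below uses a hypothetical single-column window with one spike (the data are made up for illustration) and applies the normalization, clipping, and denormalization formulas listed above.

```python
import numpy as np

clip, eps = 5.0, 1e-5

# Hypothetical close-price window with one extreme spike.
x = np.full((200, 1), 100.0)
x[50, 0] = 400.0

mean, std = x.mean(axis=0), x.std(axis=0)
z = np.clip((x - mean) / (std + eps), -clip, clip)

# Same inverse mapping as the denormalization rule above.
restored = z * (std + eps) + mean
```

Ordinary points map back to their original value, while the spike comes back as `mean + clip * (std + eps)`, well below the original 400. The inverse mapping is exact only for values that were not clipped.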

Reasoning

Financial time series are non-stationary, with vastly different price scales across instruments (e.g., penny stocks vs. blue chips) and time periods. Instance normalization ensures the model processes scale-invariant features, allowing a single model to handle diverse assets. The clip value of 5.0 is generous (about 99.99994% of a normal distribution lies within ±5 standard deviations) but prevents catastrophic outliers from dominating the BSQ tokenizer's codebook, which operates on a fixed range. The 1e-5 epsilon handles zero-variance edge cases (e.g., a column that stays constant over the whole window, as during a trading halt).
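The role of the epsilon term is easiest to see on a window with a constant column. The example below is an illustrative sketch with made-up values: the first column has zero variance, so without the `1e-5` term the division would be 0/0.

```python
import numpy as np

eps = 1e-5

# Hypothetical window: price column is constant, volume column varies.
x = np.array([[10.0, 100.0],
              [10.0, 300.0],
              [10.0, 200.0]])

mean, std = x.mean(axis=0), x.std(axis=0)   # std[0] is exactly 0.0
z = (x - mean) / (std + eps)                # finite thanks to eps
restored = z * (std + eps) + mean           # inverse uses the same eps
```

The constant column normalizes to all zeros rather than NaN, and because denormalization multiplies by the same `std + eps`, the original values are recovered for both columns.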

Evidence from `finetune/config.py:46`:

self.clip = 5.0  # Clipping value for normalized data to prevent outliers.

Normalization in inference from `model/kronos.py:544-547`:

x_mean, x_std = np.mean(x, axis=0), np.std(x, axis=0)
x = (x - x_mean) / (x_std + 1e-5)
x = np.clip(x, -self.clip, self.clip)

Denormalization of predictions from `model/kronos.py:556`:

preds = preds * (x_std + 1e-5) + x_mean

Same pattern in training dataset from `finetune/dataset.py:122-124`:

x_mean, x_std = np.mean(x, axis=0), np.std(x, axis=0)
x = (x - x_mean) / (x_std + 1e-5)
x = np.clip(x, -self.config.clip, self.config.clip)
