Heuristic:Shiyu coder Kronos Instance Normalization Clipping

From Leeroopedia



Knowledge Sources
Domains: Data_Preprocessing, Time_Series
Last Updated: 2026-02-09 13:47 GMT

Overview

Per-instance z-score normalization with a clip value of 5.0 to prevent outlier dominance in financial time-series inputs.

Description

Kronos applies instance-level (per-sample) normalization to all input time series before tokenization. Each sample is independently z-score normalized using its own column-wise mean and standard deviation, with a small epsilon of 1e-5 to prevent division by zero. The normalized values are then clipped to the range [-5.0, +5.0] to suppress extreme outliers. This normalization is applied consistently across inference (`KronosPredictor.predict`), training datasets (`QlibDataset`), and the autoregressive inference loop.

Usage

This heuristic is applied automatically in all Kronos data pipelines. Be aware of it when:

  • Debugging unexpected prediction behavior (check if input data has extreme values)
  • Finetuning on custom data (ensure your data follows OHLCV format for proper normalization)
  • Interpreting model outputs (predictions are denormalized using the original mean/std)

The Insight (Rule of Thumb)

  • Action: Normalize each input sample independently using z-score: `x = (x - mean) / (std + 1e-5)`, then clip to `[-clip, +clip]`.
  • Value: `clip=5.0` (default), `epsilon=1e-5`.
  • Trade-off: Extreme outliers beyond 5 standard deviations are truncated. This prevents gradient explosion but may lose information about extreme price moves.
  • Denormalization: Predictions are mapped back to original scale: `pred = pred * (std + 1e-5) + mean`.
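The rule of thumb above can be sketched as a self-contained round trip (a minimal sketch using hypothetical helper names, not the actual Kronos API; the constants are the documented defaults):

```python
import numpy as np

CLIP, EPS = 5.0, 1e-5  # defaults quoted from finetune/config.py

def normalize(x):
    """Per-sample z-score along the time axis, clipped to [-CLIP, +CLIP]."""
    mean, std = np.mean(x, axis=0), np.std(x, axis=0)
    z = np.clip((x - mean) / (std + EPS), -CLIP, CLIP)
    return z, mean, std

def denormalize(z, mean, std):
    """Map normalized predictions back to the original price scale."""
    return z * (std + EPS) + mean

# 50 quiet ticks around 100 plus one extreme spike.
x = np.append(np.full(50, 100.0), 200.0)
z, mean, std = normalize(x)
print(z.max())  # the spike lands beyond 5 sigma and is truncated to 5.0

x_back = denormalize(z, mean, std)
# Unclipped values round-trip exactly; the clipped spike does not.
print(np.allclose(x_back[:-1], x[:-1]), abs(x_back[-1] - x[-1]) > 1.0)
```

Note that the round trip is lossy for clipped values: anything truncated at ±5 cannot be recovered by denormalization, which is exactly the information-loss trade-off stated above.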

Reasoning

Financial time series exhibit non-stationarity with vastly different price scales across instruments (e.g., penny stocks vs. blue chips) and time periods. Instance normalization ensures the model processes scale-invariant features, allowing a single model to handle diverse assets. The clip value of 5.0 is generous (a two-sided ±5-sigma range covers roughly 99.99994% of a normal distribution) but prevents catastrophic outliers from dominating the BSQ tokenizer's codebook, which operates on a fixed range. The 1e-5 epsilon handles zero-variance edge cases (e.g., halted trading days with identical OHLC values).
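The epsilon edge case can be checked directly (a minimal sketch, not Kronos code): a halted-trading sample with identical values has zero standard deviation, and the 1e-5 term keeps the division finite, mapping the whole flat column to zeros.

```python
import numpy as np

EPS = 1e-5
halted = np.full(10, 42.0)              # e.g. a halted day: identical OHLC values
std = np.std(halted)                    # exactly 0.0 -- bare division would blow up
z = (halted - halted.mean()) / (std + EPS)
print(z)  # all zeros: a flat sample carries no scale information
```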

Evidence from `finetune/config.py:46`:

self.clip = 5.0  # Clipping value for normalized data to prevent outliers.

Normalization in inference from `model/kronos.py:544-547`:

x_mean, x_std = np.mean(x, axis=0), np.std(x, axis=0)
x = (x - x_mean) / (x_std + 1e-5)
x = np.clip(x, -self.clip, self.clip)

Denormalization of predictions from `model/kronos.py:556`:

preds = preds * (x_std + 1e-5) + x_mean

Same pattern in training dataset from `finetune/dataset.py:122-124`:

x_mean, x_std = np.mean(x, axis=0), np.std(x, axis=0)
x = (x - x_mean) / (x_std + 1e-5)
x = np.clip(x, -self.config.clip, self.config.clip)
