Principle:Shiyu coder Kronos Tokenizer Encoding

From Leeroopedia


Field Value
principle_name Tokenizer_Encoding
repo Shiyu_coder_Kronos
domains Quantization, Tokenization, Time_Series
last_updated 2026-02-09 14:00 GMT
implemented_by Implementation:Shiyu_coder_Kronos_KronosTokenizer_Encode

Summary

The encoder maps continuous OHLCV financial data to hierarchical discrete token indices via learned encoder Transformer layers and Binary Spherical Quantization.

Concept

The encoding process transforms continuous multivariate financial time series data into discrete token indices that can be consumed by the autoregressive Kronos Transformer. This discretization is the fundamental bridge between continuous price data and the discrete sequence modeling paradigm.

The encoder operates on normalized OHLCV features plus trade amount (open, high, low, close, volume, amount) and produces either:

  • A single combined index per timestep (when half=False), representing the full codebook entry.
  • A pair of hierarchical indices (s1, s2) per timestep (when half=True), representing coarse and fine quantization levels separately.

Theory

The encoding pipeline follows four stages:

Input x: (batch, seq_len, d_in)
    |
    v
Linear Embedding: nn.Linear(d_in -> d_model)
    |
    v
Encoder Transformer Blocks: (n_enc_layers - 1) TransformerBlock layers
    |
    v
Quantization Embedding: nn.Linear(d_model -> codebook_dim)
    where codebook_dim = s1_bits + s2_bits
    |
    v
BSQuantizer: Binary Spherical Quantization
    |
    v
Output: z_indices (discrete token indices)
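The four stages above can be sketched in PyTorch. This is an illustrative sketch only: the dimensions (d_in, d_model, s1_bits, s2_bits) are placeholder values, and nn.TransformerEncoderLayer stands in for the repository's own TransformerBlock.

```python
import torch
import torch.nn as nn

# Placeholder hyperparameters, not the repository's actual values.
d_in, d_model = 6, 32          # OHLCV + amount -> model width
s1_bits, s2_bits = 4, 4
codebook_dim = s1_bits + s2_bits

embed = nn.Linear(d_in, d_model)                  # 1. linear embedding
# 2. stand-in for the (n_enc_layers - 1) encoder TransformerBlocks
encoder = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
to_code = nn.Linear(d_model, codebook_dim)        # 3. quantization embedding

x = torch.randn(2, 16, d_in)                      # (batch, seq_len, d_in)
z = to_code(encoder(embed(x)))                    # continuous pre-quantization vectors
bits = torch.sign(z)                              # 4. BSQ binarization to {-1, +1}
z_q = bits / codebook_dim ** 0.5                  # project onto the unit hypersphere

print(z_q.shape)  # torch.Size([2, 16, 8]); each timestep vector has unit norm
```

Each quantized timestep vector lands on the unit hypersphere because its 8 components each have magnitude 1/sqrt(8).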

Binary Spherical Quantization (BSQ)

The BSQuantizer converts continuous vectors into binary codes on a hypersphere:

  1. The input vector (of dimension codebook_dim = s1_bits + s2_bits) is quantized to binary values {-1, +1}.
  2. Each bit position contributes to a binary code that indexes into an implicit codebook.
  3. The binary code is scaled by 1 / sqrt(codebook_dim) to project onto the unit hypersphere surface.
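The three BSQ steps can be demonstrated on a single vector. The mapping from signs to bits (+1 as bit 1, first dimension most significant) is an assumption made for illustration; the example input is arbitrary.

```python
import numpy as np

codebook_dim = 8  # illustrative: s1_bits + s2_bits = 4 + 4
v = np.array([0.3, -1.2, 0.7, -0.1, 2.0, -0.5, 0.9, -0.4])

bits = np.where(v >= 0, 1.0, -1.0)          # 1. quantize each dim to {-1, +1}
# 2. read the signs as a binary code indexing the implicit codebook
#    (+1 -> bit 1, -1 -> bit 0; bit ordering is an assumed convention)
code = int("".join("1" if b > 0 else "0" for b in bits), 2)
z_q = bits / np.sqrt(codebook_dim)          # 3. scale onto the unit hypersphere

print(code)                  # 170: one of the 2**8 implicit codebook entries
print(np.linalg.norm(z_q))   # 1.0: the code lies on the unit hypersphere
```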

Hierarchical Indices (half=True)

When half=True, the BSQuantizer splits the codebook dimension in half:

  • s1_indices: Index computed from the first s1_bits dimensions. Represents the coarse quantization.
  • s2_indices: Index computed from the last s2_bits dimensions. Represents the fine quantization.

The s1 vocabulary size is 2^s1_bits and the s2 vocabulary size is 2^s2_bits.
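A minimal sketch of the hierarchical split, assuming the same sign-to-bit convention as above (+1 as bit 1, first dimension most significant); the actual bit ordering inside BSQuantizer may differ.

```python
s1_bits, s2_bits = 4, 4
# {-1, +1} output of the quantizer for one timestep, len = s1_bits + s2_bits
bits = [1, -1, -1, 1, 1, 1, -1, 1]

def bits_to_index(bs):
    # Interpret +1 as bit 1, -1 as bit 0 (assumed convention).
    return int("".join("1" if b > 0 else "0" for b in bs), 2)

s1_index = bits_to_index(bits[:s1_bits])   # coarse: first s1_bits dims
s2_index = bits_to_index(bits[s1_bits:])   # fine: last s2_bits dims

print(s1_index, s2_index)  # 9 13, each drawn from a vocabulary of 2**4 = 16
```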

Combined Index (half=False)

When half=False, a single index is computed from all s1_bits + s2_bits dimensions, giving a vocabulary size of 2^(s1_bits + s2_bits).
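Continuing the sketch above: with half=False the same bit vector yields one flat index. Whether that index equals s1_index * 2**s2_bits + s2_index depends on the quantizer's internal bit ordering; that relationship is assumed here for illustration.

```python
s1_bits, s2_bits = 4, 4
bits = [1, -1, -1, 1, 1, 1, -1, 1]  # same example timestep as before

# One index over all s1_bits + s2_bits dimensions (assumed bit ordering).
combined = int("".join("1" if b > 0 else "0" for b in bits), 2)

print(combined)  # 157, out of a flat vocabulary of 2**(4 + 4) = 256 codes
```

Under this ordering the combined index is just the coarse and fine indices concatenated bitwise: 9 * 16 + 13 = 157.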

Training vs Inference Use

  • half=True is used during the autoregressive pipeline (both training and inference) because the Kronos model predicts s1 and s2 separately via the DualHead.
  • half=False is used for evaluation or when a single flat codebook index is sufficient.

Source

  • Repository: Kronos on GitHub
  • Binary Spherical Quantization for learned discrete representations.

Domains

  • Quantization: Binary spherical codebook for discrete representation.
  • Tokenization: Converting continuous signals to discrete tokens.
  • Time_Series: Applied to multivariate financial time series data.
