Implementation:Shiyu coder Kronos KronosTokenizer Encode

From Leeroopedia


Field Value
implementation_name KronosTokenizer_Encode
repo Shiyu_coder_Kronos
type API Doc
source_file model/kronos.py:L142-159
class KronosTokenizer
implements Principle:Shiyu_coder_Kronos_Tokenizer_Encoding
last_updated 2026-02-09 14:00 GMT

Summary

The KronosTokenizer.encode method encodes continuous OHLCV financial data into discrete quantized indices through learned Transformer encoder layers and Binary Spherical Quantization. It supports both single combined indices and hierarchical (s1, s2) split indices.

API Signature

KronosTokenizer.encode(
    x: torch.Tensor,
    half: bool = False
) -> torch.Tensor  # or Tuple[torch.Tensor, torch.Tensor] when half=True

Import

from model import KronosTokenizer
# or
from model.kronos import KronosTokenizer

Parameters

Parameter Type Default Description
x torch.Tensor (required) Input tensor of shape (batch_size, seq_len, d_in). Normalized continuous OHLCV data.
half bool False If False, returns a single combined index tensor. If True, returns a tuple of (s1_indices, s2_indices) for hierarchical tokens.

Input

  • x (torch.Tensor): A float tensor of shape (batch_size, seq_len, d_in) containing normalized continuous financial data. The d_in dimension corresponds to the number of input features (typically 6: open, high, low, close, volume, amount).

Output

  • When half=False: A single torch.Tensor of shape (batch_size, seq_len) containing combined codebook indices. Each index is in range [0, 2^(s1_bits + s2_bits)).
  • When half=True: A tuple (s1_indices, s2_indices) where each is a torch.Tensor of shape (batch_size, seq_len). s1_indices are in range [0, 2^s1_bits) and s2_indices are in range [0, 2^s2_bits).
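Assuming the combined index concatenates the s1 bits (high) and the s2 bits (low) — a common convention for hierarchical codes, though the layout is not confirmed by this page — the two halves can be recovered with bit arithmetic. The bit widths below are illustrative, not the actual tokenizer config values:

```python
# Illustrative bit widths; real values come from the tokenizer config.
s1_bits, s2_bits = 10, 10

combined = 12345  # example combined index in [0, 2**(s1_bits + s2_bits))

# Split under the assumed layout: high bits -> s1, low bits -> s2.
s1 = combined >> s2_bits
s2 = combined & ((1 << s2_bits) - 1)

assert 0 <= s1 < 2 ** s1_bits
assert 0 <= s2 < 2 ** s2_bits
assert (s1 << s2_bits) | s2 == combined  # recombining is lossless

print(s1, s2)  # 12 57
```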

Internal Processing

1. Linear embed:     z = self.embed(x)          # (batch, seq_len, d_model)
2. Encoder layers:   z = encoder_block(z)        # repeated (n_enc_layers - 1) times
3. Quant embed:      z = self.quant_embed(z)     # (batch, seq_len, codebook_dim)
4. BSQuantizer:      _, _, z_indices = self.tokenizer(z, half=half, collect_metrics=False)
5. Return:           z_indices
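The shape flow above can be sketched with stand-in modules. The layer sizes and the use of nn.TransformerEncoderLayer are illustrative placeholders; the real encoder blocks live in model/kronos.py, and the BSQuantizer step is omitted here:

```python
import torch
import torch.nn as nn

# Illustrative sizes; the real values come from the tokenizer config.
d_in, d_model, codebook_dim = 6, 64, 20

embed = nn.Linear(d_in, d_model)                  # step 1: linear embed
encoder = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
     for _ in range(3)]                           # stands in for n_enc_layers - 1 blocks
)
quant_embed = nn.Linear(d_model, codebook_dim)    # step 3: project to codebook dim

x = torch.randn(2, 100, d_in)
z = embed(x)
for layer in encoder:                             # step 2: encoder layers
    z = layer(z)
z = quant_embed(z)                                # BSQuantizer (step 4) omitted

assert z.shape == (2, 100, codebook_dim)
```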

Example

import torch
from model import KronosTokenizer

device = "cuda:0" if torch.cuda.is_available() else "cpu"

tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
tokenizer.eval()
tokenizer = tokenizer.to(device)

# Create sample input: batch of 2 series, 100 timesteps, 6 features
x = torch.randn(2, 100, 6, device=device)

# Get combined indices
with torch.no_grad():
    indices = tokenizer.encode(x, half=False)
print(indices.shape)  # torch.Size([2, 100])

# Get hierarchical (s1, s2) indices
with torch.no_grad():
    s1_indices, s2_indices = tokenizer.encode(x, half=True)
print(s1_indices.shape)   # torch.Size([2, 100])
print(s2_indices.shape)   # torch.Size([2, 100])

Source Code Reference

File: model/kronos.py, lines 142-159.

def encode(self, x, half=False):
    """
    Encodes the input data into quantized indices.

    Args:
        x (torch.Tensor): Input tensor of shape (batch_size, seq_len, d_in).
        half (bool, optional): Whether to use half quantization in BSQuantizer. Defaults to False.

    Returns:
        torch.Tensor: Quantized indices from BSQuantizer.
    """
    z = self.embed(x)
    for layer in self.encoder:
        z = layer(z)
    z = self.quant_embed(z)

    bsq_loss, quantized, z_indices = self.tokenizer(z, half=half, collect_metrics=False)
    return z_indices

Usage Context

  • half=True is the standard mode used during the autoregressive inference pipeline (in auto_regressive_inference()) because the Kronos model's DualHead predicts s1 and s2 tokens separately.
  • half=False returns a single flat index and is useful for evaluation, visualization, or when hierarchical decomposition is not needed.
  • The collect_metrics=False flag tells the BSQuantizer to skip metric collection during encoding, which is appropriate for inference.

Notes

  • The method does not handle normalization. Input x should already be normalized (zero mean, unit variance, clipped) before calling encode().
  • The encoding is performed in the current model's device context. Ensure x is on the same device as the tokenizer.
  • The BSQ loss value returned by the quantizer is discarded during encoding since it is only needed during training.
  • The encoder uses n_enc_layers - 1 Transformer blocks (one fewer than the configured layer count, as the linear embedding acts as the first layer).
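As a sketch of the pre-normalization mentioned in the first note above — the exact scheme (statistics window, clip threshold) is an assumption here, not taken from the Kronos repo:

```python
import torch

def normalize_ohlcv(x: torch.Tensor, clip: float = 5.0) -> torch.Tensor:
    """Per-series z-score over the time axis, then clip extremes.

    Illustrative only; the statistics and clip value used by the real
    Kronos preprocessing pipeline may differ.
    """
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True).clamp_min(1e-8)
    return ((x - mean) / std).clamp(-clip, clip)

raw = torch.randn(2, 100, 6) * 10 + 3   # unnormalized dummy OHLCV-like data
x = normalize_ohlcv(raw)                # now suitable to pass to encode()

assert x.shape == raw.shape
assert x.abs().max() <= 5.0
```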
