Implementation:Shiyu coder Kronos KronosTokenizer Encode

From Leeroopedia


Field Value
implementation_name KronosTokenizer_Encode
repo Shiyu_coder_Kronos
type API Doc
source_file model/kronos.py:L142-159
class KronosTokenizer
implements Principle:Shiyu_coder_Kronos_Tokenizer_Encoding
last_updated 2026-02-09 14:00 GMT

Summary

The KronosTokenizer.encode method encodes continuous OHLCV financial data into discrete quantized indices through learned Transformer encoder layers and Binary Spherical Quantization. It supports both single combined indices and hierarchical (s1, s2) split indices.

API Signature

KronosTokenizer.encode(
    x: torch.Tensor,
    half: bool = False
) -> torch.Tensor  # or Tuple[torch.Tensor, torch.Tensor] when half=True

Import

from model import KronosTokenizer
# or
from model.kronos import KronosTokenizer

Parameters

Parameter Type Default Description
x torch.Tensor (required) Input tensor of shape (batch_size, seq_len, d_in). Normalized continuous OHLCV data.
half bool False If False, returns a single combined index tensor. If True, returns a tuple of (s1_indices, s2_indices) for hierarchical tokens.

Input

  • x (torch.Tensor): A float tensor of shape (batch_size, seq_len, d_in) containing normalized continuous financial data. The d_in dimension corresponds to the number of input features (typically 6: open, high, low, close, volume, amount).

Output

  • When half=False: A single torch.Tensor of shape (batch_size, seq_len) containing combined codebook indices. Each index is in range [0, 2^(s1_bits + s2_bits)).
  • When half=True: A tuple (s1_indices, s2_indices) where each is a torch.Tensor of shape (batch_size, seq_len). s1_indices are in range [0, 2^s1_bits) and s2_indices are in range [0, 2^s2_bits).
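Assuming the combined index concatenates the s1 bits (high) and the s2 bits (low) — a common convention for hierarchical codes, though the layout is not confirmed by this page — the two halves can be recovered with bit arithmetic. The bit widths below are illustrative, not the actual tokenizer config values:

```python
# Illustrative bit widths; real values come from the tokenizer config.
s1_bits, s2_bits = 10, 10

combined = 12345  # example combined index in [0, 2**(s1_bits + s2_bits))

# Split under the assumed layout: high bits -> s1, low bits -> s2.
s1 = combined >> s2_bits
s2 = combined & ((1 << s2_bits) - 1)

assert 0 <= s1 < 2 ** s1_bits
assert 0 <= s2 < 2 ** s2_bits
assert (s1 << s2_bits) | s2 == combined  # recombining is lossless

print(s1, s2)  # 12 57
```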

Internal Processing

1. Linear embed:     z = self.embed(x)          # (batch, seq_len, d_model)
2. Encoder layers:   z = encoder_block(z)        # repeated (n_enc_layers - 1) times
3. Quant embed:      z = self.quant_embed(z)     # (batch, seq_len, codebook_dim)
4. BSQuantizer:      _, _, z_indices = self.tokenizer(z, half=half, collect_metrics=False)
5. Return:           z_indices
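The shape flow above can be sketched with stand-in modules. The layer sizes and the use of nn.TransformerEncoderLayer are illustrative placeholders; the real encoder blocks live in model/kronos.py, and the BSQuantizer step is omitted here:

```python
import torch
import torch.nn as nn

# Illustrative sizes; the real values come from the tokenizer config.
d_in, d_model, codebook_dim = 6, 64, 20

embed = nn.Linear(d_in, d_model)                  # step 1: linear embed
encoder = nn.ModuleList(
    [nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
     for _ in range(3)]                           # stands in for n_enc_layers - 1 blocks
)
quant_embed = nn.Linear(d_model, codebook_dim)    # step 3: project to codebook dim

x = torch.randn(2, 100, d_in)
z = embed(x)
for layer in encoder:                             # step 2: encoder layers
    z = layer(z)
z = quant_embed(z)                                # BSQuantizer (step 4) omitted

assert z.shape == (2, 100, codebook_dim)
```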

Example

import torch
from model import KronosTokenizer

device = "cuda:0" if torch.cuda.is_available() else "cpu"

tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
tokenizer.eval()
tokenizer = tokenizer.to(device)

# Create sample input: batch of 2 series, 100 timesteps, 6 features
x = torch.randn(2, 100, 6, device=device)

# Get combined indices
with torch.no_grad():
    indices = tokenizer.encode(x, half=False)
print(indices.shape)  # torch.Size([2, 100])

# Get hierarchical (s1, s2) indices
with torch.no_grad():
    s1_indices, s2_indices = tokenizer.encode(x, half=True)
print(s1_indices.shape)   # torch.Size([2, 100])
print(s2_indices.shape)   # torch.Size([2, 100])

Source Code Reference

File: model/kronos.py, lines 142-159.

def encode(self, x, half=False):
    """
    Encodes the input data into quantized indices.

    Args:
        x (torch.Tensor): Input tensor of shape (batch_size, seq_len, d_in).
        half (bool, optional): Whether to use half quantization in BSQuantizer. Defaults to False.

    Returns:
        torch.Tensor: Quantized indices from BSQuantizer.
    """
    z = self.embed(x)
    for layer in self.encoder:
        z = layer(z)
    z = self.quant_embed(z)

    bsq_loss, quantized, z_indices = self.tokenizer(z, half=half, collect_metrics=False)
    return z_indices

Usage Context

  • half=True is the standard mode used during the autoregressive inference pipeline (in auto_regressive_inference()) because the Kronos model's DualHead predicts s1 and s2 tokens separately.
  • half=False returns a single flat index and is useful for evaluation, visualization, or when hierarchical decomposition is not needed.
  • The collect_metrics=False flag tells the BSQuantizer to skip metric collection during encoding, which is appropriate for inference.

Notes

  • The method does not handle normalization. Input x should already be normalized (zero mean, unit variance, clipped) before calling encode().
  • The encoding is performed in the current model's device context. Ensure x is on the same device as the tokenizer.
  • The BSQ loss value returned by the quantizer is discarded during encoding since it is only needed during training.
  • The encoder uses n_enc_layers - 1 Transformer blocks (one fewer than the configured layer count, as the linear embedding acts as the first layer).
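As a sketch of the pre-normalization mentioned in the first note above — the exact scheme (statistics window, clip threshold) is an assumption here, not taken from the Kronos repo:

```python
import torch

def normalize_ohlcv(x: torch.Tensor, clip: float = 5.0) -> torch.Tensor:
    """Per-series z-score over the time axis, then clip extremes.

    Illustrative only; the statistics and clip value used by the real
    Kronos preprocessing pipeline may differ.
    """
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True).clamp_min(1e-8)
    return ((x - mean) / std).clamp(-clip, clip)

raw = torch.randn(2, 100, 6) * 10 + 3   # unnormalized dummy OHLCV-like data
x = normalize_ohlcv(raw)                # now suitable to pass to encode()

assert x.shape == raw.shape
assert x.abs().max() <= 5.0
```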
