Implementation:Shiyu coder Kronos KronosTokenizer Encode
| Field | Value |
|---|---|
| implementation_name | KronosTokenizer_Encode |
| repo | Shiyu_coder_Kronos |
| type | API Doc |
| source_file | model/kronos.py:L142-159 |
| class | KronosTokenizer |
| implements | Principle:Shiyu_coder_Kronos_Tokenizer_Encoding |
| last_updated | 2026-02-09 14:00 GMT |
Summary
The KronosTokenizer.encode method encodes continuous OHLCV financial data into discrete quantized indices through learned Transformer encoder layers and Binary Spherical Quantization. It supports both single combined indices and hierarchical (s1, s2) split indices.
API Signature
KronosTokenizer.encode(
x: torch.Tensor,
half: bool = False
) -> torch.Tensor # or Tuple[torch.Tensor, torch.Tensor] when half=True
Import
from model import KronosTokenizer
# or
from model.kronos import KronosTokenizer
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| x | torch.Tensor | (required) | Input tensor of shape (batch_size, seq_len, d_in). Normalized continuous OHLCV data. |
| half | bool | False | If False, returns a single combined index tensor. If True, returns a tuple of (s1_indices, s2_indices) for hierarchical tokens. |
Input
- x (torch.Tensor): A float tensor of shape (batch_size, seq_len, d_in) containing normalized continuous financial data. The d_in dimension corresponds to the number of input features (typically 6: open, high, low, close, volume, amount).
Output
- When half=False: A single torch.Tensor of shape (batch_size, seq_len) containing combined codebook indices. Each index is in range [0, 2^(s1_bits + s2_bits)).
- When half=True: A tuple (s1_indices, s2_indices) where each is a torch.Tensor of shape (batch_size, seq_len). s1_indices are in range [0, 2^s1_bits) and s2_indices are in range [0, 2^s2_bits).
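The index-range arithmetic above can be sketched in plain Python. The s1_bits/s2_bits values and the bit-field packing function below are illustrative assumptions, not the pretrained checkpoint's actual configuration or Kronos's actual packing scheme:

```python
# Illustrative index-range arithmetic for the two return modes.
# s1_bits/s2_bits here are assumed values for illustration only.
s1_bits, s2_bits = 10, 6

n_combined = 2 ** (s1_bits + s2_bits)  # size of the combined codebook
n_s1 = 2 ** s1_bits                    # size of the s1 codebook
n_s2 = 2 ** s2_bits                    # size of the s2 codebook

def pack(s1: int, s2: int) -> int:
    """One common way (an assumption, not necessarily Kronos's) to map a
    valid (s1, s2) pair into the combined range: concatenate bit fields."""
    assert 0 <= s1 < n_s1 and 0 <= s2 < n_s2
    return (s1 << s2_bits) | s2

# The largest (s1, s2) pair lands on the last combined index.
assert pack(n_s1 - 1, n_s2 - 1) == n_combined - 1
```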
Internal Processing
1. Linear embed: z = self.embed(x) # (batch, seq_len, d_model)
2. Encoder layers: z = encoder_block(z) # repeated (n_enc_layers - 1) times
3. Quant embed: z = self.quant_embed(z) # (batch, seq_len, codebook_dim)
4. BSQuantizer: _, _, z_indices = self.tokenizer(z, half=half, collect_metrics=False)
5. Return: z_indices
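The shape flow of steps 1-3 can be sketched with placeholder modules. The dimensions below (d_model, codebook_dim, layer count) are assumptions for illustration; the real model uses learned Transformer blocks rather than the plain linear stand-ins used here:

```python
# Minimal shape sketch of the encode pipeline (steps 1-3).
# Linear layers stand in for the real Transformer encoder blocks.
import torch
import torch.nn as nn

batch, seq_len, d_in = 2, 100, 6
d_model, codebook_dim = 32, 16          # assumed sizes, for illustration

embed = nn.Linear(d_in, d_model)        # step 1: linear embedding
encoder = nn.ModuleList(                # stand-in for n_enc_layers - 1 blocks
    [nn.Linear(d_model, d_model) for _ in range(3)]
)
quant_embed = nn.Linear(d_model, codebook_dim)  # step 3: project to codebook dim

x = torch.randn(batch, seq_len, d_in)
z = embed(x)                            # (2, 100, 32)
for layer in encoder:
    z = layer(z)                        # shape preserved
z = quant_embed(z)                      # (2, 100, 16)
# Step 4 (BSQuantizer) would then map z to discrete indices of shape (batch, seq_len).
```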
Example
import torch
from model import KronosTokenizer
tokenizer = KronosTokenizer.from_pretrained("NeoQuasar/Kronos-Tokenizer-base")
tokenizer.eval()
tokenizer = tokenizer.to("cuda:0")
# Create sample input: batch of 2 series, 100 timesteps, 6 features
x = torch.randn(2, 100, 6).to("cuda:0")
# Get combined indices
indices = tokenizer.encode(x, half=False)
print(indices.shape) # torch.Size([2, 100])
# Get hierarchical (s1, s2) indices
s1_indices, s2_indices = tokenizer.encode(x, half=True)
print(s1_indices.shape) # torch.Size([2, 100])
print(s2_indices.shape) # torch.Size([2, 100])
Source Code Reference
File: model/kronos.py, lines 142-159.
def encode(self, x, half=False):
"""
Encodes the input data into quantized indices.
Args:
x (torch.Tensor): Input tensor of shape (batch_size, seq_len, d_in).
half (bool, optional): Whether to use half quantization in BSQuantizer. Defaults to False.
Returns:
torch.Tensor: Quantized indices from BSQuantizer.
"""
z = self.embed(x)
for layer in self.encoder:
z = layer(z)
z = self.quant_embed(z)
bsq_loss, quantized, z_indices = self.tokenizer(z, half=half, collect_metrics=False)
return z_indices
Usage Context
- half=True is the standard mode used during the autoregressive inference pipeline (in auto_regressive_inference()) because the Kronos model's DualHead predicts s1 and s2 tokens separately.
- half=False returns a single flat index and is useful for evaluation, visualization, or when hierarchical decomposition is not needed.
- The collect_metrics=False flag tells the BSQuantizer to skip metric collection during encoding, which is appropriate for inference.
Notes
- The method does not handle normalization. Input x should already be normalized (zero mean, unit variance, clipped) before calling encode().
- The encoding is performed in the current model's device context. Ensure x is on the same device as the tokenizer.
- The BSQ loss value returned by the quantizer is discarded during encoding since it is only needed during training.
- The encoder uses n_enc_layers - 1 Transformer blocks (one fewer than the configured layer count, as the linear embedding acts as the first layer).
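The first note above expects pre-normalized input. A minimal sketch of that kind of preprocessing (per-series z-score followed by clipping) is shown below; the function name and the clip bound of 5.0 are assumptions for illustration, not Kronos's documented preprocessing:

```python
# Hedged sketch of the pre-normalization the notes describe:
# z-score each feature over the time axis, then clip outliers.
import torch

def normalize_ohlcv(x: torch.Tensor, clip: float = 5.0) -> torch.Tensor:
    """x: (batch, seq_len, d_in) raw features -> normalized tensor.
    The clip bound is an assumed value, not Kronos's actual setting."""
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True).clamp_min(1e-8)  # avoid division by zero
    return ((x - mean) / std).clamp(-clip, clip)

x_raw = torch.randn(2, 100, 6) * 10 + 3   # un-normalized sample data
x_norm = normalize_ohlcv(x_raw)           # ready to pass to encode()
```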