Implementation:Shiyu_coder_Kronos_SequentialTrainer_Usage
| Field | Value |
|---|---|
| Implementation Name | SequentialTrainer_Usage |
| Repository | Shiyu_coder_Kronos |
| Repository URL | https://github.com/shiyu-coder/Kronos |
| Type | API Doc |
| Source File | finetune_csv/train_sequential.py |
| Lines | L18-316 |
| Class | SequentialTrainer |
| Implements Principle | Principle:Shiyu_coder_Kronos_Sequential_Two_Stage_Training |
| Dependencies | torch, torch.distributed (optional DDP), config_loader.CustomFinetuneConfig, finetune_tokenizer.train_tokenizer, finetune_base_model.train_model |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
SequentialTrainer orchestrates the two-phase Kronos finetuning pipeline: first training the tokenizer (VQ-VAE reconstruction), then training the predictor (next-token prediction with frozen tokenizer). It supports skip/resume logic, pretrained or random initialization, and optional DDP distributed training.
API
```python
from train_sequential import SequentialTrainer

# 1. Constructor
trainer = SequentialTrainer(config_path: str = None) -> SequentialTrainer

# 2. Run both phases sequentially
success = trainer.run_training() -> bool

# 3. Run individual phases
success = trainer.train_tokenizer_phase() -> bool
success = trainer.train_basemodel_phase() -> bool
```
Import
```python
from train_sequential import SequentialTrainer
```
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| config_path | str | None | Path to YAML config file, passed to CustomFinetuneConfig. If None, defaults to config.yaml in the same directory. |
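A minimal construction sketch (the config filename below is hypothetical):

```python
from train_sequential import SequentialTrainer

# Default: loads config.yaml from the script's directory
trainer = SequentialTrainer()

# Explicit path (hypothetical filename)
trainer = SequentialTrainer("configs/my_finetune.yaml")
```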
Instance Attributes
| Attribute | Type | Description |
|---|---|---|
| config | CustomFinetuneConfig | Parsed configuration object |
| rank | int | Process rank (from RANK env var, default 0) |
| world_size | int | Total processes (from WORLD_SIZE env var, default 1) |
| local_rank | int | Local GPU rank (from LOCAL_RANK env var, default config.device_id) |
| device | torch.device | Computed device (CUDA or CPU) |
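A quick inspection sketch, assuming a plain single-process run with none of the DDP environment variables set:

```python
from train_sequential import SequentialTrainer

trainer = SequentialTrainer()
# Without RANK/WORLD_SIZE set in the environment, the defaults above apply
print(trainer.rank, trainer.world_size)  # 0 1
print(trainer.device)                    # a CUDA device if available, else CPU
```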
Methods
run_training() -> bool
Orchestrates the full two-phase training pipeline:
- Calls `_setup_distributed()` to initialize DDP if applicable
- Calls `_create_directories()` to ensure output paths exist
- Calls `_check_existing_models()` to detect pre-existing checkpoints
- If `config.train_tokenizer` is True, runs `train_tokenizer_phase()`
- If `config.train_basemodel` is True, runs `train_basemodel_phase()`
- Returns True on success, False on failure
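A condensed sketch of that control flow, written as a free function over a trainer instance; it restates the steps above and omits the logging and error handling the real method presumably performs:

```python
def run_training_sketch(trainer) -> bool:
    # Setup steps (internal helpers of SequentialTrainer)
    trainer._setup_distributed()
    trainer._create_directories()
    trainer._check_existing_models()

    # Phase 1: tokenizer finetuning, gated by the config flag
    if trainer.config.train_tokenizer:
        if not trainer.train_tokenizer_phase():
            return False

    # Phase 2: predictor finetuning, gated by the config flag
    if trainer.config.train_basemodel:
        if not trainer.train_basemodel_phase():
            return False

    return True
```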
train_tokenizer_phase() -> bool
Executes Phase 1 (tokenizer finetuning):
- Checks if the tokenizer model already exists and `skip_existing` is True; if so, returns early
- Sets up logging
- Loads a pretrained tokenizer via `KronosTokenizer.from_pretrained()`, or randomly initializes one from the architecture config
- Moves the tokenizer to the device
- Calls `train_tokenizer()` (imported from `finetune_tokenizer`)
- Saves the best model based on validation loss
- Returns True on success
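To run only Phase 1, the config flags consumed by run_training() can be toggled (a sketch; the config path is hypothetical):

```python
from train_sequential import SequentialTrainer

trainer = SequentialTrainer("configs/my_finetune.yaml")  # hypothetical path
trainer.config.train_tokenizer = True
trainer.config.train_basemodel = False  # skip Phase 2
success = trainer.run_training()
```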
train_basemodel_phase() -> bool
Executes Phase 2 (predictor finetuning):
- Validates that the finetuned tokenizer from Phase 1 exists (when a pretrained tokenizer is being used)
- Checks if the basemodel already exists and `skip_existing` is True; if so, returns early
- Sets up logging
- Loads the finetuned tokenizer (from Phase 1 output), or randomly initializes one
- Loads a pretrained predictor via `Kronos.from_pretrained()`, or randomly initializes one from the architecture config
- Moves both models to the device
- Calls `train_model()` (imported from `finetune_base_model`)
- Saves the best model based on validation loss
- Returns True on success
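Conversely, Phase 2 can be run on its own once a finetuned tokenizer exists at `config.tokenizer_best_model_path` (a sketch; the config path is hypothetical):

```python
from train_sequential import SequentialTrainer

trainer = SequentialTrainer("configs/my_finetune.yaml")  # hypothetical path
trainer.config.train_tokenizer = False  # Phase 1 output assumed present
trainer.config.train_basemodel = True
success = trainer.run_training()
```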
CLI Usage
```bash
# Standard usage
python train_sequential.py --config path/to/config.yaml

# Skip tokenizer phase (use existing finetuned tokenizer)
python train_sequential.py --config path/to/config.yaml --skip-tokenizer

# Skip basemodel phase (train only the tokenizer)
python train_sequential.py --config path/to/config.yaml --skip-basemodel

# Skip training for phases whose models already exist on disk
python train_sequential.py --config path/to/config.yaml --skip-existing
```
CLI Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| --config | str | 'config.yaml' | Path to YAML configuration file |
| --skip-tokenizer | flag | False | Skip tokenizer training phase |
| --skip-basemodel | flag | False | Skip basemodel training phase |
| --skip-existing | flag | False | Skip training for models that already exist |
Output
- Finetuned tokenizer: saved to `config.tokenizer_best_model_path` (e.g., `.../tokenizer/best_model`)
- Finetuned predictor: saved to `config.basemodel_best_model_path` (e.g., `.../basemodel/best_model`)
- Training logs: written to `config.base_save_path/logs/`
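Because loading elsewhere in this pipeline goes through `KronosTokenizer.from_pretrained()` and `Kronos.from_pretrained()`, the saved outputs can presumably be reloaded the same way (a sketch; the import path and directory layout are assumptions):

```python
from model import KronosTokenizer, Kronos  # import path is an assumption

# Assumes the best-model directories follow the layout from_pretrained() expects
tokenizer = KronosTokenizer.from_pretrained(trainer.config.tokenizer_best_model_path)
predictor = Kronos.from_pretrained(trainer.config.basemodel_best_model_path)
```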
Key Implementation Details
Random Initialization Support
When `pre_trained_tokenizer` or `pre_trained_predictor` is set to False, the trainer reads the architecture configuration from `config.json` in the pretrained model directory and constructs a fresh model with random weights:
```python
import json
import os

from model import KronosTokenizer  # import path may differ in the repo

# Example: random tokenizer initialization from the pretrained model's config.json
cfg_path = os.path.join(config.pretrained_tokenizer_path, 'config.json')
with open(cfg_path, 'r') as f:
    arch = json.load(f)
tokenizer = KronosTokenizer(
    d_in=arch.get('d_in', 6),
    d_model=arch.get('d_model', 256),
    n_heads=arch.get('n_heads', 4),
    # ... additional architecture params
)
```
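A hypothetical mirror for the predictor; the config attribute and constructor parameters below are patterned on the tokenizer example rather than verified against the source:

```python
import json
import os

from model import Kronos  # import path may differ in the repo

# Hypothetical: random predictor initialization, mirroring the tokenizer case
cfg_path = os.path.join(config.pretrained_predictor_path, 'config.json')  # attribute name assumed
with open(cfg_path, 'r') as f:
    arch = json.load(f)
model = Kronos(
    d_model=arch.get('d_model', 256),
    n_heads=arch.get('n_heads', 4),
    # ... additional architecture params from config.json
)
```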
DDP Distribution
Distributed training is initialized when WORLD_SIZE > 1 and CUDA is available:
```python
# Automatically detected from environment
self.rank = int(os.environ.get("RANK", "0"))
self.world_size = int(os.environ.get("WORLD_SIZE", "1"))
self.local_rank = int(os.environ.get("LOCAL_RANK", "0"))

# DDP initialization in _setup_distributed()
dist.init_process_group(backend="nccl")
```
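Those are exactly the environment variables that torchrun sets for each worker, so a multi-GPU run can be launched along these lines (a sketch; the process count is illustrative):

```bash
# Launch 4 workers on one node; torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK
torchrun --nproc_per_node=4 train_sequential.py --config path/to/config.yaml
```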
Usage Example (Python API)
```python
from train_sequential import SequentialTrainer

# Create trainer with config
trainer = SequentialTrainer("configs/config_ali09988_candle-5min.yaml")

# Override settings programmatically
trainer.config.train_tokenizer = True
trainer.config.train_basemodel = True
trainer.config.skip_existing = False

# Run full pipeline
success = trainer.run_training()
if success:
    print(f"Tokenizer saved to: {trainer.config.tokenizer_best_model_path}")
    print(f"Predictor saved to: {trainer.config.basemodel_best_model_path}")
```
See Also
- Principle:Shiyu_coder_Kronos_Sequential_Two_Stage_Training -- The principle this implementation realizes
- Implementation:Shiyu_coder_Kronos_CustomFinetuneConfig_Init -- Configuration class used by SequentialTrainer
- Implementation:Shiyu_coder_Kronos_CustomKlineDataset_Usage -- Dataset class used during training phases