Implementation:Shiyu coder Kronos CustomFinetuneConfig Init
| Field | Value |
|---|---|
| Implementation Name | CustomFinetuneConfig_Init |
| Repository | Shiyu_coder_Kronos |
| Repository URL | https://github.com/shiyu-coder/Kronos |
| Type | API Doc |
| Source File | finetune_csv/config_loader.py |
| Lines | L109-267 (CustomFinetuneConfig), L6-106 (ConfigLoader) |
| Class | CustomFinetuneConfig |
| Implements Principle | Principle:Shiyu_coder_Kronos_CSV_Finetuning_Configuration |
| Dependencies | yaml, os |
| Last Updated | 2026-02-09 14:00 GMT |
Overview
CustomFinetuneConfig is the primary configuration class for Kronos CSV finetuning. It loads a YAML configuration file via the internal ConfigLoader helper, resolves dynamic paths, and exposes all configuration values as typed Python attributes.
API
```python
CustomFinetuneConfig(config_path: str = None) -> CustomFinetuneConfig
```
Import
```python
from config_loader import CustomFinetuneConfig
```
Constructor Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| config_path | str | None | Path to YAML config file. If None, defaults to config.yaml in the same directory as config_loader.py. |
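The default-path fallback described above can be sketched as follows (`resolve_config_path` is a hypothetical helper written for illustration, not part of the actual API):

```python
import os

def resolve_config_path(config_path=None, module_file="config_loader.py"):
    # Hypothetical sketch: when no path is given, fall back to config.yaml
    # in the same directory as config_loader.py, as documented above.
    if config_path is None:
        module_dir = os.path.dirname(os.path.abspath(module_file))
        config_path = os.path.join(module_dir, "config.yaml")
    return config_path
```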
Output Attributes
Data Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| data_path | str | -- | Path to the CSV data file |
| lookback_window | int | 512 | Number of historical time steps for input context |
| predict_window | int | 48 | Number of future time steps to predict |
| max_context | int | 512 | Maximum context length for the model |
| clip | float | 5.0 | Clipping threshold for normalized data |
| train_ratio | float | 0.9 | Fraction of data for training |
| val_ratio | float | 0.1 | Fraction of data for validation |
| test_ratio | float | 0.0 | Fraction of data for testing |
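A minimal sketch of how these ratios could partition a time series chronologically (the actual Kronos split logic may differ in rounding details):

```python
def split_sizes(n_rows, train_ratio=0.9, val_ratio=0.1, test_ratio=0.0):
    # Chronological split: earliest rows go to training, then validation,
    # then test; the test slice absorbs any rounding remainder.
    n_train = int(n_rows * train_ratio)
    n_val = int(n_rows * val_ratio)
    n_test = n_rows - n_train - n_val
    return n_train, n_val, n_test
```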
Training Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| tokenizer_epochs | int | 30 | Training epochs for the tokenizer phase |
| basemodel_epochs | int | 30 | Training epochs for the predictor phase |
| batch_size | int | 160 | Batch size for DataLoader |
| tokenizer_learning_rate | float | 2e-4 | Learning rate for tokenizer training |
| predictor_learning_rate | float | 4e-5 | Learning rate for predictor training |
| accumulation_steps | int | 1 | Gradient accumulation steps for tokenizer |
| adam_weight_decay | float | 0.1 | Weight decay for AdamW optimizer |
| adam_beta1 | float | 0.9 | AdamW beta1 |
| adam_beta2 | float | 0.95 | AdamW beta2 |
| log_interval | int | 50 | Steps between log messages |
| num_workers | int | 6 | DataLoader worker count |
| seed | int | 100 | Random seed |
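The AdamW-related fields map naturally onto `torch.optim.AdamW` keyword arguments. A hedged sketch of that mapping (`adamw_kwargs` is a hypothetical helper; it takes a plain dict here so torch is not required):

```python
def adamw_kwargs(cfg, phase="tokenizer"):
    # Build keyword arguments for torch.optim.AdamW from the config fields
    # documented above; the tokenizer and predictor phases use different
    # learning rates but share the other optimizer hyperparameters.
    lr_key = ("tokenizer_learning_rate" if phase == "tokenizer"
              else "predictor_learning_rate")
    return {
        "lr": cfg[lr_key],
        "betas": (cfg["adam_beta1"], cfg["adam_beta2"]),
        "weight_decay": cfg["adam_weight_decay"],
    }
```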
Model Path Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| pretrained_tokenizer_path | str | -- | Path to pretrained tokenizer checkpoint |
| pretrained_predictor_path | str | -- | Path to pretrained predictor checkpoint |
| base_save_path | str | -- | Root directory for saving finetuned models |
| finetuned_tokenizer_path | str | -- | Path to the finetuned tokenizer (auto-resolved) |
| exp_name | str | "default_experiment" | Experiment name used for path generation |
Computed Path Attributes
| Attribute | Computation | Description |
|---|---|---|
| tokenizer_save_path | os.path.join(base_save_path, tokenizer_save_name) | Directory for tokenizer training outputs |
| tokenizer_best_model_path | os.path.join(tokenizer_save_path, 'best_model') | Path to best tokenizer checkpoint |
| basemodel_save_path | os.path.join(base_save_path, basemodel_save_name) | Directory for basemodel training outputs |
| basemodel_best_model_path | os.path.join(basemodel_save_path, 'best_model') | Path to best basemodel checkpoint |
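The path derivation in the table can be sketched as a standalone function (`compute_save_paths` is a hypothetical name; the real computation happens inside the class):

```python
import os

def compute_save_paths(base_save_path, tokenizer_save_name="tokenizer",
                       basemodel_save_name="basemodel"):
    # Mirror the computed-path attributes documented above.
    tok = os.path.join(base_save_path, tokenizer_save_name)
    base = os.path.join(base_save_path, basemodel_save_name)
    return {
        "tokenizer_save_path": tok,
        "tokenizer_best_model_path": os.path.join(tok, "best_model"),
        "basemodel_save_path": base,
        "basemodel_best_model_path": os.path.join(base, "best_model"),
    }
```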
Experiment Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| train_tokenizer | bool | True | Whether to run the tokenizer training phase |
| train_basemodel | bool | True | Whether to run the basemodel training phase |
| skip_existing | bool | False | Skip training if model already exists on disk |
| pre_trained_tokenizer | bool | True | Load pretrained tokenizer weights (False = random init) |
| pre_trained_predictor | bool | True | Load pretrained predictor weights (False = random init) |
| use_comet | bool | False | Enable Comet ML experiment tracking |
ConfigLoader Helper (L6-106)
The internal ConfigLoader class handles YAML loading and dynamic path resolution:
```python
from typing import Any, Dict

class ConfigLoader:
    def __init__(self, config_path: str):
        self.config_path = config_path
        self.config = self._load_config()

    def _load_config(self) -> Dict[str, Any]:
        # Validates file exists, loads YAML, resolves dynamic paths
        ...

    def _resolve_dynamic_paths(self, config: Dict[str, Any]) -> Dict[str, Any]:
        # Uses exp_name and base_path to auto-fill empty path values,
        # or replaces {exp_name} template placeholders in path strings
        ...

    def get(self, key: str, default=None):
        # Dot-notation access: loader.get('data.lookback_window')
        ...

    def get_data_config(self) -> Dict[str, Any]: ...
    def get_training_config(self) -> Dict[str, Any]: ...
    def get_model_paths(self) -> Dict[str, str]: ...
    def get_experiment_config(self) -> Dict[str, Any]: ...
    def update_config(self, updates: Dict[str, Any]): ...
    def save_config(self, save_path: str = None): ...
```
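The dot-notation lookup and the `{exp_name}` placeholder substitution can be sketched as standalone functions (hypothetical names; the real logic lives inside ConfigLoader's `get` and `_resolve_dynamic_paths`):

```python
def dotted_get(config, key, default=None):
    # Walk nested dicts using a dot-separated key,
    # e.g. dotted_get(cfg, "data.lookback_window").
    node = config
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node

def resolve_placeholders(model_paths, exp_name):
    # Replace "{exp_name}" templates in string values, leaving other
    # value types untouched.
    return {k: v.replace("{exp_name}", exp_name) if isinstance(v, str) else v
            for k, v in model_paths.items()}
```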
Example YAML Configuration
The template config file is located at finetune_csv/configs/config_ali09988_candle-5min.yaml:
```yaml
data:
  data_path: "/xxxx/Kronos/finetune_csv/data/HK_ali_09988_kline_5min_all.csv"
  lookback_window: 512
  predict_window: 48
  max_context: 512
  clip: 5.0
  train_ratio: 0.9
  val_ratio: 0.1
  test_ratio: 0.0

training:
  tokenizer_epochs: 30
  basemodel_epochs: 20
  batch_size: 32
  log_interval: 50
  num_workers: 6
  seed: 42
  tokenizer_learning_rate: 0.0002
  predictor_learning_rate: 0.000001
  adam_beta1: 0.9
  adam_beta2: 0.95
  adam_weight_decay: 0.1
  accumulation_steps: 1

model_paths:
  pretrained_tokenizer: "/xxx/Kronos/pretrained/Kronos-Tokenizer-base"
  pretrained_predictor: "/xxx/Kronos/pretrained/Kronos-base"
  exp_name: "HK_ali_09988_kline_5min_all"
  base_path: "/xxx/Kronos/finetune_csv/finetuned/"
  base_save_path: ""
  finetuned_tokenizer: ""
  tokenizer_save_name: "tokenizer"
  basemodel_save_name: "basemodel"

experiment:
  name: "kronos_custom_finetune"
  description: "Custom finetune for HK stock data"
  use_comet: false
  train_tokenizer: true
  train_basemodel: true
  skip_existing: false

device:
  use_cuda: true
  device_id: 0
```
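Note that base_save_path is left empty in the template and is auto-filled by ConfigLoader from base_path and exp_name. A hedged sketch of that auto-fill (`fill_base_save_path` is a hypothetical helper; the exact join rule is an assumption):

```python
import os

def fill_base_save_path(model_paths):
    # If base_save_path is empty, derive it as base_path/exp_name,
    # mirroring the dynamic-path resolution described for ConfigLoader.
    filled = dict(model_paths)
    if not filled.get("base_save_path"):
        filled["base_save_path"] = os.path.join(filled["base_path"],
                                                filled["exp_name"])
    return filled
```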
Usage Example
```python
from config_loader import CustomFinetuneConfig

# Load configuration from YAML
config = CustomFinetuneConfig("configs/config_ali09988_candle-5min.yaml")

# Access typed attributes directly
print(config.data_path)                  # "/xxxx/.../HK_ali_09988_kline_5min_all.csv"
print(config.lookback_window)            # 512
print(config.tokenizer_epochs)           # 30
print(config.tokenizer_best_model_path)  # auto-computed path

# Get sub-config dictionaries for training functions
tok_config = config.get_tokenizer_config()
base_config = config.get_basemodel_config()

# Print summary
config.print_config_summary()
```
Convenience Methods
- get_tokenizer_config() -- Returns a flat dictionary suitable for passing to the tokenizer training function.
- get_basemodel_config() -- Returns a flat dictionary suitable for passing to the basemodel training function.
- print_config_summary() -- Prints a formatted summary of all key configuration values.
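As an illustration, the flattening performed by get_tokenizer_config() might look like the following (the exact key set is an assumption for illustration, not confirmed by the source):

```python
def flatten_tokenizer_config(config_attrs):
    # Hypothetical: select the attributes the tokenizer training phase
    # consumes from a flat dict of config values.
    keys = ("data_path", "lookback_window", "predict_window", "clip",
            "tokenizer_epochs", "batch_size", "tokenizer_learning_rate",
            "accumulation_steps", "seed")
    return {k: config_attrs[k] for k in keys if k in config_attrs}
```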
See Also
- Principle:Shiyu_coder_Kronos_CSV_Finetuning_Configuration -- The principle this implementation realizes
- Implementation:Shiyu_coder_Kronos_SequentialTrainer_Usage -- Training pipeline that consumes CustomFinetuneConfig