Environment:Shiyu coder Kronos Qlib Data Environment
| Knowledge Sources | |
|---|---|
| Domains | Financial_Data, Infrastructure |
| Last Updated | 2026-02-09 13:47 GMT |
Overview
Microsoft Qlib environment with Chinese A-share market data (CSI300/CSI800/CSI1000) for the Kronos finetuning and backtesting pipeline.
Description
This environment provides the financial data infrastructure required by the Qlib finetuning workflow. It depends on the Microsoft Qlib library initialized with the Chinese market data provider (`REG_CN`). The data must be pre-downloaded to a local directory (default `~/.qlib/qlib_data/cn_data`). The environment supports instruments including CSI300, CSI800, and CSI1000, and provides OHLCV data through the `QlibDataLoader` interface. Processed datasets are serialized as pickle files for training.
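As a rough sketch of how these pieces fit together, the snippet below initializes Qlib with `REG_CN` and requests OHLCV columns through `QlibDataLoader`. The helper names `to_qlib_fields` and `load_ohlcv`, the date range, and the exact field list are illustrative assumptions, not code from the repository. The Qlib imports are deferred so the module can be imported on a machine without Qlib installed.

```python
from pathlib import Path

# Assumed column set; the actual pipeline may request additional fields.
OHLCV = ["open", "high", "low", "close", "volume"]


def to_qlib_fields(names):
    """Prefix plain column names with '$', Qlib's syntax for raw fields."""
    return ["$" + name for name in names]


def load_ohlcv(data_dir, instrument="csi300",
               start="2020-01-01", end="2020-12-31"):
    """Hypothetical helper: load daily OHLCV for one instrument pool."""
    # Imported lazily so this file can be imported without Qlib installed.
    import qlib
    from qlib.constant import REG_CN
    from qlib.data.dataset.loader import QlibDataLoader

    qlib.init(provider_uri=str(Path(data_dir).expanduser()), region=REG_CN)
    loader = QlibDataLoader(config=to_qlib_fields(OHLCV))
    # With default settings the result is indexed by (datetime, instrument).
    return loader.load(instrument, start, end)
```

Calling `load_ohlcv("~/.qlib/qlib_data/cn_data")` only works after the CN data has been downloaded to that directory.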
Usage
Use this environment for the Qlib Finetuning Pipeline workflow: data preprocessing, dataset creation, tokenizer finetuning, predictor finetuning, inference, and backtesting. It is not required for the basic prediction or CSV finetuning workflows.
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux or macOS | Qlib supports these platforms |
| Disk | 10 GB+ for Qlib CN data | Data stored at `~/.qlib/qlib_data/cn_data` by default |
| Network | Internet access for initial data download | Qlib data is read from local disk on subsequent runs |
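Before starting the download, it may help to confirm that the target filesystem has enough room. The following is a minimal sketch using only the standard library; the function names `free_gb` and `has_room` are hypothetical, and the 10 GB threshold is taken from the table above.

```python
import shutil
from pathlib import Path

REQUIRED_GB = 10  # threshold from the requirements table


def free_gb(path):
    """Free space in GiB on the filesystem that will hold `path`."""
    p = Path(path).expanduser()
    # The target may not exist yet, so walk up to the nearest existing parent.
    while not p.exists():
        p = p.parent
    return shutil.disk_usage(p).free / 1024**3


def has_room(path, required_gb=REQUIRED_GB):
    """True if at least `required_gb` GiB are free for `path`."""
    return free_gb(path) >= required_gb
```

For example, `has_room("~/.qlib/qlib_data/cn_data")` can be checked before running the download command.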
Dependencies
Python Packages
- `pyqlib` (Microsoft Qlib library; published on PyPI as `pyqlib`, imported as `qlib`)
- `pickle` (standard library, for dataset serialization)
- All packages from the base PyTorch CUDA environment
Credentials
No API keys required. Qlib CN data is freely downloadable:
# Download Chinese market data
python -m qlib.run.get_data qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn
The following configuration paths must be set in `finetune/config.py`:
- `qlib_data_path`: Path to Qlib data directory (default: `~/.qlib/qlib_data/cn_data`)
- `dataset_path`: Path for processed pickle datasets (default: `./data/processed_datasets`)
- `pretrained_tokenizer_path`: HuggingFace model ID or local path to pretrained tokenizer
- `pretrained_predictor_path`: HuggingFace model ID or local path to pretrained predictor
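The four settings above can be sanity-checked before a run. The sketch below is a hypothetical stand-in for the relevant part of `finetune/config.py`, not the repository's actual class: the `QlibPaths` name and the `problems` method are assumptions, and the two pretrained paths are left empty because no defaults are documented here.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class QlibPaths:
    """Hypothetical stand-in for the path fields in finetune/config.py."""
    qlib_data_path: str = "~/.qlib/qlib_data/cn_data"
    dataset_path: str = "./data/processed_datasets"
    # No defaults are documented for these two; set them yourself.
    pretrained_tokenizer_path: str = ""
    pretrained_predictor_path: str = ""

    def problems(self):
        """Return a list of human-readable issues; empty means OK."""
        issues = []
        if not Path(self.qlib_data_path).expanduser().is_dir():
            issues.append(f"qlib_data_path not found: {self.qlib_data_path}")
        for name in ("pretrained_tokenizer_path", "pretrained_predictor_path"):
            if not getattr(self, name):
                issues.append(f"{name} is not set")
        return issues
```

`dataset_path` is not checked for existence here, since it may be created during preprocessing.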
Quick Install
# Install Qlib (the PyPI package is named pyqlib; it is imported as qlib)
pip install pyqlib
# Download Chinese A-share market data
python -m qlib.run.get_data qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn
Code Evidence
Qlib initialization from `finetune/qlib_data_preprocess.py:25-28`:
def initialize_qlib(self):
    """Initializes the Qlib environment."""
    print("Initializing Qlib...")
    qlib.init(provider_uri=self.config.qlib_data_path, region=REG_CN)
Configuration paths requiring user update from `finetune/config.py:12-13`:
# TODO: Update this path to your Qlib data directory.
self.qlib_data_path = "~/.qlib/qlib_data/cn_data"
Dataset pickle file paths from `finetune/dataset.py:41-42`:
self.data_path = f"{self.config.dataset_path}/train_data.pkl"
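To illustrate the pickle handoff between preprocessing and training, here is a minimal round-trip sketch. Only `train_data.pkl` appears in the excerpt above; the `<split>_data.pkl` naming for other splits, the helper names, and the error text are assumptions.

```python
import pickle
from pathlib import Path


def save_split(dataset_path, split, obj):
    """Write `obj` to <dataset_path>/<split>_data.pkl and return the path."""
    out_dir = Path(dataset_path)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_file = out_dir / f"{split}_data.pkl"
    with open(out_file, "wb") as fh:
        pickle.dump(obj, fh)
    return out_file


def load_split(dataset_path, split="train"):
    """Load a split, failing with a hint if preprocessing has not run."""
    path = Path(dataset_path) / f"{split}_data.pkl"
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; run qlib_data_preprocess.py first"
        )
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.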
Instrument and benchmark mapping from `finetune/config.py:122-131`:
def _set_benchmark(self, instrument):
    dt_benchmark = {
        'csi800': "SH000906",
        'csi1000': "SH000852",
        'csi300': "SH000300",
    }
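The excerpt above stops after the dictionary. A standalone sketch of the lookup it implies might look like the following; the function name `benchmark_for` and the exact wording after "Benchmark not defined for instrument" are assumptions, not the repository's code.

```python
BENCHMARKS = {
    "csi800": "SH000906",
    "csi1000": "SH000852",
    "csi300": "SH000300",
}


def benchmark_for(instrument):
    """Return the benchmark index code for an instrument pool."""
    try:
        return BENCHMARKS[instrument]
    except KeyError:
        # Mirror the documented error for unsupported instrument names.
        raise ValueError(
            f"Benchmark not defined for instrument: {instrument}"
        ) from None
```

In this sketch the lookup is case-sensitive, so `benchmark_for("CSI300")` raises while `benchmark_for("csi300")` returns `"SH000300"`.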
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ModuleNotFoundError: No module named 'qlib'` | Qlib not installed | `pip install pyqlib` |
| `FileNotFoundError: ... cn_data` | Qlib data not downloaded | Run `python -m qlib.run.get_data qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn` |
| `FileNotFoundError: ... train_data.pkl` | Data not preprocessed | Run `qlib_data_preprocess.py` first to generate pickle files |
| `ValueError: Benchmark not defined for instrument` | Invalid instrument name | Use one of: `csi300`, `csi800`, `csi1000` |
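Several of the errors in this table can be caught before launching the pipeline. The sketch below is a hypothetical preflight check built on the standard library only; it assumes the usual top-level layout of a Qlib data directory (`calendars`, `instruments`, `features`) and the `train_data.pkl` file name shown earlier.

```python
import importlib.util
from pathlib import Path

# Top-level folders in a standard Qlib data directory.
EXPECTED_SUBDIRS = ("calendars", "instruments", "features")


def preflight(qlib_data_path, dataset_path):
    """Return a list of setup problems; an empty list means ready."""
    problems = []
    # find_spec checks importability without actually importing Qlib.
    if importlib.util.find_spec("qlib") is None:
        problems.append("qlib is not importable")
    data_dir = Path(qlib_data_path).expanduser()
    for sub in EXPECTED_SUBDIRS:
        if not (data_dir / sub).is_dir():
            problems.append(f"missing {data_dir / sub}")
    if not (Path(dataset_path) / "train_data.pkl").is_file():
        problems.append("train_data.pkl missing")
    return problems
```

A missing `train_data.pkl` is expected before preprocessing has run, so that entry only matters for the training and later stages.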
Compatibility Notes
- Region: Currently hardcoded to the Chinese market (`REG_CN`). To use another market, change the `region` argument passed to `qlib.init` and point `qlib_data_path` at data for that market.
- Instruments: CSI300, CSI800, and CSI1000 are supported. Each maps to a benchmark index (`SH000300`, `SH000906`, and `SH000852`, respectively).
- Data freshness: The downloaded Qlib data is a static snapshot; re-download it periodically to pick up recent market data.
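One way to judge data freshness is to read the last entry of the trading calendar in the local data directory. This sketch assumes the standard Qlib layout, where `calendars/day.txt` lists one `YYYY-MM-DD` date per line; the function names are hypothetical.

```python
from datetime import date
from pathlib import Path


def last_trading_day(qlib_data_path):
    """Return the last date listed in calendars/day.txt."""
    cal = Path(qlib_data_path).expanduser() / "calendars" / "day.txt"
    lines = [ln.strip() for ln in cal.read_text().splitlines() if ln.strip()]
    return date.fromisoformat(lines[-1])


def days_stale(qlib_data_path, today=None):
    """Calendar days between the last trading day on disk and `today`."""
    today = today or date.today()
    return (today - last_trading_day(qlib_data_path)).days
```

If `days_stale("~/.qlib/qlib_data/cn_data")` is larger than your backtest window can tolerate, re-running the download command refreshes the snapshot.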
Related Pages
- Implementation:Shiyu_coder_Kronos_Config_Init
- Implementation:Shiyu_coder_Kronos_QlibDataPreprocessor_Usage
- Implementation:Shiyu_coder_Kronos_QlibDataset_Usage
- Implementation:Shiyu_coder_Kronos_QlibBacktest_Usage
- Implementation:Shiyu_coder_Kronos_Generate_Predictions_Qlib
- Implementation:Shiyu_coder_Kronos_TopkDropoutStrategy_Usage