Implementation:Axolotl ai cloud Axolotl DPO SwanLab Full Featured Config
| Knowledge Sources | |
|---|---|
| Domains | Training, Experiment_Tracking, RLHF |
| Last Updated | 2026-02-07 00:00 GMT |
Overview
A production-grade YAML configuration example that demonstrates every SwanLab integration feature alongside DPO (Direct Preference Optimization) training.
Description
dpo-swanlab-full-featured.yml is a comprehensive reference configuration that combines DPO preference-alignment training with the full suite of SwanLab experiment-tracking features. It trains Llama-3-8B-Instruct with 8-bit LoRA and demonstrates: (1) basic SwanLab cloud experiment tracking with project and experiment naming, (2) team workspace collaboration, (3) RLHF completion table logging for qualitative comparison of chosen and rejected responses, (4) Lark/Feishu team notifications with HMAC authentication, (5) performance profiling metrics under the profiling/ namespace, and (6) optional private deployment and model checkpointing settings. The file also doubles as operational documentation, with sections on expected training behavior, production checklists, and troubleshooting guides.
Usage
Use this configuration as a reference template when setting up production DPO training with SwanLab monitoring. Copy the SwanLab-specific sections (use_swanlab, swanlab_project, swanlab_log_completions, and the Lark webhook settings) into your own training config and adjust them to your project. Supply SWANLAB_API_KEY, SWANLAB_LARK_WEBHOOK_URL, and SWANLAB_LARK_SECRET as environment variables rather than hard-coding them in the config file.
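As a rough illustration of copying the SwanLab-specific fields into an existing config, the sketch below overlays them onto a plain dictionary. The helper name add_swanlab, the placeholder base config, and the choice of which fields to copy are assumptions made for this example; only the field names and the plugin path come from the reference file.

```python
from typing import Optional

# Plugin path and field names are taken from the reference config;
# the values are the ones shown there and should be adapted per project.
PLUGIN = "axolotl.integrations.swanlab.SwanLabPlugin"
SWANLAB_FIELDS = {
    "use_swanlab": True,
    "swanlab_project": "dpo-production",
    "swanlab_experiment_name": "llama-3-dpo-full-featured-v1",
    "swanlab_mode": "cloud",
    "swanlab_log_completions": True,
}


def add_swanlab(config: dict, overrides: Optional[dict] = None) -> dict:
    """Hypothetical helper: return a copy of `config` with SwanLab settings added."""
    merged = dict(config)
    # Append the plugin without disturbing plugins that are already listed.
    plugins = list(merged.get("plugins") or [])
    if PLUGIN not in plugins:
        plugins.append(PLUGIN)
    merged["plugins"] = plugins
    merged.update(SWANLAB_FIELDS)
    merged.update(overrides or {})  # project-specific values win
    return merged


# Placeholder base config; the keys here are illustrative only.
my_config = {"base_model": "my-org/my-model", "output_dir": "./outputs/my-run"}
result = add_swanlab(my_config, {"swanlab_project": "my-dpo-run"})
print(result["swanlab_project"], result["plugins"])
```

The API key and webhook credentials are deliberately left out of the merged dictionary so that they stay in environment variables.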
Code Reference
Source Location
- Repository: Axolotl
- File: examples/swanlab/dpo-swanlab-full-featured.yml
- Lines: 1-329
Signature
# Key SwanLab integration fields:
plugins:
  - axolotl.integrations.swanlab.SwanLabPlugin
use_swanlab: true
swanlab_project: dpo-production
swanlab_experiment_name: llama-3-dpo-full-featured-v1
swanlab_mode: cloud
swanlab_workspace: ml-research-team
swanlab_log_completions: true
swanlab_completion_log_interval: 100
swanlab_completion_max_buffer: 256
swanlab_log_model: false
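One plausible reading of swanlab_completion_log_interval and swanlab_completion_max_buffer is that completions accumulate in a memory-bounded buffer and are uploaded every N steps. The sketch below models that reading with a deque. The CompletionBuffer class is hypothetical and is not the plugin's implementation; in particular, whether the real plugin clears the buffer after each upload is not stated in the source.

```python
from collections import deque

LOG_INTERVAL = 100  # mirrors swanlab_completion_log_interval
MAX_BUFFER = 256    # mirrors swanlab_completion_max_buffer


class CompletionBuffer:
    """Hypothetical model of interval-based logging with a bounded buffer."""

    def __init__(self, log_interval: int, max_buffer: int):
        self.log_interval = log_interval
        # deque(maxlen=...) silently drops the oldest row once the cap is hit.
        self.rows = deque(maxlen=max_buffer)
        self.flushes = []  # (step, rows_in_buffer) recorded at each upload

    def add(self, step: int, prompt: str, chosen: str, rejected: str) -> None:
        self.rows.append(
            {"step": step, "prompt": prompt, "chosen": chosen, "rejected": rejected}
        )
        if step % self.log_interval == 0:
            self.flush(step)

    def flush(self, step: int) -> None:
        # Stand-in for uploading a completion table; this sketch keeps the rows.
        self.flushes.append((step, len(self.rows)))


buf = CompletionBuffer(LOG_INTERVAL, MAX_BUFFER)
for step in range(1, 301):
    buf.add(step, f"prompt {step}", "chosen text", "rejected text")
print(buf.flushes)
```

Under these assumptions the buffer never holds more than 256 rows regardless of run length, which is the point of a max-buffer setting on long training jobs.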
Import
# Run with Axolotl CLI:
export SWANLAB_API_KEY=your-api-key
accelerate launch -m axolotl.cli.train examples/swanlab/dpo-swanlab-full-featured.yml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| SWANLAB_API_KEY | str (env var) | Yes | SwanLab authentication token |
| SWANLAB_LARK_WEBHOOK_URL | str (env var) | No | Lark bot webhook for team notifications |
| SWANLAB_LARK_SECRET | str (env var) | No | HMAC secret for Lark webhook authentication |
| DPO dataset | HuggingFace dataset | Yes | Preference pairs with chosen/rejected fields |
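A small pre-flight check can catch a missing credential before a long job starts. The following is a minimal sketch that uses the required/optional split from the table above; the function name check_env is an assumption for this example and is not part of Axolotl or SwanLab.

```python
import os

# Required/optional split follows the Inputs table.
REQUIRED = ("SWANLAB_API_KEY",)
OPTIONAL = ("SWANLAB_LARK_WEBHOOK_URL", "SWANLAB_LARK_SECRET")


def check_env(env=None):
    """Hypothetical helper: return (missing_required, unset_optional) names."""
    env = os.environ if env is None else env
    # Empty strings are treated the same as unset variables.
    missing = [name for name in REQUIRED if not env.get(name)]
    unset_optional = [name for name in OPTIONAL if not env.get(name)]
    return missing, unset_optional


# Example with an explicit mapping; call check_env() to inspect the real environment.
missing, unset_optional = check_env({"SWANLAB_API_KEY": "example-key"})
print("missing required:", missing)
print("unset optional:", unset_optional)
```

In a launch script, a non-empty missing list would be a reasonable point to abort before calling accelerate launch.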
Outputs
| Name | Type | Description |
|---|---|---|
| Trained model | Model files | Saved to output_dir with LoRA adapters |
| SwanLab dashboard | Web UI | Training metrics, completion tables, profiling data |
| Lark notifications | Messages | Training start/complete/error notifications to team chat |
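For readers curious about what HMAC authentication of a Lark notification involves, the sketch below follows the signing scheme that Lark/Feishu documents for custom bots: the timestamp and secret are joined with a newline, used as the HMAC-SHA256 key over an empty message, and the digest is Base64-encoded. The source does not say that the SwanLab plugin implements it exactly this way, so the function lark_sign and the payload layout should be read as an assumption-laden illustration.

```python
import base64
import hashlib
import hmac
import time


def lark_sign(timestamp: int, secret: str) -> str:
    """Illustrative signature for a Lark/Feishu custom-bot webhook."""
    # The "timestamp\nsecret" string is the HMAC key; the message is empty.
    string_to_sign = f"{timestamp}\n{secret}"
    digest = hmac.new(
        string_to_sign.encode("utf-8"), digestmod=hashlib.sha256
    ).digest()
    return base64.b64encode(digest).decode("utf-8")


# Placeholder secret; in practice it would come from SWANLAB_LARK_SECRET.
ts = int(time.time())
payload = {
    "timestamp": str(ts),
    "sign": lark_sign(ts, "your-webhook-secret"),
    "msg_type": "text",
    "content": {"text": "DPO training started"},
}
print(sorted(payload))
```

Because the timestamp is part of the signing input, a captured request cannot simply be replayed later with the same signature.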
Usage Examples
Run Full Featured DPO Training
# Set required environment variables
export SWANLAB_API_KEY=your-api-key
export SWANLAB_LARK_WEBHOOK_URL=https://open.feishu.cn/open-apis/bot/v2/hook/xxx
export SWANLAB_LARK_SECRET=your-webhook-secret
# Launch training
accelerate launch -m axolotl.cli.train examples/swanlab/dpo-swanlab-full-featured.yml