Implementation:Axolotl ai cloud Axolotl DPO SwanLab Full Featured Config

Knowledge Sources	Axolotl SwanLab
Domains	Training, Experiment_Tracking, RLHF
Last Updated	2026-02-07 00:00 GMT

Overview

A production-grade YAML configuration example that demonstrates all SwanLab integration features combined with DPO (Direct Preference Optimization) training.

Description

The file dpo-swanlab-full-featured.yml is a comprehensive reference configuration that combines DPO preference-alignment training with the full set of SwanLab experiment tracking features. It configures Llama-3-8B-Instruct with 8-bit LoRA for DPO training and demonstrates: (1) basic SwanLab cloud experiment tracking with project and experiment naming, (2) team workspace collaboration, (3) RLHF completion table logging for qualitative comparison of chosen vs. rejected responses, (4) Lark/Feishu team notifications with HMAC authentication, (5) performance profiling metrics under the profiling/ namespace, and (6) optional settings for private deployment and model checkpointing. The file also serves as operational documentation, with sections on expected training behavior, a production checklist, and troubleshooting.
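The HMAC authentication in item (4) is not spelled out on this page. As a rough sketch, assuming it follows the signing scheme Lark/Feishu documents for custom bot webhooks (HMAC-SHA256 keyed with the timestamp, a newline, and the secret, over an empty message, then Base64-encoded), a signature could be computed as shown here. The helper name lark_sign is hypothetical and is not part of Axolotl or SwanLab.

```python
import base64
import hashlib
import hmac
import time


def lark_sign(timestamp: int, secret: str) -> str:
    """Hypothetical helper: compute a Lark/Feishu custom-bot signature.

    Assumption: the HMAC key is "<timestamp>\\n<secret>" and the signed
    message is empty, as in the Lark/Feishu custom bot documentation.
    """
    key = f"{timestamp}\n{secret}".encode("utf-8")
    digest = hmac.new(key, b"", digestmod=hashlib.sha256).digest()
    return base64.b64encode(digest).decode("utf-8")


if __name__ == "__main__":
    ts = int(time.time())
    # The secret here is a placeholder; read the real one from SWANLAB_LARK_SECRET.
    payload = {"timestamp": str(ts), "sign": lark_sign(ts, "your-webhook-secret")}
    print(sorted(payload))
```

The timestamp is part of the key, so a signature is only valid for the request it was generated for.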

Usage

Use this configuration as a reference template when setting up production DPO training with SwanLab monitoring. Copy the SwanLab-specific settings (use_swanlab, swanlab_project, swanlab_log_completions, and the Lark webhook settings) into your own training config and adjust them as needed. Set SWANLAB_API_KEY, SWANLAB_LARK_WEBHOOK_URL, and SWANLAB_LARK_SECRET as environment variables rather than in the config file, so that credentials are not committed to version control.
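Before launching a run, it can help to confirm that the expected environment variables are present. The following is a minimal pre-flight sketch; the function check_swanlab_env and the split into required and optional names are illustrative (they mirror the Inputs table on this page) and are not part of Axolotl or SwanLab.

```python
import os

# Names taken from this page's Inputs table; the grouping is an assumption.
REQUIRED = ("SWANLAB_API_KEY",)
OPTIONAL = ("SWANLAB_LARK_WEBHOOK_URL", "SWANLAB_LARK_SECRET")


def check_swanlab_env(env=None):
    """Hypothetical helper: return (missing_required, unset_optional)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    unset_optional = [name for name in OPTIONAL if not env.get(name)]
    return missing, unset_optional


if __name__ == "__main__":
    missing, unset_optional = check_swanlab_env()
    if missing:
        print("Missing required variables:", ", ".join(missing))
    if unset_optional:
        print("Optional variables not set:", ", ".join(unset_optional))
```

Passing a plain dict instead of os.environ makes the check easy to exercise without touching the real environment.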

Code Reference

Source Location

Repository: Axolotl
File: examples/swanlab/dpo-swanlab-full-featured.yml
Lines: 1-329

Signature

# Key SwanLab integration fields:
plugins:
  - axolotl.integrations.swanlab.SwanLabPlugin

use_swanlab: true
swanlab_project: dpo-production
swanlab_experiment_name: llama-3-dpo-full-featured-v1
swanlab_mode: cloud
swanlab_workspace: ml-research-team
swanlab_log_completions: true
swanlab_completion_log_interval: 100
swanlab_completion_max_buffer: 256
swanlab_log_model: false
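
This page does not define the exact semantics of swanlab_completion_log_interval and swanlab_completion_max_buffer. One plausible reading, offered only as an assumption, is that completions accumulate in a bounded buffer that discards the oldest rows once full, and that the table is logged every N steps. The class below illustrates that reading; CompletionBuffer is a hypothetical name and is not the plugin's actual implementation.

```python
from collections import deque


class CompletionBuffer:
    """Hypothetical bounded buffer for chosen/rejected completions.

    Assumed semantics (not confirmed by the source): keep at most
    `max_buffer` rows and signal a log every `log_interval` steps.
    """

    def __init__(self, log_interval=100, max_buffer=256):
        self.log_interval = log_interval
        # deque(maxlen=...) silently drops the oldest row when full.
        self.rows = deque(maxlen=max_buffer)

    def add(self, step, prompt, chosen, rejected):
        """Store one row; return True when the table should be logged."""
        self.rows.append(
            {"step": step, "prompt": prompt, "chosen": chosen, "rejected": rejected}
        )
        return step % self.log_interval == 0

    def snapshot(self):
        """Return the buffered rows, oldest first."""
        return list(self.rows)
```

Under this reading, a larger buffer keeps more history in each logged table at the cost of memory, while a shorter interval refreshes the table more often.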

Import

# Run with Axolotl CLI:
export SWANLAB_API_KEY=your-api-key
accelerate launch -m axolotl.cli.train examples/swanlab/dpo-swanlab-full-featured.yml

I/O Contract

Inputs

Name	Type	Required	Description
SWANLAB_API_KEY	str (env var)	Yes	SwanLab authentication token
SWANLAB_LARK_WEBHOOK_URL	str (env var)	No	Lark bot webhook for team notifications
SWANLAB_LARK_SECRET	str (env var)	No	HMAC secret for Lark webhook authentication
DPO dataset	HuggingFace dataset	Yes	Preference pairs with chosen/rejected fields
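
Because the dataset must provide chosen and rejected fields, a quick sanity check over a sample of rows can catch malformed pairs before a long run starts. The sketch below assumes each row is a dict whose chosen and rejected values are plain strings; datasets that store responses as message lists would need a different check. The function name find_bad_pairs is hypothetical.

```python
FIELDS = ("chosen", "rejected")


def find_bad_pairs(rows):
    """Hypothetical helper: return (index, problem) for malformed rows."""
    problems = []
    for i, row in enumerate(rows):
        # A field counts as missing if absent, None, or whitespace only.
        missing = [f for f in FIELDS if not str(row.get(f) or "").strip()]
        if missing:
            problems.append((i, "missing or empty: " + ", ".join(missing)))
        elif row["chosen"] == row["rejected"]:
            # Identical responses carry no preference signal.
            problems.append((i, "chosen and rejected are identical"))
    return problems


if __name__ == "__main__":
    sample = [
        {"chosen": "Helpful answer", "rejected": "Unhelpful answer"},
        {"chosen": "Only one side"},
    ]
    for index, problem in find_bad_pairs(sample):
        print(f"row {index}: {problem}")
```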

Outputs

Name	Type	Description
Trained model	Model files	Saved to output_dir with LoRA adapters
SwanLab dashboard	Web UI	Training metrics, completion tables, profiling data
Lark notifications	Messages	Training start/complete/error notifications to team chat

Usage Examples

Run Full Featured DPO Training

# Set required environment variables
export SWANLAB_API_KEY=your-api-key
export SWANLAB_LARK_WEBHOOK_URL=https://open.feishu.cn/open-apis/bot/v2/hook/xxx
export SWANLAB_LARK_SECRET=your-webhook-secret

# Launch training
accelerate launch -m axolotl.cli.train examples/swanlab/dpo-swanlab-full-featured.yml
