Implementation: Axolotl HFCausalTrainerBuilder build()
| Knowledge Sources | Details |
|---|---|
| Domains | Training, Supervised_Finetuning |
| Last Updated | 2026-02-06 23:00 GMT |
Overview
Concrete tool for building and configuring the SFT trainer instance provided by the Axolotl framework.
Description
The HFCausalTrainerBuilder class assembles all components needed for SFT training into an AxolotlTrainer instance. The build() method configures TrainingArguments from the Axolotl config, sets up data collators (with or without sample packing), registers callbacks (logging, early stopping, profiling), configures the optimizer and scheduler, and returns a ready-to-train AxolotlTrainer.
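The assembly steps above can be sketched in plain Python. This is a hedged outline, not Axolotl's implementation; the collator and callback names are illustrative stand-ins, and the real code (in src/axolotl/core/builders/causal.py) produces transformers.TrainingArguments and concrete collator/callback objects.

```python
# Hedged sketch (not Axolotl's code) of what build() assembles, per the
# description above. All string values are illustrative stand-ins.

def assemble_trainer_parts(cfg, total_num_steps):
    parts = {}
    # 1. Training arguments derived from the Axolotl config
    parts["args"] = {
        "num_train_epochs": cfg["num_epochs"],
        "learning_rate": cfg["learning_rate"],
        "per_device_train_batch_size": cfg["micro_batch_size"],
        "gradient_accumulation_steps": cfg["gradient_accumulation_steps"],
        "max_steps": total_num_steps,
    }
    # 2. Data collator: packing-aware when sample_packing is enabled
    parts["collator"] = (
        "PackingCollator" if cfg.get("sample_packing") else "CausalLMCollator"
    )
    # 3. Callbacks registered on the trainer
    parts["callbacks"] = ["logging", "early_stopping", "profiling"]
    return parts

parts = assemble_trainer_parts(
    {"num_epochs": 1, "learning_rate": 2e-4, "micro_batch_size": 2,
     "gradient_accumulation_steps": 4, "sample_packing": True},
    total_num_steps=100,
)
print(parts["collator"])  # packing enabled, so the packing-aware collator
```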
AxolotlTrainer extends HuggingFace's Trainer with mixins for: PackingMixin (sample packing support), SchedulerMixin (custom schedulers), OptimizerMixin (custom optimizers like Lion, ADOPT, ScheduleFree), RngLoaderMixin (deterministic RNG state loading), CheckpointSaveMixin (custom save logic), ActivationOffloadingMixin (CPU offloading), and DistributedParallelMixin (FSDP/TP save handling).
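The mixin composition works through Python's method resolution order: mixins listed before the base Trainer can override its methods and cooperate via super(). A toy illustration (simplified stand-in names, not Axolotl code):

```python
# Toy illustration of the mixin pattern described above. Class names are
# simplified stand-ins; the real mixins override methods like optimizer
# creation, scheduler creation, and checkpoint saving.

class Trainer:  # stand-in for transformers.Trainer
    def save_checkpoint(self):
        return ["base-save"]

class CheckpointSaveMixin:  # stand-in for Axolotl's custom-save mixin
    def save_checkpoint(self):
        # Custom logic runs first, then delegates down the MRO.
        return ["custom-save"] + super().save_checkpoint()

class SchedulerMixin:  # stand-in for the custom-scheduler mixin
    def save_checkpoint(self):
        return ["scheduler-state"] + super().save_checkpoint()

class ToyAxolotlTrainer(SchedulerMixin, CheckpointSaveMixin, Trainer):
    """Mixins listed first take precedence in the MRO."""

# MRO: SchedulerMixin -> CheckpointSaveMixin -> Trainer
print(ToyAxolotlTrainer().save_checkpoint())
# -> ['scheduler-state', 'custom-save', 'base-save']
```

Because every mixin delegates via super(), each layer of custom behavior composes cleanly with the HuggingFace base class rather than replacing it.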
Usage
This implementation is used internally by Axolotl's training pipeline. The setup_trainer utility function routes to this builder when cfg.rl is not set (SFT training).
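The routing rule can be sketched as follows; this is a simplification of setup_trainer, and the RL-path builder name here is a hypothetical stand-in:

```python
# Hedged sketch of the dispatch described above: the SFT builder is chosen
# when cfg.rl is not set. Not the real setup_trainer body.

def select_builder_name(cfg: dict) -> str:
    if cfg.get("rl"):
        return "RLTrainerBuilder"      # hypothetical name for the RL path
    return "HFCausalTrainerBuilder"    # SFT path (this page)

print(select_builder_name({}))             # cfg.rl unset -> SFT builder
print(select_builder_name({"rl": "dpo"}))  # cfg.rl set -> RL path
```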
Code Reference
Source Location
- Repository: axolotl
- File: src/axolotl/core/builders/causal.py (builder), src/axolotl/core/trainers/base.py (trainer)
- Lines: causal.py L53-530 (class), L157-438 (build method); base.py L64-775 (AxolotlTrainer class)
Signature
class HFCausalTrainerBuilder:
    """Builder for the Axolotl causal language model trainer."""

    def __init__(self, cfg, model, tokenizer, processor=None):
        """
        Args:
            cfg: Training configuration.
            model: The model to train.
            tokenizer: The tokenizer.
            processor: Optional multimodal processor.
        """

    def build(self, total_num_steps: int) -> AxolotlTrainer:
        """Build a configured AxolotlTrainer instance.

        Args:
            total_num_steps: Total training steps for scheduler configuration.

        Returns:
            AxolotlTrainer: Configured trainer ready for .train() call.
        """

class AxolotlTrainer(
    PackingMixin,
    SchedulerMixin,
    OptimizerMixin,
    RngLoaderMixin,
    CheckpointSaveMixin,
    ActivationOffloadingMixin,
    DistributedParallelMixin,
    Trainer,
):
    """Extended HuggingFace Trainer with Axolotl-specific features."""
Import
from axolotl.core.builders.causal import HFCausalTrainerBuilder
from axolotl.core.trainers.base import AxolotlTrainer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cfg | DictDefault | Yes | Full training config with num_epochs, learning_rate, micro_batch_size, gradient_accumulation_steps, optimizer, scheduler, sample_packing, etc. |
| model | PreTrainedModel or PeftModel | Yes | Model to train (with or without LoRA) |
| tokenizer | PreTrainedTokenizer | Yes | Tokenizer for data collation |
| processor | ProcessorMixin or None | No | Optional multimodal processor |
| total_num_steps | int | Yes (for build()) | Total training steps for LR scheduler |
Outputs
| Name | Type | Description |
|---|---|---|
| trainer | AxolotlTrainer | Configured trainer with optimizer, scheduler, data collators, and callbacks ready |
| train() returns | TrainOutput | Contains global_step, training_loss, metrics |
| checkpoints | Files | Saved to cfg.output_dir at configured intervals |
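For reference, a minimal config fragment covering the keys named in the Inputs table might look like the following. The values are illustrative only; the real cfg is a DictDefault parsed from Axolotl's YAML config and carries many more options.

```python
# Illustrative values for the config keys listed in the Inputs table.
# Not a complete or validated Axolotl config.
cfg = {
    "num_epochs": 3,
    "learning_rate": 2e-4,
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "optimizer": "adamw_torch",
    "scheduler": "cosine",        # key name as listed in the table above
    "sample_packing": True,
    "output_dir": "./outputs",    # checkpoints are written here
}

# Per-device effective batch size implied by these settings
effective_batch = cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"]
print(effective_batch)  # -> 8
```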
Usage Examples
Building and Running SFT Training
from axolotl.core.builders.causal import HFCausalTrainerBuilder
# Build the trainer
builder = HFCausalTrainerBuilder(cfg, model, tokenizer)
trainer = builder.build(total_num_steps=dataset_meta.total_num_steps)
# Set datasets
trainer.train_dataset = dataset_meta.train_dataset
trainer.eval_dataset = dataset_meta.eval_dataset
# Execute training
train_result = trainer.train(resume_from_checkpoint=cfg.resume_from_checkpoint)
print(f"Training loss: {train_result.training_loss}")
Using the High-Level Train Function
from axolotl.train import train
# The train function handles builder selection, trainer creation, and execution
model, tokenizer, trainer = train(cfg=cfg, dataset_meta=dataset_meta)