Implementation: Axolotl HFCausalTrainerBuilder build()
| Knowledge Sources | Details |
|---|---|
| Domains | Training, Supervised_Finetuning |
| Last Updated | 2026-02-06 23:00 GMT |
Overview
Concrete tool for building and configuring the SFT trainer instance provided by the Axolotl framework.
Description
The HFCausalTrainerBuilder class assembles all components needed for SFT training into an AxolotlTrainer instance. The build() method configures TrainingArguments from the Axolotl config, sets up data collators (with or without sample packing), registers callbacks (logging, early stopping, profiling), configures the optimizer and scheduler, and returns a ready-to-train AxolotlTrainer.
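The assembly steps above can be sketched in plain Python. This is a hedged outline, not Axolotl's implementation; the collator and callback names are illustrative stand-ins, and the real code (in src/axolotl/core/builders/causal.py) produces transformers.TrainingArguments and concrete collator/callback objects.

```python
# Hedged sketch (not Axolotl's code) of what build() assembles, per the
# description above. All string values are illustrative stand-ins.

def assemble_trainer_parts(cfg, total_num_steps):
    parts = {}
    # 1. Training arguments derived from the Axolotl config
    parts["args"] = {
        "num_train_epochs": cfg["num_epochs"],
        "learning_rate": cfg["learning_rate"],
        "per_device_train_batch_size": cfg["micro_batch_size"],
        "gradient_accumulation_steps": cfg["gradient_accumulation_steps"],
        "max_steps": total_num_steps,
    }
    # 2. Data collator: packing-aware when sample_packing is enabled
    parts["collator"] = (
        "PackingCollator" if cfg.get("sample_packing") else "CausalLMCollator"
    )
    # 3. Callbacks registered on the trainer
    parts["callbacks"] = ["logging", "early_stopping", "profiling"]
    return parts

parts = assemble_trainer_parts(
    {"num_epochs": 1, "learning_rate": 2e-4, "micro_batch_size": 2,
     "gradient_accumulation_steps": 4, "sample_packing": True},
    total_num_steps=100,
)
print(parts["collator"])  # packing enabled, so the packing-aware collator
```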
AxolotlTrainer extends HuggingFace's Trainer with mixins for: PackingMixin (sample packing support), SchedulerMixin (custom schedulers), OptimizerMixin (custom optimizers like Lion, ADOPT, ScheduleFree), RngLoaderMixin (deterministic RNG state loading), CheckpointSaveMixin (custom save logic), ActivationOffloadingMixin (CPU offloading), and DistributedParallelMixin (FSDP/TP save handling).
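The mixin composition works through Python's method resolution order: mixins listed before the base Trainer can override its methods and cooperate via super(). A toy illustration (simplified stand-in names, not Axolotl code):

```python
# Toy illustration of the mixin pattern described above. Class names are
# simplified stand-ins; the real mixins override methods like optimizer
# creation, scheduler creation, and checkpoint saving.

class Trainer:  # stand-in for transformers.Trainer
    def save_checkpoint(self):
        return ["base-save"]

class CheckpointSaveMixin:  # stand-in for Axolotl's custom-save mixin
    def save_checkpoint(self):
        # Custom logic runs first, then delegates down the MRO.
        return ["custom-save"] + super().save_checkpoint()

class SchedulerMixin:  # stand-in for the custom-scheduler mixin
    def save_checkpoint(self):
        return ["scheduler-state"] + super().save_checkpoint()

class ToyAxolotlTrainer(SchedulerMixin, CheckpointSaveMixin, Trainer):
    """Mixins listed first take precedence in the MRO."""

# MRO: SchedulerMixin -> CheckpointSaveMixin -> Trainer
print(ToyAxolotlTrainer().save_checkpoint())
# -> ['scheduler-state', 'custom-save', 'base-save']
```

Because every mixin delegates via super(), each layer of custom behavior composes cleanly with the HuggingFace base class rather than replacing it.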
Usage
This implementation is used internally by Axolotl's training pipeline. The setup_trainer utility function routes to this builder when cfg.rl is not set (SFT training).
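The routing rule can be sketched as follows; this is a simplification of setup_trainer, and the RL-path builder name here is a hypothetical stand-in:

```python
# Hedged sketch of the dispatch described above: the SFT builder is chosen
# when cfg.rl is not set. Not the real setup_trainer body.

def select_builder_name(cfg: dict) -> str:
    if cfg.get("rl"):
        return "RLTrainerBuilder"      # hypothetical name for the RL path
    return "HFCausalTrainerBuilder"    # SFT path (this page)

print(select_builder_name({}))             # cfg.rl unset -> SFT builder
print(select_builder_name({"rl": "dpo"}))  # cfg.rl set -> RL path
```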
Code Reference
Source Location
- Repository: axolotl
- File: src/axolotl/core/builders/causal.py (builder), src/axolotl/core/trainers/base.py (trainer)
- Lines: causal.py L53-530 (class), L157-438 (build method); base.py L64-775 (AxolotlTrainer class)
Signature
class HFCausalTrainerBuilder:
    """Builder for the Axolotl causal language model trainer."""

    def __init__(self, cfg, model, tokenizer, processor=None):
        """
        Args:
            cfg: Training configuration.
            model: The model to train.
            tokenizer: The tokenizer.
            processor: Optional multimodal processor.
        """

    def build(self, total_num_steps: int) -> AxolotlTrainer:
        """Build a configured AxolotlTrainer instance.

        Args:
            total_num_steps: Total training steps for scheduler configuration.

        Returns:
            AxolotlTrainer: Configured trainer ready for .train() call.
        """

class AxolotlTrainer(
    PackingMixin,
    SchedulerMixin,
    OptimizerMixin,
    RngLoaderMixin,
    CheckpointSaveMixin,
    ActivationOffloadingMixin,
    DistributedParallelMixin,
    Trainer,
):
    """Extended HuggingFace Trainer with Axolotl-specific features."""
Import
from axolotl.core.builders.causal import HFCausalTrainerBuilder
from axolotl.core.trainers.base import AxolotlTrainer
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| cfg | DictDefault | Yes | Full training config with num_epochs, learning_rate, micro_batch_size, gradient_accumulation_steps, optimizer, scheduler, sample_packing, etc. |
| model | PreTrainedModel or PeftModel | Yes | Model to train (with or without LoRA) |
| tokenizer | PreTrainedTokenizer | Yes | Tokenizer for data collation |
| processor | ProcessorMixin or None | No | Optional multimodal processor |
| total_num_steps | int | Yes (for build()) | Total training steps for LR scheduler |
Outputs
| Name | Type | Description |
|---|---|---|
| trainer | AxolotlTrainer | Configured trainer with optimizer, scheduler, data collators, and callbacks ready |
| train() returns | TrainOutput | Contains global_step, training_loss, metrics |
| checkpoints | Files | Saved to cfg.output_dir at configured intervals |
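For reference, a minimal config fragment covering the keys named in the Inputs table might look like the following. The values are illustrative only; the real cfg is a DictDefault parsed from Axolotl's YAML config and carries many more options.

```python
# Illustrative values for the config keys listed in the Inputs table.
# Not a complete or validated Axolotl config.
cfg = {
    "num_epochs": 3,
    "learning_rate": 2e-4,
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "optimizer": "adamw_torch",
    "scheduler": "cosine",        # key name as listed in the table above
    "sample_packing": True,
    "output_dir": "./outputs",    # checkpoints are written here
}

# Per-device effective batch size implied by these settings
effective_batch = cfg["micro_batch_size"] * cfg["gradient_accumulation_steps"]
print(effective_batch)  # -> 8
```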
Usage Examples
Building and Running SFT Training
from axolotl.core.builders.causal import HFCausalTrainerBuilder
# Build the trainer
builder = HFCausalTrainerBuilder(cfg, model, tokenizer)
trainer = builder.build(total_num_steps=dataset_meta.total_num_steps)
# Set datasets
trainer.train_dataset = dataset_meta.train_dataset
trainer.eval_dataset = dataset_meta.eval_dataset
# Execute training
train_result = trainer.train(resume_from_checkpoint=cfg.resume_from_checkpoint)
print(f"Training loss: {train_result.training_loss}")
Using the High-Level Train Function
from axolotl.train import train
# The train function handles builder selection, trainer creation, and execution
model, tokenizer, trainer = train(cfg=cfg, dataset_meta=dataset_meta)