Implementation:Axolotl ai cloud Axolotl HFCausalTrainerBuilder Build

From Leeroopedia


Knowledge Sources
Domains Training, Supervised_Finetuning
Last Updated 2026-02-06 23:00 GMT

Overview

A concrete tool for building and configuring the supervised fine-tuning (SFT) trainer instance provided by the Axolotl framework.

Description

The HFCausalTrainerBuilder class assembles all components needed for SFT training into an AxolotlTrainer instance. The build() method configures TrainingArguments from the Axolotl config, sets up data collators (with or without sample packing), registers callbacks (logging, early stopping, profiling), configures the optimizer and scheduler, and returns a ready-to-train AxolotlTrainer.

AxolotlTrainer extends HuggingFace's Trainer with mixins for: PackingMixin (sample packing support), SchedulerMixin (custom schedulers), OptimizerMixin (custom optimizers like Lion, ADOPT, ScheduleFree), RngLoaderMixin (deterministic RNG state loading), CheckpointSaveMixin (custom save logic), ActivationOffloadingMixin (CPU offloading), and DistributedParallelMixin (FSDP/TP save handling).
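This mixin stack relies on Python's method resolution order (MRO): each mixin can override a Trainer method, do its own work, and delegate the rest via super(). A minimal sketch of the pattern, using stand-in classes rather than the actual Axolotl mixins:

```python
# Sketch of cooperative mixins layered over a base trainer class.
# These are illustrative stand-ins, not the real Axolotl mixins.

class BaseTrainer:
    def save_checkpoint(self):
        return ["base_save"]

class CheckpointSaveMixin:
    # Adds custom save logic, then delegates down the MRO.
    def save_checkpoint(self):
        return ["custom_rng_state"] + super().save_checkpoint()

class DistributedParallelMixin:
    # Listed first in the bases, so it gets the call first.
    def save_checkpoint(self):
        return ["gather_fsdp_shards"] + super().save_checkpoint()

class SketchTrainer(DistributedParallelMixin, CheckpointSaveMixin, BaseTrainer):
    pass

print(SketchTrainer().save_checkpoint())
# -> ['gather_fsdp_shards', 'custom_rng_state', 'base_save']
```

Each mixin stays independent and testable, while the combined class composes all behaviors in a predictable order.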

Usage

This implementation is used internally by Axolotl's training pipeline. The setup_trainer utility function routes to this builder when cfg.rl is not set (SFT training).
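The routing decision can be sketched as follows. Note this is a simplified illustration of the cfg.rl check described above; the function name and the RL builder name are stand-ins, not the actual Axolotl code.

```python
# Simplified sketch of setup_trainer's builder routing.
# Names other than HFCausalTrainerBuilder are illustrative stand-ins.

def select_builder_name(cfg: dict) -> str:
    """Route to the SFT builder when cfg.rl is not set."""
    if cfg.get("rl"):
        return "HFRLTrainerBuilder"       # RL path (assumed name)
    return "HFCausalTrainerBuilder"       # SFT path

assert select_builder_name({"rl": None}) == "HFCausalTrainerBuilder"
assert select_builder_name({"rl": "dpo"}) == "HFRLTrainerBuilder"
```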

Code Reference

Source Location

  • Repository: axolotl
  • File: src/axolotl/core/builders/causal.py (builder), src/axolotl/core/trainers/base.py (trainer)
  • Lines: causal.py L53-530 (class), L157-438 (build method); base.py L64-775 (AxolotlTrainer class)

Signature

class HFCausalTrainerBuilder:
    """Builder for the Axolotl causal language model trainer."""

    def __init__(self, cfg, model, tokenizer, processor=None):
        """
        Args:
            cfg: Training configuration.
            model: The model to train.
            tokenizer: The tokenizer.
            processor: Optional multimodal processor.
        """

    def build(self, total_num_steps: int) -> AxolotlTrainer:
        """Build a configured AxolotlTrainer instance.

        Args:
            total_num_steps: Total training steps for scheduler configuration.

        Returns:
            AxolotlTrainer: Configured trainer ready for .train() call.
        """


class AxolotlTrainer(
    PackingMixin,
    SchedulerMixin,
    OptimizerMixin,
    RngLoaderMixin,
    CheckpointSaveMixin,
    ActivationOffloadingMixin,
    DistributedParallelMixin,
    Trainer,
):
    """Extended HuggingFace Trainer with Axolotl-specific features."""

Import

from axolotl.core.builders.causal import HFCausalTrainerBuilder
from axolotl.core.trainers.base import AxolotlTrainer

I/O Contract

Inputs

Name            | Type                          | Required           | Description
cfg             | DictDefault                   | Yes                | Full training config with num_epochs, learning_rate, micro_batch_size, gradient_accumulation_steps, optimizer, scheduler, sample_packing, etc.
model           | PreTrainedModel or PeftModel  | Yes                | Model to train (with or without LoRA)
tokenizer       | PreTrainedTokenizer           | Yes                | Tokenizer for data collation
processor       | ProcessorMixin or None        | No                 | Optional multimodal processor
total_num_steps | int                           | Yes (build() only) | Total training steps for the LR scheduler
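To make the cfg contract concrete, here is an illustrative SFT config covering the keys listed above. The values are examples only, not recommended defaults, and the exact key names (e.g. the scheduler key) should be checked against the Axolotl config reference.

```python
# Illustrative SFT config dict covering the keys from the inputs table.
# Values are examples, not recommended defaults.
sft_cfg = {
    "num_epochs": 3,
    "learning_rate": 2e-5,
    "micro_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "optimizer": "adamw_torch",
    "lr_scheduler": "cosine",        # assumed key name for the scheduler
    "sample_packing": True,
    "output_dir": "./outputs/sft-run",
}
```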

Outputs

Name            | Type           | Description
trainer         | AxolotlTrainer | Configured trainer with optimizer, scheduler, data collators, and callbacks ready
train() return  | TrainOutput    | Contains global_step, training_loss, and metrics
checkpoints     | Files          | Saved to cfg.output_dir at configured intervals
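The total_num_steps argument passed to build() can be estimated from the dataset size and batching config. A rough sketch follows; the helper name is hypothetical, Axolotl computes this value during dataset preparation, and sample packing changes the effective sample count:

```python
import math

def estimate_total_num_steps(num_samples: int, micro_batch_size: int,
                             gradient_accumulation_steps: int,
                             num_epochs: int, world_size: int = 1) -> int:
    """Approximate number of optimizer steps (hypothetical helper).

    Ignores sample packing, which reduces the effective sample count.
    """
    effective_batch = micro_batch_size * gradient_accumulation_steps * world_size
    steps_per_epoch = math.ceil(num_samples / effective_batch)
    return steps_per_epoch * num_epochs

# 10,000 samples, micro batch 2, grad accum 4, 3 epochs, single GPU:
print(estimate_total_num_steps(10_000, 2, 4, 3))  # -> 3750
```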

Usage Examples

Building and Running SFT Training

from axolotl.core.builders.causal import HFCausalTrainerBuilder

# Build the trainer
builder = HFCausalTrainerBuilder(cfg, model, tokenizer)
trainer = builder.build(total_num_steps=dataset_meta.total_num_steps)

# Set datasets
trainer.train_dataset = dataset_meta.train_dataset
trainer.eval_dataset = dataset_meta.eval_dataset

# Execute training
train_result = trainer.train(resume_from_checkpoint=cfg.resume_from_checkpoint)
print(f"Training loss: {train_result.training_loss}")

Using the High-Level Train Function

from axolotl.train import train

# The train function handles builder selection, trainer creation, and execution
model, tokenizer, trainer = train(cfg=cfg, dataset_meta=dataset_meta)

