Implementation:Hiyouga LLaMA Factory Train Callbacks

Knowledge Sources	Hiyouga_LLaMA_Factory
Domains	Machine Learning, Training Infrastructure
Last Updated	2026-02-06 19:00 GMT

Overview

Training callbacks for checkpoint management, progress logging, experiment tracking, and adapter conversion in LLaMA-Factory.

Description

The callbacks module defines five TrainerCallback subclasses and one utility function that handle cross-cutting concerns shared across the training stages. FixValueHeadModelCallback separates value-head weights from decoder weights at checkpoint save time for PPO training, using the fix_valuehead_checkpoint utility. SaveProcessorCallback persists the processor (tokenizer + image processor) alongside model checkpoints. PissaConvertCallback handles PiSSA-to-LoRA adapter conversion at training start and end. LogCallback tracks training progress (timing, throughput, VRAM statistics) and writes JSON Lines log records via a background thread pool, with optional Web UI integration. ReporterCallback pushes hyperparameter configurations to Weights & Biases or SwanLab at training start.
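To make the value-head split concrete, here is a minimal sketch rather than the module's actual code. It assumes value-head parameters can be recognized by a `v_head.` key prefix and that decoder parameters carry a `pretrained_model.` prefix that is stripped before saving. `split_state_dict` is a hypothetical helper, and plain Python values stand in for tensors.

```python
from typing import Any, Dict, Tuple

V_HEAD_PREFIX = "v_head."            # assumed prefix of value-head parameters
DECODER_PREFIX = "pretrained_model."  # assumed prefix of wrapped decoder parameters


def split_state_dict(state_dict: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split a combined state dict into (decoder, value_head) parts."""
    decoder: Dict[str, Any] = {}
    v_head: Dict[str, Any] = {}
    for name, param in state_dict.items():
        if name.startswith(V_HEAD_PREFIX):
            # Value-head weights keep their original names.
            v_head[name] = param
        elif name.startswith(DECODER_PREFIX):
            # Drop the wrapper prefix so the decoder loads as a plain model.
            decoder[name[len(DECODER_PREFIX):]] = param
        else:
            decoder[name] = param
    return decoder, v_head
```

In a real checkpoint fix, the two dictionaries would then be written to separate files, with the file format depending on safe_serialization.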

Usage

These callbacks are registered automatically by the training workflow modules. LogCallback and ReporterCallback are added for all training stages. FixValueHeadModelCallback is added for PPO training. SaveProcessorCallback is added when a processor is provided. PissaConvertCallback is added when PiSSA conversion is enabled.
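The registration rules above can be summarized in a short sketch. This is illustrative only: `select_callback_names`, the `"ppo"` stage string, and the two boolean flags are assumptions made for the example, not the workflow modules' real interface, and the order of the returned names carries no meaning.

```python
from typing import List


def select_callback_names(stage: str, has_processor: bool, pissa_convert: bool) -> List[str]:
    """Return the callback class names that would be registered (illustrative)."""
    # LogCallback and ReporterCallback are added for all training stages.
    names = ["LogCallback", "ReporterCallback"]
    if stage == "ppo":
        # Only PPO trains a value-head model whose checkpoint needs fixing.
        names.append("FixValueHeadModelCallback")
    if has_processor:
        names.append("SaveProcessorCallback")
    if pissa_convert:
        names.append("PissaConvertCallback")
    return names
```

For example, a hypothetical multimodal SFT run without PiSSA would call `select_callback_names("sft", True, False)` and get the two common callbacks plus SaveProcessorCallback.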

Code Reference

Source Location

Repository: Hiyouga_LLaMA_Factory
File: src/llamafactory/train/callbacks.py
Lines: 1-384

Signature

def fix_valuehead_checkpoint(
    model: "AutoModelForCausalLMWithValueHead",
    output_dir: str,
    safe_serialization: bool,
) -> None:
    """Fix the valuehead checkpoint files by separating v_head weights."""

class FixValueHeadModelCallback(TrainerCallback):
    def on_save(self, args, state, control, **kwargs): ...

class SaveProcessorCallback(TrainerCallback):
    def __init__(self, processor: "ProcessorMixin") -> None: ...
    def on_save(self, args, state, control, **kwargs): ...
    def on_train_end(self, args, state, control, **kwargs): ...

class PissaConvertCallback(TrainerCallback):
    def on_train_begin(self, args, state, control, **kwargs): ...
    def on_train_end(self, args, state, control, **kwargs): ...

class LogCallback(TrainerCallback):
    def __init__(self) -> None: ...
    def on_init_end(self, args, state, control, **kwargs): ...
    def on_train_begin(self, args, state, control, **kwargs): ...
    def on_train_end(self, args, state, control, **kwargs): ...
    def on_log(self, args, state, control, **kwargs): ...
    def on_prediction_step(self, args, state, control, **kwargs): ...

class ReporterCallback(TrainerCallback):
    def __init__(
        self,
        model_args: "ModelArguments",
        data_args: "DataArguments",
        finetuning_args: "FinetuningArguments",
        generating_args: "GeneratingArguments",
    ) -> None: ...
    def on_train_begin(self, args, state, control, **kwargs): ...

Import

from llamafactory.train.callbacks import (
    FixValueHeadModelCallback,
    SaveProcessorCallback,
    PissaConvertCallback,
    LogCallback,
    ReporterCallback,
)

I/O Contract

Inputs

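Name	Type	Required	Description
model	AutoModelForCausalLMWithValueHead	Yes (FixValueHead)	Value-head model whose checkpoint needs splitting
output_dir	str	Yes (FixValueHead)	Directory where checkpoint files are saved
safe_serialization	bool	Yes (FixValueHead)	Whether checkpoint weights are saved in safetensors format
processor	ProcessorMixin	Yes (SaveProcessor)	Tokenizer/processor to save alongside checkpoints
model_args	ModelArguments	Yes (Reporter)	Model config for experiment tracking
data_args	DataArguments	Yes (Reporter)	Data config for experiment tracking
finetuning_args	FinetuningArguments	Yes (Reporter)	Fine-tuning config for experiment tracking
generating_args	GeneratingArguments	Yes (Reporter)	Generation config for experiment tracking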
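As a rough illustration of how the Reporter inputs might be turned into a tracker config, the sketch below groups argument objects into one nested dictionary. It is an assumption-laden example: `DemoModelArgs`, `DemoFinetuningArgs`, their fields, and `build_tracker_config` are hypothetical stand-ins, and no wandb or SwanLab call is made.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass
class DemoModelArgs:
    """Hypothetical stand-in for ModelArguments."""
    model_name_or_path: str = "demo/model"


@dataclass
class DemoFinetuningArgs:
    """Hypothetical stand-in for FinetuningArguments."""
    finetuning_type: str = "lora"


def build_tracker_config(**arg_groups: Any) -> Dict[str, Dict[str, Any]]:
    """Convert each argument dataclass to a dict, keyed by its group name."""
    return {name: asdict(args) for name, args in arg_groups.items()}
```

The resulting dictionary is the kind of payload an experiment tracker's config update could receive at training start.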

Outputs

Name	Type	Description
fix_valuehead_checkpoint	None	Side effect: splits v_head weights into separate file, saves decoder weights
LogCallback logs	JSON Lines file	Writes training progress (loss, lr, epoch, throughput, VRAM) to trainer_log.jsonl, one JSON record per line
ReporterCallback	None	Side effect: updates wandb/swanlab config with all argument dictionaries
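The LogCallback output can be pictured with a small sketch of background JSON Lines logging. This is a simplified example under stated assumptions, not the callback's implementation: `estimate_remaining`, `append_record`, and `log_step` are hypothetical helpers, times are stored as raw seconds, and only a subset of the tracked fields is written.

```python
import json
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Dict


def estimate_remaining(elapsed_seconds: float, current_steps: int, total_steps: int) -> float:
    """Estimate remaining seconds from the average time per completed step."""
    if current_steps <= 0:
        return 0.0
    return elapsed_seconds / current_steps * (total_steps - current_steps)


def append_record(path: str, record: Dict[str, Any]) -> None:
    """Append one JSON object as a single line (JSON Lines format)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def log_step(pool: ThreadPoolExecutor, path: str, current_steps: int,
             total_steps: int, elapsed_seconds: float, loss: float) -> Future:
    """Build a progress record and write it on a worker thread."""
    record = {
        "current_steps": current_steps,
        "total_steps": total_steps,
        "loss": loss,
        "percentage": round(current_steps * 100 / total_steps, 2) if total_steps else 100.0,
        "elapsed_time": elapsed_seconds,
        "remaining_time": estimate_remaining(elapsed_seconds, current_steps, total_steps),
    }
    # Submitting to the pool keeps file I/O off the training loop's thread.
    return pool.submit(append_record, path, record)
```

Using a pool with a single worker, as in `ThreadPoolExecutor(max_workers=1)`, keeps records in submission order; that choice is part of this sketch and not a claim about the module.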

Usage Examples

from transformers import Trainer

from llamafactory.train.callbacks import LogCallback, ReporterCallback, SaveProcessorCallback

# Assumes model, processor, model_args, data_args, finetuning_args, and
# generating_args were created earlier in the training workflow.

# LogCallback is typically added automatically.
# It tracks: current_steps, total_steps, loss, eval_loss, lr, epoch,
# percentage, elapsed_time, remaining_time, throughput, VRAM usage
log_callback = LogCallback()

# ReporterCallback pushes config to experiment trackers
reporter = ReporterCallback(model_args, data_args, finetuning_args, generating_args)

# Callbacks can be passed to the Trainer constructor
trainer = Trainer(
    model=model,
    callbacks=[log_callback, reporter],
    # other Trainer arguments (args, train_dataset, and so on) go here
)

# Callbacks can also be registered afterwards via trainer.add_callback().
# SaveProcessorCallback preserves the processor in checkpoints.
if processor is not None:
    trainer.add_callback(SaveProcessorCallback(processor))
