Implementation:Hiyouga LLaMA Factory PT Workflow

From Leeroopedia


Knowledge Sources
Domains: Pre-training, Language Modeling, Training Workflow
Last Updated: 2026-02-06 19:00 GMT

Overview

run_pt is the end-to-end orchestrator function for causal language model pre-training, including continual and domain-adaptive pre-training.

Description

The run_pt function loads the tokenizer, template, dataset at the "pt" stage, and the model, then creates a DataCollatorForLanguageModeling with MLM disabled for causal LM training. It initializes a CustomTrainer and drives the complete pipeline: training with loss plotting, evaluation with perplexity computation from evaluation loss, metric logging, and model card creation. The function supports multiple evaluation datasets and computes per-dataset perplexity when a dictionary of eval datasets is provided.
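To make "MLM disabled" concrete, the following is a minimal pure-Python sketch of causal-LM collation: sequences are padded to a common length, and the labels are a copy of the input IDs with pad tokens replaced by -100 so they do not contribute to the loss. The helper name `collate_causal_lm` and the pad ID are hypothetical and used only for illustration; the actual DataCollatorForLanguageModeling works on tensors and pads through the tokenizer.

```python
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def collate_causal_lm(sequences, pad_token_id):
    """Pad token-ID lists to equal length and derive causal-LM labels.

    Hypothetical helper for illustration only; not part of LLaMA Factory.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask, labels = [], [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded = list(seq) + [pad_token_id] * n_pad
        input_ids.append(padded)
        attention_mask.append([1] * len(seq) + [0] * n_pad)
        # With MLM disabled nothing is masked at random: labels copy the inputs,
        # and pad tokens are replaced so they are ignored by the loss.
        labels.append([IGNORE_INDEX if tok == pad_token_id else tok for tok in padded])
    return {"input_ids": input_ids, "attention_mask": attention_mask, "labels": labels}


batch = collate_causal_lm([[5, 6, 7], [8, 9]], pad_token_id=0)
```

Note that the labels are not shifted here; Hugging Face causal LM heads shift them internally when computing the next-token loss.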

Usage

Use run_pt when performing continual pre-training or domain-adaptive pre-training on a causal language model. The framework's training dispatcher invokes it when the training stage is set to "pt". It takes no generation arguments, since pre-training does not involve text generation.
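Stage-based dispatch of this kind can be sketched as a lookup from stage name to workflow function. Everything below is an assumption made for illustration: the stub functions, the `WORKFLOWS` table, the `dispatch` helper, and the second stage name are hypothetical, and the framework's real dispatcher may be organized differently.

```python
def run_pt_stub(**kwargs):
    """Stand-in for run_pt so the sketch runs without LLaMA Factory installed."""
    return "pt workflow"


def run_sft_stub(**kwargs):
    """Stand-in for a second, hypothetical workflow."""
    return "sft workflow"


# Hypothetical stage-to-workflow table (illustrative only).
WORKFLOWS = {"pt": run_pt_stub, "sft": run_sft_stub}


def dispatch(stage, **kwargs):
    """Look up the workflow for `stage` and call it with the given arguments."""
    try:
        workflow = WORKFLOWS[stage]
    except KeyError:
        raise ValueError(f"Unknown training stage: {stage!r}") from None
    return workflow(**kwargs)


result = dispatch("pt")
```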

Code Reference

Source Location

Signature

def run_pt(
    model_args: "ModelArguments",
    data_args: "DataArguments",
    training_args: "Seq2SeqTrainingArguments",
    finetuning_args: "FinetuningArguments",
    callbacks: Optional[list["TrainerCallback"]] = None,
) -> None

Import

from llamafactory.train.pt.workflow import run_pt

I/O Contract

Inputs

Name | Type | Required | Description
model_args | ModelArguments | Yes | Model configuration including model path and compute dtype
data_args | DataArguments | Yes | Dataset configuration for pre-training data loading
training_args | Seq2SeqTrainingArguments | Yes | Training hyperparameters; do_train, do_eval flags control workflow phases
finetuning_args | FinetuningArguments | Yes | Fine-tuning settings including plot_loss flag
callbacks | Optional[list[TrainerCallback]] | No | Additional trainer callbacks

Outputs

Name | Type | Description
(none) | None | Side effects: saves model, metrics (including perplexity), trainer state, loss plots, and model card to output_dir

Usage Examples

# Typical invocation for continual pre-training.
# Assumes the four argument objects have already been built from the training configuration.
from llamafactory.train.pt.workflow import run_pt

run_pt(
    model_args=model_args,
    data_args=data_args,
    training_args=training_args,
    finetuning_args=finetuning_args,
    callbacks=None,
)

# After evaluation, metrics will contain eval_perplexity computed as:
# perplexity = math.exp(eval_loss)
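The perplexity step, including the per-dataset case, can be sketched as follows. This is a minimal illustration under stated assumptions, not the function's actual code: the helper `add_perplexity` is hypothetical, and the per-dataset key pattern `eval_<name>_loss` / `eval_<name>_perplexity` is assumed rather than taken from the source.

```python
import math


def add_perplexity(metrics, eval_dataset_names=None):
    """Return a copy of `metrics` with perplexity entries derived from eval losses.

    Hypothetical helper; the "eval_<name>_loss" key pattern is an assumption.
    """
    result = dict(metrics)
    if eval_dataset_names:
        # One loss/perplexity pair per named evaluation dataset.
        key_pairs = [(f"eval_{n}_loss", f"eval_{n}_perplexity") for n in eval_dataset_names]
    else:
        key_pairs = [("eval_loss", "eval_perplexity")]
    for loss_key, ppl_key in key_pairs:
        try:
            result[ppl_key] = math.exp(result[loss_key])
        except OverflowError:
            # math.exp overflows for very large losses; report infinity instead.
            result[ppl_key] = float("inf")
    return result


single = add_perplexity({"eval_loss": 0.0})
multi = add_perplexity(
    {"eval_wiki_loss": 0.0, "eval_code_loss": 1000.0},
    eval_dataset_names=["wiki", "code"],
)
```

Because perplexity is the exponential of the mean per-token loss, small differences in loss translate into large differences in perplexity; an evaluation loss of 2.0, for example, corresponds to a perplexity of about 7.39.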
