Implementation:Hiyouga LLaMA Factory PT Workflow
| Knowledge Sources | |
|---|---|
| Domains | Pre-training, Language Modeling, Training Workflow |
| Last Updated | 2026-02-06 19:00 GMT |
Overview
run_pt is the end-to-end orchestrator function for causal language model pre-training (continual pre-training).
Description
The run_pt function loads the tokenizer, the template, the dataset (processed at the "pt" stage), and the model, then creates a DataCollatorForLanguageModeling with MLM disabled so that batches are prepared for causal LM training. It initializes a CustomTrainer and drives the complete pipeline: training with loss plotting, evaluation, metric logging, and model card creation. During evaluation, perplexity is computed as the exponential of the evaluation loss. When a dictionary of evaluation datasets is provided, perplexity is computed separately for each dataset.
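To make the "MLM disabled" setting concrete: with masked language modeling turned off, the collator pads each batch and uses a copy of the input IDs as labels, with padding excluded from the loss. The following is a minimal pure-Python sketch of that behavior, not the Hugging Face implementation. The helper name collate_causal_lm and the pad ID of 0 are assumptions made for illustration.

```python
IGNORE_INDEX = -100  # label value that PyTorch cross-entropy ignores by default


def collate_causal_lm(examples, pad_token_id=0):
    """Pad token-ID lists to equal length and build causal LM labels.

    Hypothetical helper that mimics the MLM-disabled behavior on plain lists.
    """
    max_len = max(len(ids) for ids in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ids in examples:
        pad = max_len - len(ids)
        batch["input_ids"].append(list(ids) + [pad_token_id] * pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * pad)
        # Labels mirror the inputs; padded positions are excluded from the loss.
        batch["labels"].append(list(ids) + [IGNORE_INDEX] * pad)
    return batch


batch = collate_causal_lm([[5, 6, 7], [8, 9]])
```

Two caveats on this sketch: the real collator masks every position whose ID equals the pad token (which matters when the pad token doubles as the end-of-sequence token), and the one-position shift between inputs and targets is applied inside the model's loss computation rather than in the collator.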
Usage
Use run_pt when performing continual pre-training or domain-adaptive pre-training on a causal language model. The framework's training dispatcher invokes it when the training stage is set to "pt". It takes no generating arguments, since pre-training does not involve text generation.
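The stage-based dispatch described above can be pictured roughly as follows. This is an illustrative sketch only: the stub functions, the WORKFLOWS registry, and the dispatch helper are hypothetical stand-ins, and the framework's actual dispatcher may be structured differently.

```python
def run_pt_stub(**kwargs):
    # Stand-in for the real pre-training workflow.
    return "pt"


def run_sft_stub(**kwargs):
    # Stand-in for the supervised fine-tuning workflow.
    return "sft"


# Hypothetical registry mapping a stage name to its workflow function.
WORKFLOWS = {"pt": run_pt_stub, "sft": run_sft_stub}


def dispatch(stage, **kwargs):
    """Call the workflow registered for the given training stage."""
    try:
        workflow = WORKFLOWS[stage]
    except KeyError:
        raise ValueError(f"Unknown training stage: {stage}") from None
    return workflow(**kwargs)
```

With this shape, setting the stage to "pt" is the only thing a caller needs to do to route a run to the pre-training workflow.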
Code Reference
Source Location
- Repository: Hiyouga_LLaMA_Factory
- File: src/llamafactory/train/pt/workflow.py
- Lines: 1-101
Signature
def run_pt(
    model_args: "ModelArguments",
    data_args: "DataArguments",
    training_args: "Seq2SeqTrainingArguments",
    finetuning_args: "FinetuningArguments",
    callbacks: Optional[list["TrainerCallback"]] = None,
) -> None
Import
from llamafactory.train.pt.workflow import run_pt
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_args | ModelArguments | Yes | Model configuration including model path and compute dtype |
| data_args | DataArguments | Yes | Dataset configuration for pre-training data loading |
| training_args | Seq2SeqTrainingArguments | Yes | Training hyperparameters; the do_train and do_eval flags control which workflow phases run |
| finetuning_args | FinetuningArguments | Yes | Fine-tuning settings including plot_loss flag |
| callbacks | Optional[list[TrainerCallback]] | No | Additional trainer callbacks |
Outputs
| Name | Type | Description |
|---|---|---|
| (none) | None | Side effects: saves model, metrics (including perplexity), trainer state, loss plots, and model card to output_dir |
Usage Examples
# Typical invocation for continual pre-training.
# model_args, data_args, training_args, and finetuning_args are assumed to be
# already-parsed argument objects of the types listed under Inputs.
from llamafactory.train.pt.workflow import run_pt

run_pt(
    model_args=model_args,
    data_args=data_args,
    training_args=training_args,
    finetuning_args=finetuning_args,
    callbacks=None,
)

# After evaluation, metrics will contain eval_perplexity computed as:
# perplexity = math.exp(eval_loss)
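The perplexity step noted in the comments above, including the per-dataset case, can be sketched as a standalone function. This is an assumption-laden illustration rather than the workflow's own code: the helper name add_perplexity, the eval_<name>_loss key pattern for multiple datasets, and the overflow guard are all choices made here for the example.

```python
import math


def add_perplexity(metrics, eval_names=None):
    """Add perplexity entries derived from evaluation losses.

    eval_names lists the evaluation dataset names when several are used;
    pass None for a single evaluation dataset. Key names are illustrative.
    """
    prefixes = [f"eval_{name}" for name in eval_names] if eval_names else ["eval"]
    for prefix in prefixes:
        try:
            perplexity = math.exp(metrics[f"{prefix}_loss"])
        except OverflowError:
            # A very large loss overflows exp(); report infinite perplexity.
            perplexity = float("inf")
        metrics[f"{prefix}_perplexity"] = perplexity
    return metrics


single = add_perplexity({"eval_loss": 0.0})
multi = add_perplexity(
    {"eval_wiki_loss": math.log(2.0), "eval_code_loss": 1000.0},
    eval_names=["wiki", "code"],
)
```

Because perplexity is the exponential of the mean token-level cross-entropy, a loss of 0 corresponds to a perplexity of 1, and each additional unit of loss multiplies perplexity by e.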
Related Pages
- Hiyouga_LLaMA_Factory_PT_Trainer - The CustomTrainer class used internally for pre-training
- Hiyouga_LLaMA_Factory_SFT_Workflow - Supervised fine-tuning workflow, the next stage after pre-training
- Hiyouga_LLaMA_Factory_Trainer_Utils - Utility functions including create_modelcard_and_push