Implementation:Hiyouga LLaMA Factory Finetuning Args

Knowledge Sources	Hiyouga_LLaMA_Factory
Domains	Machine Learning, LLM Fine-Tuning
Last Updated	2026-02-06 19:00 GMT

Overview

A dataclass hierarchy that defines the configuration for every fine-tuning method supported by LLaMA-Factory.

Description

FinetuningArguments is a composite Python dataclass in the LLaMA-Factory framework that aggregates configuration parameters for every supported fine-tuning technique. It uses multiple inheritance to compose eight specialized argument dataclasses: FreezeArguments, LoraArguments, OFTArguments, RLHFArguments, GaloreArguments, ApolloArguments, BAdamArgument, and SwanLabArguments. Built on Python's standard dataclasses module, it serves as the central configuration hub that determines which fine-tuning strategy, optimizer, loss function, and training technique combination is used throughout the training pipeline.
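The composition pattern described above can be illustrated with a minimal sketch. The class names FreezeArgs, LoraArgs, and ComposedArgs are hypothetical stand-ins rather than the real LLaMA-Factory classes; only the field names and defaults are borrowed from the signature on this page.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class FreezeArgs:  # hypothetical stand-in for FreezeArguments
    freeze_trainable_layers: int = 2


@dataclass
class LoraArgs:  # hypothetical stand-in for LoraArguments
    lora_rank: int = 8
    lora_alpha: Optional[int] = None


@dataclass
class ComposedArgs(LoraArgs, FreezeArgs):
    # Fields declared here come after all inherited fields.
    stage: str = "sft"


# The dataclass machinery walks the bases in reverse MRO order,
# so FreezeArgs fields come first, then LoraArgs, then ComposedArgs.
field_names = [f.name for f in fields(ComposedArgs)]
print(field_names)

# One constructor call accepts any field from any parent.
composed = ComposedArgs(lora_rank=16, stage="dpo")
print(composed.freeze_trainable_layers)
```

Because every parent contributes its fields to a single flat namespace, each field needs a default value and field names must not collide across parents.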

Usage

Import FinetuningArguments when configuring a training run. It is parsed by the argument parser (get_train_args) and passed to trainers, adapter initialization, and callback setup. Individual sub-argument classes are not typically used directly.
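To give a feel for what parsing a flat configuration into an argument dataclass involves, here is a standard-library sketch. It is not the real get_train_args; TuningSketch and from_flat_config are hypothetical names introduced only for illustration, and the learning_rate key stands in for any option that belongs to a different argument group.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class TuningSketch:  # hypothetical stand-in for FinetuningArguments
    stage: str = "sft"
    finetuning_type: str = "lora"
    lora_rank: int = 8


def from_flat_config(cls: type, config: dict[str, Any]) -> tuple[Any, dict[str, Any]]:
    """Build `cls` from the keys it declares and return the leftover keys."""
    names = {f.name for f in fields(cls)}
    known = {k: v for k, v in config.items() if k in names}
    unknown = {k: v for k, v in config.items() if k not in names}
    return cls(**known), unknown


# A flat config mixes fine-tuning options with options for other groups.
config = {"stage": "dpo", "lora_rank": 16, "learning_rate": 5e-6}
tuning_args, leftover = from_flat_config(TuningSketch, config)
print(tuning_args)
print(leftover)
```

Keys that the dataclass does not declare are returned rather than dropped, so a caller could route them to another argument group or report them as unrecognized.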

Code Reference

Source Location

Repository: Hiyouga_LLaMA_Factory
File: src/llamafactory/hparams/finetuning_args.py
Lines: 1-594

Signature

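# Abridged signature: most fields are omitted, and the definitions of
# OFTArguments, GaloreArguments, ApolloArguments, BAdamArgument, and
# SwanLabArguments are not shown here.
from dataclasses import dataclass
from typing import Any, Literal

@dataclass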
class FreezeArguments:
    freeze_trainable_layers: int = 2
    freeze_trainable_modules: str = "all"
    freeze_extra_modules: str | None = None

@dataclass
class LoraArguments:
    additional_target: str | None = None
    lora_alpha: int | None = None
    lora_dropout: float = 0.0
    lora_rank: int = 8
    lora_target: str = "all"
    loraplus_lr_ratio: float | None = None
    use_rslora: bool = False
    use_dora: bool = False
    pissa_init: bool = False
    create_new_adapter: bool = False

@dataclass
class RLHFArguments:
    pref_beta: float = 0.1
    pref_loss: Literal["sigmoid", "hinge", "ipo", "kto_pair", "orpo", "simpo"] = "sigmoid"
    ref_model: str | None = None
    reward_model: str | None = None

@dataclass
class FinetuningArguments(
    SwanLabArguments, BAdamArgument, ApolloArguments,
    GaloreArguments, RLHFArguments, LoraArguments,
    OFTArguments, FreezeArguments,
):
    stage: Literal["pt", "sft", "rm", "ppo", "dpo", "kto"] = "sft"
    finetuning_type: Literal["lora", "oft", "freeze", "full"] = "lora"
    pure_bf16: bool = False
    freeze_vision_tower: bool = True
    freeze_multi_modal_projector: bool = True
    compute_accuracy: bool = False
    plot_loss: bool = False
    def __post_init__(self): ...
    def to_dict(self) -> dict[str, Any]: ...

Import

from llamafactory.hparams import FinetuningArguments

I/O Contract

Inputs

Name	Type	Required	Description
stage	str (Literal)	No (default: "sft")	Training stage: "pt", "sft", "rm", "ppo", "dpo", or "kto"
finetuning_type	str (Literal)	No (default: "lora")	Fine-tuning method: "lora", "oft", "freeze", or "full"
lora_rank	int	No (default: 8)	Rank (intrinsic dimension) of the LoRA update matrices
lora_target	str	No (default: "all")	Comma-separated names of the modules to apply LoRA to; "all" selects all linear modules
pref_beta	float	No (default: 0.1)	Beta parameter for preference loss (DPO/KTO)
pref_loss	str (Literal)	No (default: "sigmoid")	DPO loss type: sigmoid, hinge, ipo, kto_pair, orpo, simpo
reward_model	str or None	No (default: None)	Path to reward model for PPO training
freeze_vision_tower	bool	No (default: True)	Whether to freeze vision tower in MLLM training
pure_bf16	bool	No (default: False)	Train in purely bf16 precision without AMP

Outputs

Name	Type	Description
FinetuningArguments instance	FinetuningArguments	Validated configuration object with all fine-tuning parameters
use_ref_model	bool	Derived flag indicating whether a reference model is needed (set in __post_init__)
to_dict()	dict[str, Any]	Serialized dictionary of all arguments with API keys masked
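The derived outputs listed above can be sketched as follows. SketchArgs is a hypothetical class, not the real FinetuningArguments; the swanlab_api_key field name, the derivation rules, and the placeholder format are assumptions chosen to match the behavior described on this page (lora_alpha auto-set to lora_rank * 2, use_ref_model set in __post_init__, API keys masked in to_dict).

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class SketchArgs:  # hypothetical, not the real FinetuningArguments
    stage: str = "sft"
    pref_loss: str = "sigmoid"
    lora_rank: int = 8
    lora_alpha: Optional[int] = None
    swanlab_api_key: Optional[str] = None  # assumed field name, for illustration

    def __post_init__(self) -> None:
        # Assumed rule: fall back to twice the rank when alpha is unset.
        self.lora_alpha = self.lora_alpha or self.lora_rank * 2
        # Assumed rule: reference-free losses do not need a reference model.
        self.use_ref_model = self.stage == "dpo" and self.pref_loss not in ("orpo", "simpo")

    def to_dict(self) -> dict[str, Any]:
        # Replace any value whose key ends in "api_key" with a placeholder.
        return {
            k: f"<{k.upper()}>" if k.endswith("api_key") else v
            for k, v in asdict(self).items()
        }


sketch = SketchArgs(stage="dpo", lora_rank=16, swanlab_api_key="secret")
print(sketch.use_ref_model)
print(sketch.lora_alpha)
print(sketch.to_dict())
```

In this sketch use_ref_model is a plain attribute assigned after initialization rather than a declared field, so it can be read from the instance but does not appear in the dictionary returned by to_dict().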

Usage Examples

# Creating FinetuningArguments for LoRA DPO training
from llamafactory.hparams import FinetuningArguments

args = FinetuningArguments(
    stage="dpo",
    finetuning_type="lora",
    lora_rank=16,
    lora_target="all",
    pref_beta=0.1,
    pref_loss="sigmoid",
)
print(args.use_ref_model)  # True (sigmoid DPO needs reference model)
print(args.lora_alpha)     # 32 (auto-set to lora_rank * 2)

# Creating FinetuningArguments for full fine-tuning SFT
args_full = FinetuningArguments(
    stage="sft",
    finetuning_type="full",
    pure_bf16=True,
)
print(args_full.to_dict())
