Implementation:Hiyouga LLaMA Factory Finetuning Args
| Knowledge Sources | |
|---|---|
| Domains | Machine Learning, LLM Fine-Tuning |
| Last Updated | 2026-02-06 19:00 GMT |
Overview
A dataclass hierarchy that defines the configuration for every fine-tuning method supported by LLaMA-Factory.
Description
FinetuningArguments is a composite Python dataclass in the LLaMA-Factory framework that aggregates the configuration parameters for every supported fine-tuning technique. It uses multiple inheritance to compose eight specialized argument dataclasses: FreezeArguments, LoraArguments, OFTArguments, RLHFArguments, GaloreArguments, ApolloArguments, BAdamArgument, and SwanLabArguments. Built on Python's standard dataclasses module, it is the central configuration object that determines which combination of fine-tuning strategy, optimizer, loss function, and training technique the training pipeline uses.
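The composition by multiple inheritance described above can be illustrated with a minimal sketch. The `Toy*` classes below are hypothetical stand-ins with one field each, not the real LLaMA-Factory classes; the sketch only shows that a dataclass inheriting from several dataclasses exposes all of their fields on one object.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyFreezeArguments:
    # Hypothetical stand-in for FreezeArguments (one field only).
    freeze_trainable_layers: int = 2


@dataclass
class ToyLoraArguments:
    # Hypothetical stand-in for LoraArguments (one field only).
    lora_rank: int = 8


@dataclass
class ToyFinetuningArguments(ToyLoraArguments, ToyFreezeArguments):
    # Fields declared here are merged with every inherited field.
    finetuning_type: str = "lora"


# One constructor call can set fields that originate in any parent class.
args = ToyFinetuningArguments(lora_rank=16)
field_names = [f.name for f in fields(args)]
print(field_names)
```

Because every inherited field carries a default value, the parents can be combined in any order without hitting the dataclass rule that fields without defaults may not follow fields with defaults.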
Usage
Import FinetuningArguments when configuring a training run. The argument parser (get_train_args) populates an instance, which is then passed to the trainers, adapter initialization, and callback setup. The individual sub-argument classes are not typically used directly.
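As a rough illustration of how a parser can fill such a dataclass from a flat set of hyperparameters, the sketch below routes matching keys into a toy dataclass and returns the rest. This is not the logic of get_train_args; `ToyFinetuningArguments` and `from_flat_config` are hypothetical names, and `learning_rate` is only an example of a key that belongs to some other argument group.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class ToyFinetuningArguments:
    # Hypothetical subset of the real fields, with defaults from the signature.
    stage: str = "sft"
    finetuning_type: str = "lora"
    lora_rank: int = 8


def from_flat_config(
    config: dict[str, Any],
) -> tuple[ToyFinetuningArguments, dict[str, Any]]:
    """Build the dataclass from matching keys and return the unmatched keys."""
    known = {f.name for f in fields(ToyFinetuningArguments)}
    selected = {k: v for k, v in config.items() if k in known}
    remaining = {k: v for k, v in config.items() if k not in known}
    return ToyFinetuningArguments(**selected), remaining


# A flat config mixing fine-tuning keys with an unrelated key.
config = {"stage": "dpo", "lora_rank": 16, "learning_rate": 1e-4}
finetuning_args, other = from_flat_config(config)
print(finetuning_args)
print(other)
```

Keys that are absent from the config fall back to the dataclass defaults, which is why most fields in the I/O contract are optional.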
Code Reference
Source Location
- Repository: Hiyouga_LLaMA_Factory
- File: src/llamafactory/hparams/finetuning_args.py
- Lines: 1-594
Signature
@dataclass
class FreezeArguments:
    freeze_trainable_layers: int = 2
    freeze_trainable_modules: str = "all"
    freeze_extra_modules: str | None = None

@dataclass
class LoraArguments:
    additional_target: str | None = None
    lora_alpha: int | None = None
    lora_dropout: float = 0.0
    lora_rank: int = 8
    lora_target: str = "all"
    loraplus_lr_ratio: float | None = None
    use_rslora: bool = False
    use_dora: bool = False
    pissa_init: bool = False
    create_new_adapter: bool = False

@dataclass
class RLHFArguments:
    pref_beta: float = 0.1
    pref_loss: Literal["sigmoid", "hinge", "ipo", "kto_pair", "orpo", "simpo"] = "sigmoid"
    ref_model: str | None = None
    reward_model: str | None = None

@dataclass
class FinetuningArguments(
    SwanLabArguments, BAdamArgument, ApolloArguments,
    GaloreArguments, RLHFArguments, LoraArguments,
    OFTArguments, FreezeArguments,
):
    stage: Literal["pt", "sft", "rm", "ppo", "dpo", "kto"] = "sft"
    finetuning_type: Literal["lora", "oft", "freeze", "full"] = "lora"
    pure_bf16: bool = False
    freeze_vision_tower: bool = True
    freeze_multi_modal_projector: bool = True
    compute_accuracy: bool = False
    plot_loss: bool = False

    def __post_init__(self): ...
    def to_dict(self) -> dict[str, Any]: ...
Import
from llamafactory.hparams import FinetuningArguments
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| stage | str (Literal) | No (default: "sft") | Training stage: "pt", "sft", "rm", "ppo", "dpo", or "kto" |
| finetuning_type | str (Literal) | No (default: "lora") | Fine-tuning method: "lora", "oft", "freeze", or "full" |
| lora_rank | int | No (default: 8) | Intrinsic dimension for LoRA |
| lora_target | str | No (default: "all") | Comma-separated target modules for LoRA |
| pref_beta | float | No (default: 0.1) | Beta parameter for preference loss (DPO/KTO) |
| pref_loss | str (Literal) | No (default: "sigmoid") | DPO loss type: sigmoid, hinge, ipo, kto_pair, orpo, simpo |
| reward_model | str or None | No (default: None) | Path to reward model for PPO training |
| freeze_vision_tower | bool | No (default: True) | Whether to freeze vision tower in MLLM training |
| pure_bf16 | bool | No (default: False) | Train in purely bf16 precision without AMP |
Outputs
| Name | Type | Description |
|---|---|---|
| FinetuningArguments instance | FinetuningArguments | Validated configuration object with all fine-tuning parameters |
| use_ref_model | bool | Derived flag indicating whether a reference model is needed (set in __post_init__) |
| to_dict() | dict[str, Any] | Serialized dictionary of all arguments with API keys masked |
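The derived outputs in the table above can be sketched with a toy dataclass. This is a minimal illustration, not the library's code: the `lora_alpha = lora_rank * 2` default follows the usage example on this page, while the exact `use_ref_model` rule, the `swanlab_api_key` field name, and the `"<masked>"` placeholder are assumptions made for the sketch.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ToyArgs:
    stage: str = "sft"
    pref_loss: str = "sigmoid"
    lora_rank: int = 8
    lora_alpha: Optional[int] = None
    swanlab_api_key: Optional[str] = None  # field name is an assumption

    def __post_init__(self) -> None:
        # Fill in lora_alpha when the caller did not set it.
        if self.lora_alpha is None:
            self.lora_alpha = self.lora_rank * 2
        # Assumed rule: DPO needs a reference model unless the loss is
        # reference-free (taken here to be "orpo" and "simpo").
        self.use_ref_model = (
            self.stage == "dpo" and self.pref_loss not in ("orpo", "simpo")
        )

    def to_dict(self) -> dict[str, Any]:
        # Replace any value whose key ends with "api_key" by a placeholder.
        return {
            key: "<masked>" if key.endswith("api_key") else value
            for key, value in asdict(self).items()
        }


dpo_args = ToyArgs(stage="dpo", lora_rank=16, swanlab_api_key="secret")
simpo_args = ToyArgs(stage="dpo", pref_loss="simpo")
serialized = dpo_args.to_dict()
print(dpo_args.use_ref_model, dpo_args.lora_alpha)
print(serialized)
```

In this sketch `use_ref_model` is a plain attribute set in `__post_init__` rather than a declared field, so `asdict` does not include it in the serialized dictionary.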
Usage Examples
# Creating FinetuningArguments for LoRA DPO training
from llamafactory.hparams import FinetuningArguments

args = FinetuningArguments(
    stage="dpo",
    finetuning_type="lora",
    lora_rank=16,
    lora_target="all",
    pref_beta=0.1,
    pref_loss="sigmoid",
)
print(args.use_ref_model)  # True (sigmoid DPO needs reference model)
print(args.lora_alpha)     # 32 (auto-set to lora_rank * 2)

# Creating FinetuningArguments for full fine-tuning SFT
args_full = FinetuningArguments(
    stage="sft",
    finetuning_type="full",
    pure_bf16=True,
)
print(args_full.to_dict())