
Implementation:Hiyouga LLaMA Factory Finetuning Args

From Leeroopedia
Revision as of 15:06, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Hiyouga_LLaMA_Factory_Finetuning_Args.md)


Knowledge Sources
Domains Machine Learning, LLM Fine-Tuning
Last Updated 2026-02-06 19:00 GMT

Overview

A dataclass hierarchy that defines the configuration for every fine-tuning method supported by LLaMA-Factory.

Description

FinetuningArguments is a composite Python dataclass in the LLaMA-Factory framework that aggregates the configuration parameters for every supported fine-tuning technique. It uses multiple inheritance to combine eight specialized argument dataclasses: FreezeArguments, LoraArguments, OFTArguments, RLHFArguments, GaloreArguments, ApolloArguments, BAdamArgument, and SwanLabArguments. Built on Python's standard dataclasses module, it is the central configuration object that determines which fine-tuning strategy, optimizer, loss function, and training techniques the pipeline uses.
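The composition pattern described above can be illustrated with a minimal sketch. The class names below (FreezeArgs, LoraArgs, CompositeArgs) are hypothetical, reduced stand-ins and not the LLaMA-Factory classes; the sketch only shows how a dataclass that inherits from several dataclasses ends up with the union of their fields.

```python
from dataclasses import dataclass, fields


@dataclass
class FreezeArgs:  # hypothetical stand-in for FreezeArguments
    freeze_trainable_layers: int = 2


@dataclass
class LoraArgs:  # hypothetical stand-in for LoraArguments
    lora_rank: int = 8
    lora_dropout: float = 0.0


@dataclass
class CompositeArgs(LoraArgs, FreezeArgs):
    # Fields declared here are appended after the inherited ones.
    stage: str = "sft"


# dataclasses gathers inherited fields in reverse MRO order,
# so FreezeArgs fields come first, then LoraArgs, then CompositeArgs.
field_names = [f.name for f in fields(CompositeArgs)]
args = CompositeArgs(lora_rank=16)

print(field_names)
print(args)
```

Every field in such a hierarchy needs a default value, because a field without a default cannot follow an inherited field that has one.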

Usage

Import FinetuningArguments when configuring a training run. The argument parser (get_train_args) populates it from the run configuration, and the resulting object is passed to the trainers, adapter initialization, and callback setup. The individual sub-argument classes are rarely instantiated directly.
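As a rough illustration of turning a flat configuration into an arguments object, the sketch below builds a reduced dataclass from a dictionary and rejects unknown keys. DemoFinetuningArgs and build_args are hypothetical names introduced here; this is not the implementation of get_train_args, only a standard-library approximation of the idea.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class DemoFinetuningArgs:  # hypothetical, reduced stand-in for FinetuningArguments
    stage: str = "sft"
    finetuning_type: str = "lora"
    lora_rank: int = 8


def build_args(config: dict[str, Any]) -> DemoFinetuningArgs:
    """Build the dataclass from a flat dict, rejecting keys it does not define."""
    known = {f.name for f in fields(DemoFinetuningArgs)}
    unknown = sorted(set(config) - known)
    if unknown:
        raise ValueError(f"Unknown fine-tuning arguments: {unknown}")
    return DemoFinetuningArgs(**config)


demo = build_args({"stage": "dpo", "lora_rank": 16})
print(demo)
```

Rejecting unknown keys early is a design choice of this sketch: it turns a misspelled option into an immediate error instead of a silently ignored setting.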

Code Reference

Source Location

Signature

from dataclasses import dataclass
from typing import Any, Literal

@dataclass
class FreezeArguments:
    freeze_trainable_layers: int = 2
    freeze_trainable_modules: str = "all"
    freeze_extra_modules: str | None = None

@dataclass
class LoraArguments:
    additional_target: str | None = None
    lora_alpha: int | None = None
    lora_dropout: float = 0.0
    lora_rank: int = 8
    lora_target: str = "all"
    loraplus_lr_ratio: float | None = None
    use_rslora: bool = False
    use_dora: bool = False
    pissa_init: bool = False
    create_new_adapter: bool = False

@dataclass
class RLHFArguments:
    pref_beta: float = 0.1
    pref_loss: Literal["sigmoid", "hinge", "ipo", "kto_pair", "orpo", "simpo"] = "sigmoid"
    ref_model: str | None = None
    reward_model: str | None = None

@dataclass
class FinetuningArguments(
    SwanLabArguments, BAdamArgument, ApolloArguments,
    GaloreArguments, RLHFArguments, LoraArguments,
    OFTArguments, FreezeArguments,
):
    stage: Literal["pt", "sft", "rm", "ppo", "dpo", "kto"] = "sft"
    finetuning_type: Literal["lora", "oft", "freeze", "full"] = "lora"
    pure_bf16: bool = False
    freeze_vision_tower: bool = True
    freeze_multi_modal_projector: bool = True
    compute_accuracy: bool = False
    plot_loss: bool = False
    def __post_init__(self): ...
    def to_dict(self) -> dict[str, Any]: ...

Import

from llamafactory.hparams import FinetuningArguments

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| stage | str (Literal) | No (default: "sft") | Training stage: "pt", "sft", "rm", "ppo", "dpo", or "kto" |
| finetuning_type | str (Literal) | No (default: "lora") | Fine-tuning method: "lora", "oft", "freeze", or "full" |
| lora_rank | int | No (default: 8) | Intrinsic dimension for LoRA |
| lora_target | str | No (default: "all") | Comma-separated target modules for LoRA |
| pref_beta | float | No (default: 0.1) | Beta parameter for preference loss (DPO/KTO) |
| pref_loss | str (Literal) | No (default: "sigmoid") | DPO loss type: sigmoid, hinge, ipo, kto_pair, orpo, simpo |
| reward_model | str or None | No | Path to reward model for PPO training |
| freeze_vision_tower | bool | No (default: True) | Whether to freeze vision tower in MLLM training |
| pure_bf16 | bool | No (default: False) | Train in purely bf16 precision without AMP |

Outputs

| Name | Type | Description |
|---|---|---|
| FinetuningArguments instance | FinetuningArguments | Validated configuration object with all fine-tuning parameters |
| use_ref_model | bool | Derived flag indicating whether a reference model is needed (set in __post_init__) |
| to_dict() | dict[str, Any] | Serialized dictionary of all arguments with API keys masked |
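
The masking behaviour of to_dict() can be approximated as follows. This is a minimal sketch under stated assumptions: DemoMaskedArgs, the swanlab_api_key field name, the "ends with api_key" rule, and the "<MASKED>" placeholder are all illustrative choices, not details confirmed by this page.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class DemoMaskedArgs:  # hypothetical, reduced stand-in for FinetuningArguments
    stage: str = "sft"
    swanlab_api_key: Optional[str] = None  # field name assumed for illustration

    def to_dict(self) -> dict[str, Any]:
        raw = asdict(self)
        # Replace any set value whose key ends in "api_key" with a placeholder.
        return {
            key: "<MASKED>" if key.endswith("api_key") and value is not None else value
            for key, value in raw.items()
        }


masked = DemoMaskedArgs(swanlab_api_key="secret-token").to_dict()
unset = DemoMaskedArgs().to_dict()

print(masked)
print(unset)
```

Masking at serialization time keeps the real key available to the training process while preventing it from being written to logs or saved configuration files.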

Usage Examples

# Creating FinetuningArguments for LoRA DPO training
from llamafactory.hparams import FinetuningArguments

args = FinetuningArguments(
    stage="dpo",
    finetuning_type="lora",
    lora_rank=16,
    lora_target="all",
    pref_beta=0.1,
    pref_loss="sigmoid",
)
print(args.use_ref_model)  # True (sigmoid DPO needs reference model)
print(args.lora_alpha)     # 32 (auto-set to lora_rank * 2)

# Creating FinetuningArguments for full fine-tuning SFT
args_full = FinetuningArguments(
    stage="sft",
    finetuning_type="full",
    pure_bf16=True,
)
print(args_full.to_dict())
