Implementation:Hiyouga LLaMA Factory Training Args

From Leeroopedia
Revision as of 15:07, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Hiyouga_LLaMA_Factory_Training_Args.md)


Knowledge Sources
Domains: Training Configuration, Distributed Training
Last Updated: 2026-02-06 19:00 GMT

Overview

Extends HuggingFace's Seq2SeqTrainingArguments with LlamaFactory-specific support for Ray distributed training, FP8 mixed precision, and MCA (Megatron-Core Adapter) backends.

Description

This module defines three dataclasses: RayArguments, Fp8Arguments, and TrainingArguments, which composes the first two with a base arguments class through multiple inheritance. RayArguments configures Ray-based distributed training: the worker count, the keyword arguments passed to ray.init(), and the master address and port. Fp8Arguments enables FP8 mixed-precision training via HuggingFace Accelerate, targeting PyTorch 2.7+ on Hopper-architecture GPUs. The base class is selected dynamically: when the environment variable USE_MCA is set, the module uses McaSeq2SeqTrainingArguments from mcore_adapter; otherwise it falls back to HuggingFace's standard Seq2SeqTrainingArguments.
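
The composition described above can be illustrated with a minimal, self-contained sketch. It does not import LlamaFactory, transformers, or mcore_adapter: StandardBaseArguments and McaBaseArguments are hypothetical stand-ins for the two real base classes, select_base is a hypothetical helper, and the set of values treated as "set" for USE_MCA is an assumption rather than the module's confirmed behavior.

```python
import os
from dataclasses import dataclass, fields
from typing import Optional, Union


@dataclass
class StandardBaseArguments:
    """Hypothetical stand-in for HuggingFace's Seq2SeqTrainingArguments."""
    output_dir: str = "./output"
    learning_rate: float = 5e-5


@dataclass
class McaBaseArguments(StandardBaseArguments):
    """Hypothetical stand-in for mcore_adapter's McaSeq2SeqTrainingArguments."""
    mca_enabled: bool = True


def select_base(environ=os.environ):
    # Assumption: "1", "true", and "y" count as enabled; the real check may differ.
    flag = environ.get("USE_MCA", "0").lower()
    return McaBaseArguments if flag in ("1", "true", "y") else StandardBaseArguments


# The base class is chosen once, before TrainingArguments is defined.
BaseTrainingArguments = select_base()


@dataclass
class RayArguments:
    ray_num_workers: int = 1
    ray_init_kwargs: Optional[Union[dict, str]] = None
    master_addr: Optional[str] = None
    master_port: Optional[str] = None


@dataclass
class Fp8Arguments:
    fp8: bool = False
    fp8_backend: str = "auto"
    fp8_enable_fsdp_float8_all_gather: bool = False


@dataclass
class TrainingArguments(Fp8Arguments, RayArguments, BaseTrainingArguments):
    overwrite_output_dir: bool = False


# One constructor call now accepts fields from all three parents.
args = TrainingArguments(output_dir="./run", ray_num_workers=4, fp8=True)
field_names = [f.name for f in fields(TrainingArguments)]
```

Because every field in the mixins has a default, the dataclass machinery can merge them with the base class fields without ordering conflicts, which is what makes this style of multiple inheritance workable.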

Usage

Use this module when configuring training runs that require Ray distributed training, FP8 precision, or Megatron-Core Adapter integration. The TrainingArguments class is instantiated by the hyperparameter parser and passed to all trainers throughout the framework.

Code Reference

Source Location

Signature

@dataclass
class RayArguments:
    ray_num_workers: int = 1
    ray_init_kwargs: dict | str | None = None
    master_addr: str | None = None
    master_port: str | None = None

@dataclass
class Fp8Arguments:
    fp8: bool = False
    fp8_backend: str = "auto"
    fp8_enable_fsdp_float8_all_gather: bool = False

@dataclass
class TrainingArguments(Fp8Arguments, RayArguments, BaseTrainingArguments):
    overwrite_output_dir: bool = False

Import

from llamafactory.hparams.training_args import TrainingArguments, RayArguments, Fp8Arguments

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| ray_num_workers | int | No (default: 1) | Number of workers for Ray distributed training |
| ray_init_kwargs | dict or str or None | No | Arguments passed to ray.init(); accepts JSON string or dict |
| master_addr | str or None | No | Master address for init_process_group in distributed training |
| master_port | str or None | No | Master port for init_process_group in distributed training |
| fp8 | bool | No (default: False) | Enable FP8 mixed precision training via HuggingFace Accelerate |
| fp8_backend | str | No (default: "auto") | FP8 backend selection: auto, torchao, te, or msamp |
| fp8_enable_fsdp_float8_all_gather | bool | No (default: False) | Enable FP8 optimizations for FSDP2 all-gather operations |
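
Since ray_init_kwargs accepts either a JSON string or a dict, some normalization step has to turn the string form into a dict before it reaches ray.init(). The following is a rough sketch of one way to do that, not the module's actual code: RayArgumentsSketch is a hypothetical class, and the strictness of the validation is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class RayArgumentsSketch:
    """Hypothetical, reduced version of RayArguments for illustration only."""
    ray_num_workers: int = 1
    ray_init_kwargs: Optional[Union[dict, str]] = None

    def __post_init__(self):
        # Accept a JSON object passed as a string (for example from a CLI or YAML value).
        if isinstance(self.ray_init_kwargs, str):
            parsed = json.loads(self.ray_init_kwargs)
            if not isinstance(parsed, dict):
                raise ValueError("ray_init_kwargs must decode to a JSON object")
            self.ray_init_kwargs = parsed


from_json = RayArgumentsSketch(ray_init_kwargs='{"num_cpus": 16}')
from_dict = RayArgumentsSketch(ray_init_kwargs={"num_cpus": 16})
unset = RayArgumentsSketch()
```

After construction, both the string and the dict form hold the same dict value, so downstream code only needs to handle a dict or None.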

Outputs

| Name | Type | Description |
|---|---|---|
| TrainingArguments instance | TrainingArguments | Fully composed training arguments dataclass combining Ray, FP8, and base HF training settings |
| use_ray | bool | Auto-detected attribute set in RayArguments.__post_init__ indicating whether Ray is active |
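
The use_ray output is derived rather than passed in. A minimal sketch of that pattern follows, assuming the detection reads an environment variable named USE_RAY; this page does not state the detection mechanism, so the variable name, the accepted values, and the ray_requested helper are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass


def ray_requested(environ=os.environ) -> bool:
    # Assumption: a USE_RAY environment variable drives detection.
    return environ.get("USE_RAY", "0").lower() in ("1", "true", "y")


@dataclass
class RayArgumentsSketch:
    """Hypothetical, reduced version of RayArguments for illustration only."""
    ray_num_workers: int = 1

    def __post_init__(self):
        # use_ray is set here as a plain attribute, not declared as a constructor field.
        self.use_ray = ray_requested()
```

Setting the attribute in __post_init__ keeps it out of the constructor signature, so callers cannot pass a value that disagrees with the environment.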

Usage Examples

# Basic training arguments with default settings
from llamafactory.hparams.training_args import TrainingArguments

args = TrainingArguments(
    output_dir="./output",
    per_device_train_batch_size=4,
    learning_rate=5e-5,
)

# With Ray distributed training
args = TrainingArguments(
    output_dir="./output",
    ray_num_workers=4,
    ray_init_kwargs='{"num_cpus": 16}',
)

# With FP8 mixed precision on Hopper GPUs
args = TrainingArguments(
    output_dir="./output",
    fp8=True,
    fp8_backend="torchao",
)
