Implementation:Hiyouga LLaMA Factory Unsloth Integration

Knowledge Sources	Hiyouga_LLaMA_Factory
Domains	Training Optimization, PEFT
Last Updated	2026-02-06 19:00 GMT

Overview

Provides integration with the Unsloth library for optimized model loading and PEFT (Parameter-Efficient Fine-Tuning) training.

Description

The unsloth module wraps Unsloth's FastLanguageModel API to make fine-tuning faster and more memory-efficient. It exposes three public functions. load_unsloth_pretrained_model loads a base model with Unsloth optimizations, including 4-bit quantization and gradient checkpointing. get_unsloth_peft_model creates a PEFT adapter on top of an Unsloth-loaded model. load_unsloth_peft_model loads an existing PEFT checkpoint for either training or inference. An internal helper, _get_unsloth_kwargs, builds the configuration dictionary shared by the Unsloth loading calls. When Unsloth does not support the model type, load_unsloth_pretrained_model returns None so that the caller can fall back to standard model loading.
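
To make the role of the internal helper more concrete, the following is a minimal sketch of how a shared keyword dictionary could be assembled from the fields named in the I/O contract. The stand-in classes ModelArgsStub and FinetuningArgsStub, the function name build_unsloth_kwargs, the dictionary key names, and the 4096 fallback length are assumptions made for illustration; the real _get_unsloth_kwargs may differ.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class ModelArgsStub:
    """Hypothetical stand-in for ModelArguments (fields taken from the I/O contract)."""
    model_name_or_path: str
    model_max_length: Optional[int] = None
    compute_dtype: Optional[str] = None
    quantization_bit: Optional[int] = None
    hf_hub_token: Optional[str] = None
    trust_remote_code: bool = False


@dataclass
class FinetuningArgsStub:
    """Hypothetical stand-in for FinetuningArguments."""
    finetuning_type: str = "lora"


def build_unsloth_kwargs(
    config: Any,
    model_name_or_path: str,
    model_args: ModelArgsStub,
    finetuning_args: FinetuningArgsStub,
) -> dict[str, Any]:
    """Collect loader settings in one dictionary (key names are assumptions)."""
    return {
        "model_name": model_name_or_path,
        # Assumed default when no maximum length is configured.
        "max_seq_length": model_args.model_max_length or 4096,
        "dtype": model_args.compute_dtype,
        # 4-bit loading is requested only when quantization_bit is exactly 4.
        "load_in_4bit": model_args.quantization_bit == 4,
        "token": model_args.hf_hub_token,
        "trust_remote_code": model_args.trust_remote_code,
        "full_finetuning": finetuning_args.finetuning_type == "full",
        # rope_scaling is optional on the config, so read it defensively.
        "rope_scaling": getattr(config, "rope_scaling", None),
    }


# Illustrative usage with placeholder values.
example_config = SimpleNamespace(rope_scaling={"type": "linear", "factor": 2.0})
example_kwargs = build_unsloth_kwargs(
    example_config,
    "org/model",
    ModelArgsStub("org/model", quantization_bit=4),
    FinetuningArgsStub(finetuning_type="lora"),
)
```

Keeping these settings in one helper means the pretrained-model path and the PEFT-checkpoint path can pass the same options to Unsloth and differ only in the model path they supply.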

Usage

Use this module when use_unsloth is enabled in the model arguments. It is invoked during the model loading pipeline to leverage Unsloth's optimized kernels and training loop enhancements for supported model architectures.

Code Reference

Source Location

Repository: Hiyouga_LLaMA_Factory
File: src/llamafactory/model/model_utils/unsloth.py
Lines: 1-103

Signature

def _get_unsloth_kwargs(
    config: "PretrainedConfig",
    model_name_or_path: str,
    model_args: "ModelArguments",
    finetuning_args: "FinetuningArguments",
) -> dict[str, Any]

def load_unsloth_pretrained_model(
    config: "PretrainedConfig",
    model_args: "ModelArguments",
    finetuning_args: "FinetuningArguments",
) -> Optional["PreTrainedModel"]

def get_unsloth_peft_model(
    model: "PreTrainedModel",
    model_args: "ModelArguments",
    peft_kwargs: dict[str, Any],
) -> "PreTrainedModel"

def load_unsloth_peft_model(
    config: "PretrainedConfig",
    model_args: "ModelArguments",
    finetuning_args: "FinetuningArguments",
    is_trainable: bool,
) -> "PreTrainedModel"

Import

from llamafactory.model.model_utils.unsloth import (
    load_unsloth_pretrained_model,
    get_unsloth_peft_model,
    load_unsloth_peft_model,
)

I/O Contract

Inputs

Name	Type	Required	Description
config	PretrainedConfig	Yes	Model configuration, used to extract rope_scaling and detect model_type
model_args	ModelArguments	Yes	Contains model_name_or_path, model_max_length, compute_dtype, quantization_bit, hf_hub_token, trust_remote_code
finetuning_args	FinetuningArguments	Yes	Contains finetuning_type (full or lora) to configure Unsloth accordingly
is_trainable	bool	Conditional	Required by load_unsloth_peft_model; controls gradient checkpointing and inference mode
model	PreTrainedModel	Conditional	Required by get_unsloth_peft_model; the base model to wrap with PEFT
peft_kwargs	dict[str, Any]	Conditional	Required by get_unsloth_peft_model; PEFT configuration arguments

Outputs

Name	Type	Description
model	Optional[PreTrainedModel]	The Unsloth-optimized model, or None if the model type is unsupported (for load_unsloth_pretrained_model)
model	PreTrainedModel	The PEFT-wrapped model (for get_unsloth_peft_model and load_unsloth_peft_model)
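
The Optional return type of load_unsloth_pretrained_model implies a caller-side fallback. The sketch below shows one way such a caller could be structured. The function load_with_fallback and its two loader callbacks are hypothetical and are not part of LLaMA Factory; the loaders are passed in as callables so that the example runs without Unsloth installed.

```python
from typing import Any, Callable, Optional


def load_with_fallback(
    use_unsloth: bool,
    unsloth_loader: Callable[[], Optional[Any]],
    standard_loader: Callable[[], Any],
) -> Any:
    """Try the Unsloth path first, then fall back to the standard loader."""
    model = unsloth_loader() if use_unsloth else None
    if model is None:
        # Either Unsloth is disabled or it returned None for an unsupported model type.
        model = standard_loader()
    return model


# Illustrative usage with placeholder loaders that return strings instead of models.
supported = load_with_fallback(True, lambda: "unsloth-model", lambda: "standard-model")
unsupported = load_with_fallback(True, lambda: None, lambda: "standard-model")
```

Returning None instead of raising keeps the unsupported-model case on the normal control path, so the caller needs only a single None check.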

Usage Examples

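# Loading a pretrained model with Unsloth optimizations.
# config, model_args, and finetuning_args are assumed to have been built
# earlier in the model loading pipeline.
from llamafactory.model.model_utils.unsloth import load_unsloth_pretrained_model

model = load_unsloth_pretrained_model(config, model_args, finetuning_args)
if model is None:
    # Unsloth does not support this model type: fall back to standard model loading.
    pass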
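
get_unsloth_peft_model receives the base model together with a peft_kwargs dictionary. As a rough sketch of how caller-supplied PEFT settings might be combined with Unsloth-specific ones before a single adapter-creation call, consider the helper below. The name combine_peft_call_kwargs, the Unsloth-specific keys, and the "unsloth" checkpointing value are assumptions for illustration; r, lora_alpha, and target_modules are ordinary LoRA settings used as sample input.

```python
from typing import Any


def combine_peft_call_kwargs(
    model: Any,
    max_seq_length: int,
    peft_kwargs: dict[str, Any],
) -> dict[str, Any]:
    """Merge caller PEFT settings with Unsloth-specific settings (assumed keys)."""
    unsloth_specific = {
        "model": model,
        "max_seq_length": max_seq_length,
        "use_gradient_checkpointing": "unsloth",  # assumed value
    }
    # Reject collisions early instead of silently overriding a caller setting.
    overlap = set(peft_kwargs) & set(unsloth_specific)
    if overlap:
        raise ValueError(f"peft_kwargs must not set: {sorted(overlap)}")
    return {**peft_kwargs, **unsloth_specific}


# Illustrative usage with a placeholder model object.
sample_peft_kwargs = {"r": 8, "lora_alpha": 16, "target_modules": ["q_proj", "v_proj"]}
call_kwargs = combine_peft_call_kwargs("base-model", 2048, sample_peft_kwargs)
```

The explicit overlap check is a design choice of this sketch: it surfaces conflicting keys as a clear error before the merged dictionary reaches the adapter-creation call.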

Related Pages

Hiyouga_LLaMA_Factory_Model_Patcher - Checks use_unsloth flag during model patching
