
Implementation: ggml-org/llama.cpp LoRA Download

From Leeroopedia
Implementation Name: LoRA Download
Doc Type: External Tool Doc
Workflow: LoRA_Adapter_Workflow
Step: 1 of 5
Tool: HuggingFace Hub CLI / huggingface_hub Python library

Overview

Description

This implementation documents the process of downloading LoRA adapter weights from the HuggingFace Hub for use with llama.cpp. LoRA adapters are distributed as small files (typically tens of megabytes) containing the low-rank decomposition matrices (A and B) for each adapted layer. The download process retrieves two essential files: adapter_model.safetensors (or adapter_model.bin) containing the weight tensors, and adapter_config.json containing the adapter metadata.
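The reason these files stay small can be seen from the low-rank structure itself: for each adapted layer, the adapter stores only the two factor matrices rather than a full weight update. A minimal NumPy sketch (shapes and names are illustrative, not llama.cpp code):

```python
import numpy as np

# Illustrative shapes: a rank-16 LoRA adapter on a 4096x4096 projection.
# The adapter stores only A (r x d_in) and B (d_out x r).
d_out, d_in, r, alpha = 4096, 4096, 16, 32
rng = np.random.default_rng(0)

W = rng.standard_normal((d_out, d_in)).astype(np.float32)  # frozen base weight
A = rng.standard_normal((r, d_in)).astype(np.float32)      # adapter "A" matrix
B = np.zeros((d_out, r), dtype=np.float32)                 # adapter "B" matrix

# The effective weight merges the low-rank update, scaled by alpha / r.
W_eff = W + (alpha / r) * (B @ A)

# Adapter storage vs. full weight storage (element counts):
print(A.size + B.size)  # 131072 elements in the adapter
print(W.size)           # 16777216 elements in the base weight
```

At these illustrative shapes the adapter holds roughly 0.8% as many parameters per layer as the base weight, which is why a whole adapter repository fits in tens of megabytes.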

The llama.cpp conversion script (convert_lora_to_gguf.py) can also automatically resolve base model configurations from HuggingFace when the base_model_name_or_path field is set in the adapter configuration.

Usage

Users download LoRA adapters before converting them to GGUF format. This is typically done via the HuggingFace CLI or by cloning a repository with git-lfs.

Code Reference

Source Location: External tool (HuggingFace Hub)
Related Script: convert_lora_to_gguf.py:308-309 (expects adapter_config.json and adapter_model.safetensors)
Import: from huggingface_hub import try_to_load_from_cache (used in convert_lora_to_gguf.py:281)

The conversion script references the downloaded files directly:

# Excerpt from convert_lora_to_gguf.py: paths resolved inside the adapter directory
lora_config = dir_lora / "adapter_config.json"
input_model = dir_lora / "adapter_model.safetensors"
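Because the script resolves both paths directly, a missing file fails only at conversion time. A quick pre-flight check can catch this earlier (a sketch; the directory path is illustrative):

```python
from pathlib import Path

dir_lora = Path("./my-lora-adapter")  # illustrative adapter directory

# The conversion script expects exactly these two files.
expected = ["adapter_config.json", "adapter_model.safetensors"]
missing = [name for name in expected if not (dir_lora / name).exists()]
if missing:
    print("missing adapter files:", missing)
```

If the repository ships only the legacy adapter_model.bin, the same check can be adapted to accept either weight filename.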

I/O Contract

Direction | Name | Type | Description
Input | HuggingFace repo ID | string | Repository identifier in the format user/model-lora
Output | adapter_model.safetensors | binary file | Serialized LoRA weight tensors (A and B matrices) in safetensors format
Output | adapter_config.json | JSON file | Adapter metadata including rank, alpha, base model path, and target modules
Output | adapter_model.bin (alternative) | binary file | PyTorch-serialized LoRA weights (legacy format, also supported)

adapter_config.json structure:

{
    "r": 16,
    "lora_alpha": 32,
    "base_model_name_or_path": "meta-llama/Llama-3.2-1B-Instruct",
    "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj"],
    "bias": "none",
    "task_type": "CAUSAL_LM"
}
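After downloading, the fields above can be read back for a quick sanity check before conversion (a sketch; the local path is illustrative):

```python
import json
from pathlib import Path

adapter_dir = Path("./my-lora-adapter")  # illustrative local path
config_path = adapter_dir / "adapter_config.json"

if config_path.exists():
    config = json.loads(config_path.read_text())
    # The conversion step depends on these fields being present.
    print("rank:", config.get("r"))
    print("alpha:", config.get("lora_alpha"))
    print("base model:", config.get("base_model_name_or_path"))
    print("target modules:", config.get("target_modules"))
else:
    print("adapter_config.json not found in", adapter_dir)
```

A populated base_model_name_or_path is what allows convert_lora_to_gguf.py to resolve the base model configuration from HuggingFace automatically.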

Usage Examples

Using HuggingFace CLI:

# Install huggingface-hub CLI
pip install huggingface-hub

# Download a LoRA adapter repository
huggingface-cli download user/my-lora-adapter --local-dir ./my-lora-adapter

# Verify expected files exist
ls ./my-lora-adapter/
# adapter_config.json  adapter_model.safetensors

Using git with LFS:

# Clone the adapter repository with git-lfs
git lfs install
git clone https://huggingface.co/user/my-lora-adapter

# The directory now contains the adapter files
ls ./my-lora-adapter/
# adapter_config.json  adapter_model.safetensors

Using Python huggingface_hub:

from huggingface_hub import snapshot_download

# Download the adapter to a local cache directory
local_dir = snapshot_download(repo_id="user/my-lora-adapter")
print(f"Adapter downloaded to: {local_dir}")
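When only the two adapter files are needed, hf_hub_download can fetch them individually instead of snapshotting the whole repository (a sketch; the repo ID is a placeholder):

```python
from huggingface_hub import hf_hub_download

# Fetch only the two files the conversion script expects.
# "user/my-lora-adapter" is a placeholder repo ID.
config_path = hf_hub_download(repo_id="user/my-lora-adapter",
                              filename="adapter_config.json")
weights_path = hf_hub_download(repo_id="user/my-lora-adapter",
                               filename="adapter_model.safetensors")
print(config_path, weights_path)
```

Both functions cache downloads under the HuggingFace cache directory, so repeated calls do not re-fetch unchanged files.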
