Implementation:Ggml org Llama cpp Download

Knowledge Sources	Ggml_org_Llama_cpp
Domains	Networking, Model_Management
Last Updated	2026-02-15 00:00 GMT

Overview

Implements model downloading from HuggingFace Hub and Docker registries with caching, manifest management, and progress reporting.

Description

This file is part of the llama.cpp common library. It implements the model download workflow behind --hf-repo and model URLs, so users can fetch models directly instead of downloading them by hand. common_get_hf_file resolves a HuggingFace repo tag to a specific GGUF file through the Ollama-compatible HF API. Model files are downloaded into a local cache directory using atomic writes: data goes to a .tmp file that is renamed once the transfer completes. Split GGUF models are handled by reading metadata to discover the additional shards. The ProgressBar class renders terminal download progress with speed and ETA. common_docker_resolve_model adds Docker registry support by pulling models from OCI registries. All HTTP operations go through cpp-httplib when LLAMA_USE_HTTPLIB is enabled, and support bearer token authentication and custom headers.
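The write-to-.tmp-then-rename pattern mentioned above can be sketched as follows. This is a minimal illustration, not the code from download.cpp: the helper name atomic_write_sketch is hypothetical, and the real implementation streams data from an HTTP response rather than writing a string held in memory.

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of download.cpp): stage `content` in
// "<final_path>.tmp" and rename it into place only after a successful write,
// so readers never observe a partially written file at `final_path`.
inline void atomic_write_sketch(const std::string & final_path, const std::string & content) {
    const std::string tmp_path = final_path + ".tmp";
    bool ok = false;
    {
        std::ofstream out(tmp_path, std::ios::binary | std::ios::trunc);
        if (out) {
            out.write(content.data(), static_cast<std::streamsize>(content.size()));
            out.flush();
            ok = out.good();
        }
    } // the stream is closed here, before the rename

    if (!ok) {
        std::error_code ec;
        fs::remove(tmp_path, ec); // best-effort cleanup of the partial file
        throw std::runtime_error("failed to write " + tmp_path);
    }
    // std::filesystem::rename replaces an existing regular file at the target.
    fs::rename(tmp_path, final_path);
}
```

If the process is interrupted mid-download, only the .tmp file is left behind, which a later run can overwrite or discard.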

Usage

Used when a user specifies a model by HuggingFace repo (--hf-repo) or by URL (--model-url). The download system then resolves, fetches, and caches the model files automatically.
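For split models, "fetches the model files" means more than one file. As a rough sketch, assuming the shard count has already been read from the first file's metadata and that shards follow the llama.cpp split naming pattern <prefix>-NNNNN-of-NNNNN.gguf, the full list of shard names could be derived as below. The function name shard_names_sketch is hypothetical and does not appear in download.cpp.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: build the file names of all shards of a split GGUF
// model, given the common prefix and the total shard count.
// Assumes the "<prefix>-00001-of-00003.gguf" naming convention.
inline std::vector<std::string> shard_names_sketch(const std::string & prefix, int count) {
    std::vector<std::string> names;
    for (int i = 1; i <= count; ++i) {
        char buf[512];
        std::snprintf(buf, sizeof(buf), "%s-%05d-of-%05d.gguf", prefix.c_str(), i, count);
        names.emplace_back(buf);
    }
    return names;
}
```

Each returned name would then be downloaded and cached in the same way as the first shard.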

Code Reference

Source Location

Repository: Ggml_org_Llama_cpp
File: common/download.cpp
Lines: 1-820

Signature

// Repo name validation
static bool validate_repo_name(const std::string & repo);

// Manifest and caching
static std::string get_manifest_path(const std::string & repo, const std::string & tag);
static std::string read_file(const std::string & fname);
static void write_file(const std::string & fname, const std::string & content);
static void write_etag(const std::string & path, const std::string & etag);
static std::string read_etag(const std::string & path);

// Public API
std::pair<std::string, std::string> common_download_split_repo_tag(
    const std::string & hf_repo_with_tag);

// HuggingFace file resolution
std::string common_get_hf_file(const std::string & repo, ...);

// Docker registry resolution
std::string common_docker_resolve_model(const std::string & model, ...);
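The read_etag and write_etag helpers listed above suggest that cache validity is tracked through HTTP ETags. The following is a minimal sketch of how such a check might work. It assumes the ETag is kept in a sidecar file named "<path>.etag"; that layout, and every function name ending in _sketch, is an illustration rather than the documented behavior of download.cpp.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Hypothetical: persist the server-provided ETag next to the cached file.
inline void write_etag_sketch(const std::string & path, const std::string & etag) {
    std::ofstream out(path + ".etag", std::ios::trunc);
    out << etag;
}

// Hypothetical: return the stored ETag, or an empty string if none exists.
inline std::string read_etag_sketch(const std::string & path) {
    std::ifstream in(path + ".etag");
    std::string etag;
    if (in) {
        std::getline(in, etag);
    }
    return etag;
}

// A download is needed when the cached file is missing, when no ETag was
// recorded for it, or when the recorded ETag differs from the remote one.
inline bool needs_download_sketch(const std::string & path, const std::string & remote_etag) {
    if (!std::filesystem::exists(path)) {
        return true;
    }
    const std::string local_etag = read_etag_sketch(path);
    return local_etag.empty() || local_etag != remote_etag;
}
```

Comparing ETags lets a client skip the transfer when the remote file is unchanged, at the cost of one small metadata request.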

Import

#include "download.h"

I/O Contract

Inputs

Name	Type	Required	Description
repo	std::string	Yes	HuggingFace repository in "owner/repo" format, optionally with ":tag" suffix
tag	std::string	No	Model tag/revision (default: "latest")
token	std::string	No	Bearer token for authenticated downloads
cache_dir	std::string	No	Local directory for caching downloaded files (default: system cache)
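To make the repo input format concrete, here is a small sketch of parsing "owner/repo[:tag]" and checking its shape. Both functions are hypothetical stand-ins: the character rules in looks_like_repo_sketch are an assumption, and the actual validate_repo_name and common_download_split_repo_tag may behave differently in edge cases.

```cpp
#include <cctype>
#include <string>
#include <utility>

// Hypothetical: split "owner/repo:tag" into {repo, tag}.
// A missing or empty tag falls back to "latest", matching the table above.
inline std::pair<std::string, std::string> split_repo_tag_sketch(const std::string & spec) {
    const auto pos = spec.find(':');
    if (pos == std::string::npos) {
        return {spec, "latest"};
    }
    std::string tag = spec.substr(pos + 1);
    if (tag.empty()) {
        tag = "latest";
    }
    return {spec.substr(0, pos), tag};
}

// Hypothetical validation rule (assumption): exactly one '/', non-empty owner
// and name, and only alphanumerics, '-', '_' and '.' in either part.
inline bool looks_like_repo_sketch(const std::string & repo) {
    const auto slash = repo.find('/');
    if (slash == std::string::npos || slash == 0 || slash + 1 == repo.size()) {
        return false;
    }
    if (repo.find('/', slash + 1) != std::string::npos) {
        return false;
    }
    for (unsigned char c : repo) {
        if (!(std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '/')) {
            return false;
        }
    }
    return true;
}
```

Validating the repo name before building a request URL keeps malformed or unexpected input out of the HTTP path.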

Outputs

Name	Type	Description
file_path	std::string	Local filesystem path to the downloaded and cached model file
progress	terminal output	Real-time download progress bar with speed and ETA
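The progress output combines three derived quantities: percent complete, average speed, and estimated time remaining. The sketch below shows one way to compute and format them. It is not the ProgressBar class from download.cpp; the function name, bar width, and output layout are all illustrative assumptions.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: format one progress line from the bytes downloaded so
// far, the total size in bytes, and the elapsed time in seconds.
inline std::string format_progress_sketch(std::uint64_t done, std::uint64_t total, double elapsed_s) {
    const double pct   = total > 0 ? 100.0 * static_cast<double>(done) / static_cast<double>(total) : 0.0;
    const double speed = elapsed_s > 0.0 ? static_cast<double>(done) / elapsed_s : 0.0; // bytes per second
    const double eta   = (speed > 0.0 && total > done)
                             ? static_cast<double>(total - done) / speed
                             : 0.0; // seconds remaining at the average speed

    const int width  = 20;
    const int filled = static_cast<int>(pct / 100.0 * width);
    std::string bar(static_cast<std::size_t>(filled), '#');
    bar.resize(width, '-'); // pad (or clamp) the bar to a fixed width

    char buf[128];
    std::snprintf(buf, sizeof(buf), "[%s] %5.1f%% %6.2f MB/s ETA %4.0fs",
                  bar.c_str(), pct, speed / 1e6, eta);
    return buf;
}
```

A terminal progress bar would typically print such a line prefixed with a carriage return so each update overwrites the previous one.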

Usage Examples

#include "download.h"
#include "common.h"

// Split a repo:tag string
auto [repo, tag] = common_download_split_repo_tag("user/model:Q4_K_M");
// repo = "user/model", tag = "Q4_K_M"

// Download is typically triggered via CLI args:
// llama-cli --hf-repo user/model -m model.gguf
// The arg parser calls the download functions internally.

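// Or programmatically (the argument list here is illustrative only;
// check download.h for the exact signature and return type):
// std::string local_path = common_get_hf_file(repo, tag, cache_dir, token);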

Related Pages

Principle:Ggml_org_Llama_cpp_Model_Management
