Implementation:Ggml org Llama cpp Download
| Knowledge Sources | |
|---|---|
| Domains | Networking, Model_Management |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Implements model downloading from HuggingFace Hub and Docker registries with caching, manifest management, and progress reporting.
Description
This file is part of the llama.cpp common library. It implements the model download workflow behind --hf-repo and model URLs, so users can fetch models directly instead of downloading them by hand. common_get_hf_file resolves a HuggingFace repo tag to a specific GGUF file through HuggingFace's Ollama-compatible API. Model files are downloaded into a local cache directory using atomic writes: data is written to a .tmp file, which is then renamed to the final path. For split GGUF models, the code reads the metadata to discover the additional shards. The ProgressBar class reports download progress in the terminal, including speed and ETA. common_docker_resolve_model adds Docker registry support by pulling models from OCI registries. When LLAMA_USE_HTTPLIB is enabled, all HTTP operations go through cpp-httplib, with support for bearer token authentication and custom headers.
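The write-to-.tmp-then-rename pattern mentioned above can be illustrated with a minimal sketch. The helper name atomic_write_sketch is hypothetical and is not part of download.cpp; the sketch also assumes a POSIX-like filesystem, where renaming over an existing file replaces it in one step.

```cpp
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not the llama.cpp API): write `content` to
// "<path>.tmp" first, then rename it over `path`. On POSIX-like systems a
// reader then sees either the previous file or the complete new one.
inline void atomic_write_sketch(const std::string & path, const std::string & content) {
    const std::string tmp = path + ".tmp";
    bool ok = false;
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        if (out) {
            out.write(content.data(), static_cast<std::streamsize>(content.size()));
            out.flush();
            ok = out.good();
        }
    } // the stream is closed here, before the rename
    if (!ok) {
        std::error_code ec;
        fs::remove(tmp, ec); // best-effort cleanup of the partial file
        throw std::runtime_error("failed to write " + tmp);
    }
    fs::rename(tmp, path); // publish the finished file under its final name
}
```

An interrupted download then leaves at most a stray .tmp file behind, so a truncated file never appears under the final name in the cache.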
Usage
Used when a user specifies a model via a HuggingFace repo (--hf-repo) or an HTTP URL (--model-url). The download system resolves, fetches, and caches the model files automatically.
Code Reference
Source Location
- Repository: Ggml_org_Llama_cpp
- File: common/download.cpp
- Lines: 1-820
Signature
// Repo name validation
static bool validate_repo_name(const std::string & repo);
// Manifest and caching
static std::string get_manifest_path(const std::string & repo, const std::string & tag);
static std::string read_file(const std::string & fname);
static void write_file(const std::string & fname, const std::string & content);
static void write_etag(const std::string & path, const std::string & etag);
static std::string read_etag(const std::string & path);
// Public API
std::pair<std::string, std::string> common_download_split_repo_tag(
    const std::string & hf_repo_with_tag);
// HuggingFace file resolution
std::string common_get_hf_file(const std::string & repo, ...);
// Docker registry resolution
std::string common_docker_resolve_model(const std::string & model, ...);
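The write_etag / read_etag pair listed above suggests that cached files are revalidated against the server's ETag. The following is a rough sketch of how such a check could work, not the actual implementation: the "<path>.etag" sidecar layout, the freshness rule, and all function names ending in _sketch are assumptions made for illustration.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

// Assumed sidecar layout (hypothetical): the ETag recorded for "<path>"
// lives in "<path>.etag".
inline std::string etag_path_sketch(const std::string & path) {
    return path + ".etag";
}

// Record the ETag the server reported for the file at `path`.
inline void write_etag_sketch(const std::string & path, const std::string & etag) {
    std::ofstream out(etag_path_sketch(path), std::ios::trunc);
    out << etag;
}

// Return the recorded ETag, or an empty string if none was stored.
inline std::string read_etag_sketch(const std::string & path) {
    std::ifstream in(etag_path_sketch(path));
    std::string etag;
    std::getline(in, etag);
    return etag;
}

// Assumed rule: a cached file can be reused only if it exists and its
// recorded ETag equals the one the server currently reports.
inline bool cache_is_fresh_sketch(const std::string & path, const std::string & remote_etag) {
    return std::filesystem::exists(path)
        && !remote_etag.empty()
        && read_etag_sketch(path) == remote_etag;
}
```

Under these assumptions, a caller would re-download only when cache_is_fresh_sketch returns false, and would call write_etag_sketch after a successful download.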
Import
#include "download.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| repo | std::string | Yes | HuggingFace repository in "owner/repo" format, optionally with ":tag" suffix |
| tag | std::string | No | Model tag/revision (default: "latest") |
| token | std::string | No | Bearer token for authenticated downloads |
| cache_dir | std::string | No | Local directory for caching downloaded files (default: system cache) |
Outputs
| Name | Type | Description |
|---|---|---|
| file_path | std::string | Local filesystem path to the downloaded and cached model file |
| progress | terminal output | Real-time download progress bar with speed and ETA |
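To make the "speed and ETA" output concrete, here is a small sketch of how those two figures can be derived from the byte counts and elapsed time. It is not the ProgressBar class itself; format_progress_sketch and the output layout are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical formatter: derives percentage, average speed, and ETA from
// the bytes received so far, the total size, and the elapsed seconds.
inline std::string format_progress_sketch(std::uint64_t done, std::uint64_t total, double elapsed_sec) {
    const double pct = total > 0
        ? 100.0 * static_cast<double>(done) / static_cast<double>(total)
        : 0.0;
    // Average throughput in bytes per second since the download started.
    const double speed = elapsed_sec > 0.0 ? static_cast<double>(done) / elapsed_sec : 0.0;
    // Remaining bytes divided by throughput gives the estimated seconds left.
    const double eta = (speed > 0.0 && total > done)
        ? static_cast<double>(total - done) / speed
        : 0.0;
    char buf[128];
    std::snprintf(buf, sizeof(buf), "%3.0f%% | %.1f MiB/s | ETA %.0fs",
                  pct, speed / (1024.0 * 1024.0), eta);
    return buf;
}
```

For example, 50 MiB received out of 100 MiB after 10 seconds yields " 50% | 5.0 MiB/s | ETA 10s". A real progress bar would typically smooth the speed over a recent window rather than use the overall average.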
Usage Examples
#include "download.h"
#include "common.h"
// Split a repo:tag string
auto [repo, tag] = common_download_split_repo_tag("user/model:Q4_K_M");
// repo = "user/model", tag = "Q4_K_M"
// Download is typically triggered via CLI args:
// llama-cli --hf-repo user/model -m model.gguf
// The arg parser calls the download functions internally.
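// Or programmatically (the argument list below is illustrative; the
// signature above elides it, so check download.h for the exact parameters):
// std::string local_path = common_get_hf_file(repo, tag, cache_dir, token);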
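The repo:tag split shown in the example above can be approximated with a few lines of standard C++. This is a hypothetical re-implementation for illustration, named split_repo_tag_sketch; the real common_download_split_repo_tag may handle more cases. The fallback to "latest" follows the default listed in the Inputs table.

```cpp
#include <string>
#include <utility>

// Hypothetical sketch: split "owner/repo:tag" at the first ':' and fall
// back to the tag "latest" when no tag is given.
inline std::pair<std::string, std::string> split_repo_tag_sketch(const std::string & repo_with_tag) {
    const auto pos = repo_with_tag.find(':');
    if (pos == std::string::npos) {
        return {repo_with_tag, "latest"};
    }
    return {repo_with_tag.substr(0, pos), repo_with_tag.substr(pos + 1)};
}
```

With this sketch, "user/model:Q4_K_M" splits into "user/model" and "Q4_K_M", while a bare "user/model" keeps the repo name and gets the tag "latest".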