Implementation:Ggml org Llama cpp Common Utils

Knowledge Sources	Ggml_org_Llama_cpp
Domains	Utilities, Infrastructure
Last Updated	2026-02-15 00:00 GMT

Overview

Implements the shared utility functions and infrastructure used across all llama.cpp example programs and tools.

Description

This file implements the common library shared by llama.cpp's example programs and tools, providing foundational utilities that nearly every tool depends on. It covers CPU detection (physical core count, math-capable cores via cpuid), time measurement (the common_time_meas RAII timer), and model and context initialization (common_init_from_params, which loads the model, creates the context, applies LoRA adapters, and sets up threadpools). It also provides tokenization utilities (common_tokenize, common_token_to_piece, common_detokenize), batch management, control vector loading, GGUF metadata reading, string formatting, and filesystem helpers. Platform-specific details for Linux, macOS, and Windows are handled inside the file.
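
The RAII timer pattern behind common_time_meas can be illustrated with a small standalone sketch. This is not the library's code: the name scoped_time_meas, the use of std::chrono::steady_clock, and the microsecond unit are assumptions chosen for illustration. The idea is that the constructor records a start time and the destructor adds the elapsed time to a caller-owned accumulator, unless timing was disabled.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for common_time_meas (not the llama.cpp implementation).
// On destruction, adds the time spent in the enclosing scope to `t_acc`.
struct scoped_time_meas {
    explicit scoped_time_meas(int64_t & t_acc, bool disable = false)
        : t_start_us(disable ? -1 : now_us()), t_acc(t_acc) {}

    ~scoped_time_meas() {
        // A negative start time marks a disabled timer: leave the accumulator alone.
        if (t_start_us >= 0) {
            t_acc += now_us() - t_start_us;
        }
    }

    // The timer holds a reference to the accumulator, so copying is disallowed.
    scoped_time_meas(const scoped_time_meas &) = delete;
    scoped_time_meas & operator=(const scoped_time_meas &) = delete;

    // Monotonic clock reading in microseconds (assumed unit).
    static int64_t now_us() {
        using namespace std::chrono;
        return duration_cast<microseconds>(steady_clock::now().time_since_epoch()).count();
    }

    const int64_t t_start_us;
    int64_t &     t_acc;
};
```

A caller would declare an int64_t accumulator, open a scope around the work to be measured, and construct a scoped_time_meas at the top of that scope. Passing disable = true keeps the call site unchanged while skipping the measurement.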

Usage

Include common.h and link against the common library to use these utilities for model loading, tokenization, and system configuration.

Code Reference

Source Location

Repository: Ggml_org_Llama_cpp
File: common/common.cpp
Lines: 1-1790

Signature

// RAII timer
common_time_meas::common_time_meas(int64_t & t_acc, bool disable = false);
common_time_meas::~common_time_meas();

// CPU utilities
int32_t cpu_get_num_physical_cores();
int32_t cpu_get_num_math();

// Model/context initialization
common_init_result common_init_from_params(common_params & params);

// Tokenization
llama_tokens common_tokenize(const struct llama_context * ctx,
                              const std::string & text,
                              bool add_special,
                              bool parse_special = false);
std::string common_token_to_piece(const struct llama_context * ctx,
                                   llama_token token,
                                   bool special = true);
std::string common_detokenize(const struct llama_context * ctx,
                               const llama_tokens & tokens,
                               bool special = true);

Import

#include "common.h"

I/O Contract

Inputs

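Name	Type	Required	Description
params	common_params &	Yes	Configuration structure containing model path, context settings, LoRA adapters, etc.
ctx	const llama_context *	Yes	Llama context pointer for the tokenization functions
text	const std::string &	Yes	Input text for common_tokenize
add_special	bool	Yes	Whether to add BOS/EOS special tokens during tokenization
parse_special	bool	No	Whether to parse special-token text in the input as special tokens (default: false)
token	llama_token	Yes	Single token ID for common_token_to_piece
tokens	const llama_tokens &	Yes	Token IDs for common_detokenize
special	bool	No	Whether to render special tokens in the output text (default: true)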

Outputs

Name	Type	Description
common_init_result	struct	Contains loaded model pointer, context pointer, and LoRA adapter info
llama_tokens	vector<llama_token>	Tokenized representation of input text
std::string	string	Text piece for a single token (common_token_to_piece) or detokenized text for a token sequence (common_detokenize)
int32_t	integer	CPU core counts (physical, math-capable)

Usage Examples

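#include "common.h"

#include <cstdio>
#include <string>

int main() {
    // Initialize model and context from params
    common_params params;
    params.model.path = "model.gguf";
    auto init_result = common_init_from_params(params);
    // Assumption: `context` is accessed as a member, as in the signature above.
    // If your llama.cpp revision stores it in a smart pointer, pass `.get()` below.
    if (!init_result.context) {
        fprintf(stderr, "failed to load model or create context\n");
        return 1;
    }

    // Tokenize input text (add_special = true)
    auto tokens = common_tokenize(init_result.context, "Hello, world!", true);

    // Detokenize back to text
    std::string text = common_detokenize(init_result.context, tokens);
    printf("%zu tokens -> \"%s\"\n", tokens.size(), text.c_str());

    // Get CPU information
    int cores = cpu_get_num_physical_cores();
    printf("Physical cores: %d\n", cores);
    return 0;
}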
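
A physical core count like the one printed above is typically used to choose a default thread count. When no platform-specific detection is available, one portable fallback is to estimate it from the logical thread count. The sketch below is an illustration only: the helper names and the assumption that machines with more than four logical threads use 2-way SMT are hypothetical, and it is not presented as what cpu_get_num_physical_cores does internally.

```cpp
#include <cstdint>
#include <thread>

// Hypothetical helper: estimate physical cores from a logical thread count.
// Assumption: systems with more than 4 logical threads run 2 threads per core.
int32_t estimate_physical_cores_from(unsigned int n_logical) {
    if (n_logical == 0) {
        return 4; // count unknown: fall back to a conservative default
    }
    if (n_logical <= 4) {
        return static_cast<int32_t>(n_logical);
    }
    return static_cast<int32_t>(n_logical / 2);
}

// Hypothetical helper: apply the estimate to the current machine.
// std::thread::hardware_concurrency() may return 0 when the value is not computable.
int32_t estimate_physical_cores() {
    return estimate_physical_cores_from(std::thread::hardware_concurrency());
}
```

The estimate is deliberately coarse. It undercounts on CPUs without SMT and miscounts on hybrid designs that mix core types, which is why platform-specific detection is preferable where it exists.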

Related Pages

Principle:Ggml_org_Llama_cpp_Common_Infrastructure
