Implementation:Ggml org Llama cpp Adapter Header
| Knowledge Sources | |
|---|---|
| Domains | LoRA, Control_Vector |
| Last Updated | 2026-02-15 00:00 GMT |
Overview
Declares data structures for control vector and LoRA adapter support in llama.cpp.
Description
This header defines three key structs: `llama_adapter_cvec` manages per-layer control vector tensors with methods to retrieve and apply them to hidden states; `llama_adapter_lora_weight` holds low-rank A/B tensor pairs and computes scaling based on rank and alpha; `llama_adapter_lora` maps tensor names to LoRA weight pairs and stores metadata including activated LoRA (aLoRA) invocation tokens. A type alias `llama_adapter_loras` maps adapter pointers to their scaling factors.
Usage
Include this header when implementing or modifying adapter functionality. It defines the core adapter abstraction layer that lets both control vector steering and LoRA adapters be applied during inference without modifying the base model weights.
Code Reference
Source Location
- Repository: Ggml_org_Llama_cpp
- File: src/llama-adapter.h
- Lines: 1-86
Signature
struct llama_adapter_cvec {
    ggml_tensor * tensor_for(int il) const;
    ggml_tensor * apply_to(ggml_context * ctx, ggml_tensor * cur, int il) const;
    bool apply(const llama_model & model, const float * data, size_t len,
               int32_t n_embd, int32_t il_start, int32_t il_end);
};

struct llama_adapter_lora_weight {
    ggml_tensor * a = nullptr;
    ggml_tensor * b = nullptr;
    float get_scale(float alpha, float adapter_scale) const;
};

struct llama_adapter_lora {
    std::unordered_map<std::string, llama_adapter_lora_weight> ab_map;
    float alpha;
    std::vector<llama_token> alora_invocation_tokens;
    llama_adapter_lora_weight * get_weight(ggml_tensor * w);
    uint32_t get_n_nodes() const;
};

using llama_adapter_loras = std::unordered_map<llama_adapter_lora *, float>;
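To illustrate how the `llama_adapter_loras` alias might be consumed, the following is a minimal, self-contained sketch rather than llama.cpp code. The `mock_` types, the scalar stand-ins for the A/B tensors, the string key passed to `get_weight` (the real method takes a `ggml_tensor *`), and the helper `apply_loras` are all assumptions made for illustration.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical scalar stand-in for a LoRA A/B pair; real weights are ggml tensors.
struct mock_weight {
    float a;
    float b;
    float rank;
};

// Hypothetical stand-in for llama_adapter_lora.
struct mock_adapter {
    std::unordered_map<std::string, mock_weight> ab_map;
    float alpha = 0.0f;

    // Returns nullptr when the adapter has no pair for this tensor name.
    const mock_weight * get_weight(const std::string & name) const {
        auto it = ab_map.find(name);
        return it == ab_map.end() ? nullptr : &it->second;
    }
};

// Same shape as llama_adapter_loras: adapter pointer -> scaling factor.
using mock_loras = std::unordered_map<mock_adapter *, float>;

// Computes w*x plus the scaled low-rank delta of every active adapter.
float apply_loras(const mock_loras & loras, const std::string & name, float w, float x) {
    float y = w * x;
    for (const auto & kv : loras) {
        const mock_adapter * adapter = kv.first;
        const float adapter_scale = kv.second;
        const mock_weight * lw = adapter->get_weight(name);
        if (lw == nullptr) {
            continue; // this adapter does not touch the tensor
        }
        const float scale = adapter->alpha != 0.0f
            ? adapter_scale * adapter->alpha / lw->rank
            : adapter_scale;
        y += scale * lw->b * (lw->a * x);
    }
    return y;
}
```

Keying the map by adapter pointer allows several adapters to be active at once, each with its own scaling factor, while adapters that lack a pair for a given tensor are simply skipped.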
Import
#include "llama-adapter.h"
// Dependencies (pulled in by llama-adapter.h itself):
#include "llama.h"
#include "ggml-cpp.h"
#include <string>
#include <unordered_map>
#include <vector>
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
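| model | const llama_model & | Yes | Model passed to llama_adapter_cvec::apply |
| ctx | ggml_context * | Yes | ggml context used by apply_to |
| cur | ggml_tensor * | Yes | Current hidden state tensor passed to apply_to |
| il | int | Yes | Layer index for the tensor_for and apply_to methods |
| data | const float * | Yes | Control vector data for llama_adapter_cvec::apply |
| len | size_t | Yes | Number of float values in data |
| n_embd | int32_t | Yes | Embedding dimension for control vector application |
| il_start / il_end | int32_t | Yes | Layer range for control vector application |
| alpha | float | Yes | LoRA alpha parameter passed to get_scale |
| adapter_scale | float | Yes | Additional adapter scaling factor passed to get_scale |
| w | ggml_tensor * | Yes | Base model tensor whose LoRA weight pair get_weight looks up |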
Outputs
| Name | Type | Description |
|---|---|---|
| tensor_for return | ggml_tensor * | Per-layer control vector tensor, or nullptr if out of range |
| apply_to return | ggml_tensor * | Modified hidden state tensor with control vector applied |
| get_scale return | float | Computed LoRA scale: alpha * adapter_scale / rank (adapter_scale alone when alpha is 0) |
| get_weight return | llama_adapter_lora_weight * | Pointer to the LoRA weight pair for a given base tensor, or nullptr if the adapter has none for it |
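The scale computation in the table can be checked with a small standalone sketch. `mock_tensor`, `mock_lora_weight`, and `lora_scale_for_rank` are hypothetical stand-ins. Reading the rank from `b->ne[0]` and falling back to `adapter_scale` when `alpha` is zero are assumed here to mirror the header's inline implementation.

```cpp
#include <cstdint>

// Hypothetical stand-in for ggml_tensor; only the shape array is modeled.
struct mock_tensor {
    int64_t ne[4];
};

// Hypothetical stand-in for llama_adapter_lora_weight.
struct mock_lora_weight {
    mock_tensor * a = nullptr;
    mock_tensor * b = nullptr;

    // Effective scale from rank and alpha; alpha == 0 means "no alpha stored".
    float get_scale(float alpha, float adapter_scale) const {
        const float rank = (float) b->ne[0];
        return alpha != 0.0f ? adapter_scale * alpha / rank : adapter_scale;
    }
};

// Hypothetical helper: builds a B tensor of the given rank and returns the scale.
float lora_scale_for_rank(int64_t rank, float alpha, float adapter_scale) {
    mock_tensor b = {{rank, 1, 1, 1}};
    mock_lora_weight w;
    w.b = &b;
    return w.get_scale(alpha, adapter_scale);
}
```

For example, a rank of 8 with alpha 16 and an adapter scale of 1.0 gives 16 * 1.0 / 8 = 2.0.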
Usage Examples
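#include "llama-adapter.h"

// Apply a control vector to modify model behavior.
// model, data, len, n_embd, il_start, il_end, ctx, hidden_state and layer_index
// are assumed to come from the surrounding inference code.
llama_adapter_cvec cvec;
if (cvec.apply(model, data, len, n_embd, il_start, il_end)) {
    ggml_tensor * modified = cvec.apply_to(ctx, hidden_state, layer_index);
    // ... continue building the graph with `modified` ...
}

// Access LoRA weights for a tensor.
llama_adapter_lora lora; // in practice, populated by the adapter loader
llama_adapter_lora_weight * weight = lora.get_weight(base_tensor);
if (weight != nullptr) { // no LoRA pair may exist for this tensor
    float scale = weight->get_scale(lora.alpha, adapter_scale);
}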
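The control vector path can also be sketched without ggml. The version below is a simplified model under stated assumptions: `mock_cvec` is hypothetical, hidden states are plain `std::vector<float>` instead of `ggml_tensor`, and `apply_to` is assumed to add the layer's vector element-wise and to return the input unchanged when the layer lies outside the active range.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llama_adapter_cvec using plain float vectors.
struct mock_cvec {
    int32_t layer_start = -1; // -1 means no layers are active
    int32_t layer_end   = -1;
    std::vector<std::vector<float>> tensors; // one vector per layer

    // Returns the layer's control vector, or nullptr if il is out of range.
    const std::vector<float> * tensor_for(int il) const {
        if (il < 0 || il < layer_start || il > layer_end ||
            (size_t) il >= tensors.size()) {
            return nullptr;
        }
        return &tensors[il];
    }

    // Adds the control vector to the hidden state, or passes it through.
    std::vector<float> apply_to(std::vector<float> cur, int il) const {
        const std::vector<float> * v = tensor_for(il);
        if (v != nullptr) {
            for (size_t i = 0; i < cur.size() && i < v->size(); ++i) {
                cur[i] += (*v)[i];
            }
        }
        return cur;
    }
};
```

Because out-of-range layers pass through untouched, the same call can be placed in every layer of the graph, and the `layer_start`/`layer_end` range alone decides where steering takes effect.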