Implementation:Ollama Ollama Llama Adapter Types
| Knowledge Sources | |
|---|---|
| Domains | Model Adaptation, LoRA |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Header declaring data structures for control vector and LoRA adapter support in llama.cpp.
Description
The header defines three types:
- llama_adapter_cvec: holds per-layer control vector tensors and their buffers, with methods to retrieve the tensor for a layer (tensor_for) and to apply it to that layer's hidden states (apply_to).
- llama_adapter_lora_weight: holds one A/B matrix pair and computes the effective scale from rank and alpha (get_scale).
- llama_adapter_lora: holds a map from tensor names to weight pairs, GGUF metadata, activated LoRA (aLoRA) invocation tokens, and a get_weight lookup method.
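The control-vector half of the description can be illustrated with a small standalone sketch. It does not use ggml: `toy_cvec`, its `std::vector<float>` storage, the inclusive `[layer_start, layer_end]` range check, and the plain element-wise add are all assumptions made for illustration, loosely modeled on the `tensor_for` / `apply_to` pair rather than copied from the header's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llama_adapter_cvec: one vector per layer.
struct toy_cvec {
    int32_t layer_start = -1;  // first layer the control vector applies to
    int32_t layer_end   = -1;  // last layer it applies to (inclusive, by assumption)
    std::vector<std::vector<float>> tensors;  // indexed by layer

    // Return the vector for layer il, or nullptr if no vector applies there.
    const std::vector<float> * tensor_for(int il) const {
        if (il < 0 || il < layer_start || il > layer_end ||
            static_cast<std::size_t>(il) >= tensors.size()) {
            return nullptr;
        }
        return &tensors[il];
    }

    // Add the layer's control vector to the hidden state.
    // If the layer has no vector, the hidden state is returned unchanged.
    std::vector<float> apply_to(std::vector<float> cur, int il) const {
        const std::vector<float> * v = tensor_for(il);
        if (v == nullptr) {
            return cur;
        }
        assert(v->size() == cur.size());
        for (std::size_t i = 0; i < cur.size(); ++i) {
            cur[i] += (*v)[i];
        }
        return cur;
    }
};
```

With the defaults of `-1` for both bounds, every lookup in this sketch returns `nullptr`, so a layer outside the configured range passes through untouched.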
Usage
Include this header when working with model adapters. Both control vectors (for steering model behavior) and LoRA adapters (for fine-tuning) depend on these type definitions.
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/llama-adapter.h
- Lines: 1-82
Signature
struct llama_adapter_cvec {
    ggml_tensor * tensor_for(int il) const;
    ggml_tensor * apply_to(ggml_context * ctx, ggml_tensor * cur, int il) const;
    bool apply(const llama_model & model, const float * data, size_t len,
               int32_t n_embd, int32_t il_start, int32_t il_end);
private:
    bool init(const llama_model & model);
    int32_t layer_start = -1;
    int32_t layer_end   = -1;
    std::vector<ggml_tensor *> tensors;
};

struct llama_adapter_lora_weight {
    ggml_tensor * a = nullptr;
    ggml_tensor * b = nullptr;
    float get_scale(float alpha, float adapter_scale) const;
};

struct llama_adapter_lora {
    std::unordered_map<std::string, llama_adapter_lora_weight> ab_map;
    float alpha;
    std::vector<llama_token> alora_invocation_tokens;
    llama_adapter_lora_weight * get_weight(ggml_tensor * w);
};

using llama_adapter_loras = std::unordered_map<llama_adapter_lora *, float>;
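To show how a name-keyed map such as `ab_map` can back a `get_weight`-style lookup, here is a minimal sketch with stand-in types. `toy_tensor`, `toy_lora_weight`, and `toy_lora` are hypothetical, and keying the map by the base tensor's name is an assumption inferred from the signature, not a statement about the actual implementation.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for ggml_tensor; only the name matters here.
struct toy_tensor {
    std::string name;
};

// Hypothetical stand-in for llama_adapter_lora_weight.
struct toy_lora_weight {
    const toy_tensor * a = nullptr;
    const toy_tensor * b = nullptr;
};

// Hypothetical stand-in for llama_adapter_lora.
struct toy_lora {
    std::unordered_map<std::string, toy_lora_weight> ab_map;

    // Return the A/B pair registered under the base tensor's name,
    // or nullptr when the adapter does not touch that tensor.
    toy_lora_weight * get_weight(const toy_tensor * w) {
        assert(w != nullptr);
        auto it = ab_map.find(w->name);
        return it == ab_map.end() ? nullptr : &it->second;
    }
};
```

Returning a pointer rather than a reference lets the caller treat "no adapter weights for this tensor" as an ordinary case to branch on.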
Import
#include "llama-adapter.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
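| il | int | Yes | Layer index passed to tensor_for and apply_to |
| w | ggml_tensor * | Yes | Base model tensor passed to get_weight to find its LoRA weights |
| alpha | float | Yes | LoRA alpha scaling parameter passed to get_scale |
| adapter_scale | float | Yes | Additional scale factor passed to get_scale alongside alpha |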
Outputs
| Name | Type | Description |
|---|---|---|
| tensor | ggml_tensor * | Control vector tensor for the given layer |
| weight | llama_adapter_lora_weight * | LoRA A/B weight pair (or nullptr if not found) |
| scale | float | Computed LoRA scale based on rank and alpha |
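The scale output can be illustrated with the conventional LoRA scaling of alpha divided by rank. The sketch below is a hypothetical free function, not the header's `get_scale`: it takes `rank` as an explicit argument (the real method presumably derives it from the tensor shapes), and the fallback to `adapter_scale` alone when `alpha` is zero is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper illustrating a rank/alpha based LoRA scale.
inline float lora_scale(float alpha, float adapter_scale, int64_t rank) {
    assert(rank > 0);
    // Assumption: alpha == 0 means no alpha was recorded for the adapter,
    // so only the caller-supplied adapter_scale is used.
    if (alpha == 0.0f) {
        return adapter_scale;
    }
    return adapter_scale * alpha / static_cast<float>(rank);
}
```

For example, under these assumptions an adapter with alpha 16 and rank 8 applied at `adapter_scale` 1.0 gets an effective scale of 2.0.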
Usage Examples
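#include "llama-adapter.h"

// Control vector lookup. layer_idx is assumed to be a valid layer index in scope.
// No control vector has been applied to this default-constructed cvec,
// so treat the result as possibly null.
llama_adapter_cvec cvec;
ggml_tensor * cvec_tensor = cvec.tensor_for(layer_idx);

// LoRA weight lookup. model_tensor is assumed to be a ggml_tensor * from the base model.
llama_adapter_lora lora;
auto * weight = lora.get_weight(model_tensor);
if (weight) {
    // 1.0f is the adapter_scale argument
    float scale = weight->get_scale(lora.alpha, 1.0f);
    // Apply: output = base + scale * (B @ A @ input)
}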
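The final comment in the example, output = base + scale * (B @ A @ input), can be worked through with plain arrays. This is a minimal sketch using `std::vector` instead of ggml tensors; `matrix`, `matvec`, and `lora_forward` are hypothetical names, and the shapes (A is rank x n_in, B is n_out x rank) follow the usual LoRA convention, not anything stated in the header.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using matrix = std::vector<std::vector<float>>;  // row-major: m[row][col]

// y = M * x for a dense row-major matrix.
inline std::vector<float> matvec(const matrix & m, const std::vector<float> & x) {
    std::vector<float> y(m.size(), 0.0f);
    for (std::size_t r = 0; r < m.size(); ++r) {
        assert(m[r].size() == x.size());
        for (std::size_t c = 0; c < x.size(); ++c) {
            y[r] += m[r][c] * x[c];
        }
    }
    return y;
}

// output = W x + scale * B (A x)
inline std::vector<float> lora_forward(const matrix & w, const matrix & a, const matrix & b,
                                       float scale, const std::vector<float> & x) {
    std::vector<float> out   = matvec(w, x);             // base projection
    std::vector<float> delta = matvec(b, matvec(a, x));  // low-rank path; B*A is never formed
    assert(out.size() == delta.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] += scale * delta[i];
    }
    return out;
}
```

Evaluating A x first keeps the intermediate at size rank, which is the reason the two matrices are stored separately and not merged into a full-size update.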