Implementation:Ollama Ollama Llama Adapter
| Knowledge Sources | |
|---|---|
| Domains | Model Adaptation, LoRA |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements control vector (cvec) and LoRA adapter loading, initialization, and application for modifying model behavior at inference time.
Description
The file covers two adapter types. For control vectors (llama_adapter_cvec), init creates one tensor per layer in an appropriate buffer type, and apply_to adds the layer's tensor to the hidden state using ggml_add. For LoRA adapters (llama_adapter_lora), the loader reads the LoRA weights (A and B matrices) from a GGUF file, matches them to model tensors by name, validates their dimensions, allocates backend buffers, and reads the tensor data. The get_weight method returns the LoRA weight pair for a base model tensor by matching the tensor's name against the LoRA-suffixed names.
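The control vector behavior described above can be pictured with a small stand-alone sketch. It does not use ggml: ToyControlVector and its fields are hypothetical stand-ins, and the inclusive layer range is an assumption rather than something this page states. It only illustrates the idea of adding a per-layer vector to the hidden state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a control vector: one optional direction per layer.
struct ToyControlVector {
    std::vector<std::vector<float>> per_layer; // empty entry = no vector for that layer
    int layer_start = -1;                      // assumed inclusive range
    int layer_end   = -1;

    // Returns the vector for layer il, or nullptr when the layer is outside
    // the active range or has no vector.
    const std::vector<float> * tensor_for(int il) const {
        if (il < 0 || il < layer_start || il > layer_end) return nullptr;
        if (static_cast<std::size_t>(il) >= per_layer.size()) return nullptr;
        if (per_layer[il].empty()) return nullptr;
        return &per_layer[il];
    }

    // Adds the layer's vector element-wise to the hidden state and returns the
    // result; layers without a vector pass through unchanged.
    std::vector<float> apply_to(const std::vector<float> & cur, int il) const {
        const std::vector<float> * v = tensor_for(il);
        if (v == nullptr) return cur;
        assert(v->size() == cur.size()); // vector must match the hidden size
        std::vector<float> out(cur);
        for (std::size_t i = 0; i < out.size(); ++i) out[i] += (*v)[i];
        return out;
    }
};
```

Layers outside the active range return the hidden state unchanged, which is why the sketch can be called for every layer without special-casing.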
Usage
Use this to apply LoRA adapters or control vectors to a loaded model for task-specific customization without modifying the base weights.
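To see why the base weights can stay untouched, the following minimal sketch uses the standard LoRA formulation y = W x + scale * B (A x) on plain std::vector matrices. Matrix, matvec, and lora_forward are hypothetical names for illustration; the actual implementation works on ggml tensors, and how it derives the scale is not covered here.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<float>>; // row-major: m[row][col]

// y = M * x for a dense row-major matrix whose rows have x.size() columns.
std::vector<float> matvec(const Matrix & m, const std::vector<float> & x) {
    std::vector<float> y(m.size(), 0.0f);
    for (std::size_t r = 0; r < m.size(); ++r) {
        for (std::size_t c = 0; c < x.size(); ++c) {
            y[r] += m[r][c] * x[c];
        }
    }
    return y;
}

// y = W x + scale * B (A x). W is only read, never written, so the base
// model is unchanged and the adapter can be swapped or removed freely.
std::vector<float> lora_forward(const Matrix & W, const Matrix & A, const Matrix & B,
                                float scale, const std::vector<float> & x) {
    std::vector<float> y     = matvec(W, x);
    std::vector<float> delta = matvec(B, matvec(A, x)); // passes through the low-rank bottleneck
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] += scale * delta[i];
    }
    return y;
}
```

With A of shape r x n and B of shape m x r, the adapter stores r * (n + m) values instead of the m * n values of a full weight update.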
Code Reference
Source Location
- Repository: Ollama
- File: llama/llama.cpp/src/llama-adapter.cpp
- Lines: 1-485
Signature
ggml_tensor * llama_adapter_cvec::tensor_for(int il) const;
ggml_tensor * llama_adapter_cvec::apply_to(ggml_context * ctx, ggml_tensor * cur, int il) const;
bool llama_adapter_cvec::init(const llama_model & model);
bool llama_adapter_cvec::apply(const llama_model & model, const float * data,
                               size_t len, int32_t n_embd,
                               int32_t il_start, int32_t il_end);
llama_adapter_lora_weight * llama_adapter_lora::get_weight(ggml_tensor * w);
static void llama_adapter_lora_init_impl(llama_model & model, const char * path_lora,
                                         llama_adapter_lora & adapter);
Import
#include "llama-adapter.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model | const llama_model & | Yes | Base model to apply adapters to |
| path_lora | const char * | Yes | Path to LoRA adapter GGUF file |
| data | const float * | Yes | Control vector data (for cvec) |
| len | size_t | Yes | Length of the control vector data |
| n_embd | int32_t | Yes | Embedding dimension of the control vector |
| il_start | int32_t | Yes | Starting layer index for control vector |
| il_end | int32_t | Yes | Ending layer index for control vector |
Outputs
| Name | Type | Description |
|---|---|---|
| success | bool | Whether initialization/application succeeded |
| weight | llama_adapter_lora_weight * | LoRA weight pair for a given tensor (or nullptr) |
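The "weight or nullptr" contract can be illustrated with a name-keyed lookup. This is a sketch under stated assumptions: ToyTensor, ToyLoraWeight, and ToyLoraAdapter are hypothetical types, and the ".lora_a" / ".lora_b" suffixes are assumed here as the naming convention rather than taken from this page.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins: a tensor is identified by its name only.
struct ToyTensor { std::string name; };

struct ToyLoraWeight {
    const ToyTensor * a = nullptr;
    const ToyTensor * b = nullptr;
};

// If name ends with suffix, stores the remaining prefix in base and returns true.
static bool strip_suffix(const std::string & name, const std::string & suffix, std::string & base) {
    if (name.size() < suffix.size()) return false;
    if (name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0) return false;
    base = name.substr(0, name.size() - suffix.size());
    return true;
}

struct ToyLoraAdapter {
    std::map<std::string, ToyLoraWeight> ab_map; // keyed by base tensor name

    // Registers an adapter tensor under its base name (suffixes are assumed).
    // The tensor must outlive the adapter because only its address is stored.
    bool add(const ToyTensor & t) {
        std::string base;
        if (strip_suffix(t.name, ".lora_a", base)) { ab_map[base].a = &t; return true; }
        if (strip_suffix(t.name, ".lora_b", base)) { ab_map[base].b = &t; return true; }
        return false; // not a LoRA tensor name
    }

    // Returns the A/B pair for a base tensor, or nullptr when the tensor has
    // no adapter or the pair is incomplete.
    ToyLoraWeight * get_weight(const ToyTensor & w) {
        auto it = ab_map.find(w.name);
        if (it == ab_map.end()) return nullptr;
        if (it->second.a == nullptr || it->second.b == nullptr) return nullptr;
        return &it->second;
    }
};
```

Callers are expected to check the returned pointer before use, so tensors without an adapter fall through to the plain base computation.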
Usage Examples
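#include "llama-adapter.h"

// Assumes model, cvec_data, cvec_len, n_embd, n_layer, ctx, hidden_state,
// layer_idx, adapter, base_tensor, and adapter_scale are provided by the caller.

// Apply a control vector to layers 1 through n_layer - 1
llama_adapter_cvec cvec;
if (!cvec.apply(model, cvec_data, cvec_len, n_embd, 1, n_layer - 1)) {
    // handle failure
}

// During graph build, apply to hidden state
ggml_tensor * modified = cvec.apply_to(ctx, hidden_state, layer_idx);

// Look up LoRA weights; the result is nullptr if the tensor has no adapter
llama_adapter_lora_weight * lora_w = adapter.get_weight(base_tensor);
if (lora_w) {
    float scale = lora_w->get_scale(adapter.alpha, adapter_scale);
    // use scale when adding the LoRA contribution for this tensor
}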