Implementation:Ollama Ollama Llama Adapter

Knowledge Sources	Ollama
Domains	Model Adaptation, LoRA
Last Updated	2025-02-15 00:00 GMT

Overview

Implements control vector (cvec) and LoRA adapter loading, initialization, and application for modifying model behavior at inference time.

Description

For control vectors (llama_adapter_cvec), init creates per-layer tensors with appropriate buffer types, apply loads the vector data for a range of layers, and apply_to adds the matching layer tensor to the hidden state using ggml_add. For LoRA adapters (llama_adapter_lora), llama_adapter_lora_init_impl loads the LoRA weights (A and B matrices) from a GGUF file, matches them to model tensors by name, validates dimensions, allocates backend buffers, and reads the tensor data. The get_weight method looks up the LoRA weight pair for a base model tensor by matching its name against the LoRA-suffixed tensor names, and returns nullptr when the adapter has no entry for that tensor.
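The additive behavior described above can be illustrated with a minimal, self-contained sketch. It does not use ggml: cvec_sketch is a hypothetical stand-in that replaces tensors with std::vector<float>, and it assumes the active layer range is inclusive on both ends.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llama_adapter_cvec: one direction vector per layer.
struct cvec_sketch {
    int layer_start = -1;   // first layer the vector applies to (assumed inclusive)
    int layer_end   = -1;   // last layer the vector applies to (assumed inclusive)
    std::vector<std::vector<float>> tensors; // tensors[il] holds n_embd floats

    // Analogue of tensor_for: nullptr when the layer is outside the active range.
    const std::vector<float> * tensor_for(int il) const {
        if (il < 0 || il < layer_start || il > layer_end || (std::size_t) il >= tensors.size()) {
            return nullptr;
        }
        return &tensors[il];
    }

    // Analogue of apply_to: element-wise addition plays the role of ggml_add.
    std::vector<float> apply_to(const std::vector<float> & cur, int il) const {
        const std::vector<float> * layer_dir = tensor_for(il);
        if (layer_dir == nullptr) {
            return cur; // layer not covered: hidden state passes through unchanged
        }
        assert(layer_dir->size() == cur.size());
        std::vector<float> out(cur);
        for (std::size_t i = 0; i < out.size(); ++i) {
            out[i] += (*layer_dir)[i];
        }
        return out;
    }
};
```

Because the vector is only added to the hidden state, the base weights are never touched; narrowing the layer range or resetting it to -1 turns the effect off for the excluded layers.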

Usage

Use this to apply LoRA adapters or control vectors to a loaded model for task-specific customization without modifying the base weights.

Code Reference

Source Location

Repository: Ollama
File: llama/llama.cpp/src/llama-adapter.cpp
Lines: 1-485

Signature

ggml_tensor * llama_adapter_cvec::tensor_for(int il) const;
ggml_tensor * llama_adapter_cvec::apply_to(ggml_context * ctx, ggml_tensor * cur, int il) const;
bool llama_adapter_cvec::init(const llama_model & model);
bool llama_adapter_cvec::apply(const llama_model & model, const float * data,
                                size_t len, int32_t n_embd,
                                int32_t il_start, int32_t il_end);

llama_adapter_lora_weight * llama_adapter_lora::get_weight(ggml_tensor * w);

static void llama_adapter_lora_init_impl(llama_model & model, const char * path_lora,
                                          llama_adapter_lora & adapter);

Import

#include "llama-adapter.h"

I/O Contract

Inputs

Name	Type	Required	Description
model	const llama_model &	Yes	Base model to apply adapters to
path_lora	const char *	Yes	Path to LoRA adapter GGUF file
data	const float *	Yes	Control vector data (for cvec)
len	size_t	Yes	Number of floats in data (for cvec)
n_embd	int32_t	Yes	Embedding dimension of the control vector data (for cvec)
il_start	int32_t	Yes	Starting layer index for control vector
il_end	int32_t	Yes	Ending layer index for control vector
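To make the relationship between data, len, and n_embd concrete, here is a small sketch of how a flat control-vector buffer could be split into per-layer slices. The helper split_cvec_data is hypothetical, and the layout is an assumption: n_embd floats per layer, starting at layer 1, with no entry for layer 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: split a flat control-vector buffer into per-layer slices.
// Assumption: the buffer stores n_embd floats per layer, starting at layer 1
// (layer 0 has no entry), so layer il begins at offset n_embd * (il - 1).
std::vector<std::vector<float>> split_cvec_data(const float * data, std::size_t len,
                                                std::size_t n_embd, std::size_t n_layer) {
    assert(data != nullptr && n_embd > 0);
    std::vector<std::vector<float>> layers(n_layer, std::vector<float>(n_embd, 0.0f));
    for (std::size_t il = 1; il < n_layer; ++il) {
        const std::size_t off = n_embd * (il - 1);
        if (off + n_embd <= len) {
            layers[il].assign(data + off, data + off + n_embd);
        }
        // layers past the end of the buffer keep their zero initialization
    }
    return layers;
}
```

Under this assumed layout, a buffer that covers layers 1 through L needs len to be at least n_embd * L.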

Outputs

Name	Type	Description
success	bool	Whether initialization/application succeeded
weight	llama_adapter_lora_weight *	LoRA weight pair for a given tensor (or nullptr)
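The name-based lookup behind the weight output can be sketched without ggml. Everything below is illustrative: lora_pair_sketch, build_ab_map, and get_weight_sketch are hypothetical names, and the ".lora_a" / ".lora_b" suffix convention is an assumption about how adapter tensors are named.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a LoRA weight pair; the real type holds ggml tensors.
struct lora_pair_sketch {
    bool has_a = false;
    bool has_b = false;
};

// Returns true and strips the suffix when `name` ends with `suffix`.
static bool strip_suffix(std::string & name, const std::string & suffix) {
    if (name.size() < suffix.size() ||
        name.compare(name.size() - suffix.size(), suffix.size(), suffix) != 0) {
        return false;
    }
    name.erase(name.size() - suffix.size());
    return true;
}

// Group adapter tensor names by base tensor name.
// Assumption: adapter tensors are named "<base>.lora_a" and "<base>.lora_b".
std::map<std::string, lora_pair_sketch> build_ab_map(const std::vector<std::string> & names) {
    std::map<std::string, lora_pair_sketch> ab_map;
    for (std::string name : names) { // copy on purpose: strip_suffix mutates it
        if (strip_suffix(name, ".lora_a")) {
            ab_map[name].has_a = true;
        } else if (strip_suffix(name, ".lora_b")) {
            ab_map[name].has_b = true;
        }
        // names with neither suffix are ignored in this sketch
    }
    return ab_map;
}

// Analogue of get_weight: nullptr when the adapter has no entry for the base tensor.
const lora_pair_sketch * get_weight_sketch(const std::map<std::string, lora_pair_sketch> & ab_map,
                                           const std::string & base_name) {
    auto it = ab_map.find(base_name);
    return it == ab_map.end() ? nullptr : &it->second;
}
```

Returning nullptr for tensors without an adapter entry lets the caller fall back to the plain base weight, which matches the "(or nullptr)" contract in the table above.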

Usage Examples

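#include "llama-adapter.h"

// Assumes model, cvec_data, cvec_len, n_embd, and n_layer are set up by the caller.
// Apply a control vector to layers 1 through n_layer - 1
llama_adapter_cvec cvec;
if (!cvec.apply(model, cvec_data, cvec_len, n_embd, 1, n_layer - 1)) {
    // apply returned false: the control vector was not applied
}

// During graph build, apply to the hidden state of layer layer_idx
ggml_tensor * modified = cvec.apply_to(ctx, hidden_state, layer_idx);

// Look up LoRA weights; adapter is an already loaded llama_adapter_lora
llama_adapter_lora_weight * lora_w = adapter.get_weight(base_tensor);
if (lora_w) {
    // Scale factor for this adapter's contribution to base_tensor
    float scale = lora_w->get_scale(adapter.alpha, adapter_scale);
}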
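For context on what the scale value is used for, the following standalone sketch shows the usual LoRA forward pass, y = W x + scale * B (A x), on plain vectors. The function names are hypothetical, and the scaling rule (adapter_scale * alpha / rank, falling back to adapter_scale when alpha is zero) is an assumption based on the conventional LoRA formulation rather than a statement of what get_scale does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using matrix = std::vector<std::vector<float>>; // row-major: matrix[row][col]

// Assumed scaling rule: adapter_scale * alpha / rank, or adapter_scale when alpha is 0.
float lora_scale(float alpha, float adapter_scale, std::size_t rank) {
    assert(rank > 0);
    return alpha != 0.0f ? adapter_scale * alpha / (float) rank : adapter_scale;
}

// Plain matrix-vector product.
std::vector<float> matvec(const matrix & m, const std::vector<float> & x) {
    std::vector<float> y(m.size(), 0.0f);
    for (std::size_t r = 0; r < m.size(); ++r) {
        assert(m[r].size() == x.size());
        for (std::size_t c = 0; c < x.size(); ++c) {
            y[r] += m[r][c] * x[c];
        }
    }
    return y;
}

// y = W x + scale * B (A x); W itself is never modified.
std::vector<float> lora_forward(const matrix & W, const matrix & A, const matrix & B,
                                const std::vector<float> & x, float alpha, float adapter_scale) {
    const float scale = lora_scale(alpha, adapter_scale, A.size()); // rank = rows of A
    std::vector<float> y = matvec(W, x);
    const std::vector<float> delta = matvec(B, matvec(A, x));
    assert(delta.size() == y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] += scale * delta[i];
    }
    return y;
}
```

Keeping the low-rank update as a separate term means the adapter can be scaled or disabled at inference time without rewriting the base weights.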

Related Pages

Principle:Ollama_Ollama_Model_Adaptation
