Implementation:Ollama Ollama Llama Adapter Types

From Leeroopedia
Knowledge Sources
Domains: Model Adaptation, LoRA
Last Updated: 2025-02-15 00:00 GMT

Overview

Header declaring data structures for control vector and LoRA adapter support in llama.cpp.

Description

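The header defines three types and one alias. llama_adapter_cvec holds per-layer control vector tensors, manages their buffers, and provides methods to retrieve the tensor for a layer (tensor_for) and to apply it to that layer's hidden state (apply_to). llama_adapter_lora_weight holds one A/B matrix pair and a get_scale method that computes the effective scale from the adapter rank, alpha, and a caller-supplied adapter scale. llama_adapter_lora holds a map from tensor names to weight pairs (ab_map), GGUF metadata, the invocation tokens used by activated LoRA (aLoRA), and a get_weight lookup method. The alias llama_adapter_loras maps adapter pointers to float values.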
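
The scale computation described above can be sketched on its own. This is a minimal sketch under stated assumptions, not the header's code: lora_scale_sketch is a hypothetical name, the rank is passed in explicitly rather than read from a tensor, and treating alpha == 0 as "use the adapter scale unchanged" is an assumption layered on the usual LoRA convention scale = adapter_scale * alpha / rank.

```cpp
#include <cassert>

// Hypothetical standalone helper (not part of llama-adapter.h).
// rank is the inner dimension shared by the A and B matrices.
inline float lora_scale_sketch(float alpha, float adapter_scale, float rank) {
    assert(rank > 0.0f);
    // Assumption: alpha == 0 means no alpha was stored, so the
    // caller-supplied adapter scale is used as-is.
    return alpha != 0.0f ? adapter_scale * alpha / rank : adapter_scale;
}
```

With alpha = 16, rank = 8, and an adapter scale of 1, the sketch yields a scale of 2, so the low-rank update is doubled before it is added to the base output.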

Usage

Include this header when working with model adapters. Both control vectors (for steering model behavior) and LoRA adapters (for fine-tuning) depend on these type definitions.
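
To make the control vector side concrete, the sketch below models the tensor_for / apply_to pair with plain float vectors instead of ggml tensors. Everything in it is illustrative: cvec_sketch is a hypothetical stand-in, the inclusive [layer_start, layer_end] range check and the "add the layer's vector to the hidden state" behavior are assumptions about how such an adapter typically works, not a copy of the real implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for llama_adapter_cvec using std::vector<float>.
struct cvec_sketch {
    int layer_start = -1;
    int layer_end   = -1;
    std::vector<std::vector<float>> tensors; // one entry per layer index

    // Returns the vector for layer il, or nullptr when il is outside the
    // active range (assumed inclusive) or no vector is stored for it.
    const std::vector<float> * tensor_for(int il) const {
        if (il < 0 || il < layer_start || il > layer_end ||
            static_cast<std::size_t>(il) >= tensors.size()) {
            return nullptr;
        }
        return &tensors[il];
    }

    // Adds the layer's control vector to the hidden state, if one exists;
    // otherwise returns the hidden state unchanged.
    std::vector<float> apply_to(std::vector<float> cur, int il) const {
        const std::vector<float> * dir = tensor_for(il);
        if (dir != nullptr) {
            assert(dir->size() == cur.size());
            for (std::size_t i = 0; i < cur.size(); ++i) {
                cur[i] += (*dir)[i];
            }
        }
        return cur;
    }
};
```

Returning the input unchanged for layers without a vector keeps the call site uniform: every layer can call apply_to without first checking whether steering is active for it.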

Code Reference

Source Location

  • Repository: Ollama
  • File: llama/llama.cpp/src/llama-adapter.h
  • Lines: 1-82

Signature

struct llama_adapter_cvec {
    ggml_tensor * tensor_for(int il) const;
    ggml_tensor * apply_to(ggml_context * ctx, ggml_tensor * cur, int il) const;
    bool apply(const llama_model & model, const float * data, size_t len,
               int32_t n_embd, int32_t il_start, int32_t il_end);
private:
    bool init(const llama_model & model);
    int32_t layer_start = -1;
    int32_t layer_end   = -1;
    std::vector<ggml_tensor *> tensors;
};

struct llama_adapter_lora_weight {
    ggml_tensor * a = nullptr;
    ggml_tensor * b = nullptr;
    float get_scale(float alpha, float adapter_scale) const;
};

struct llama_adapter_lora {
    std::unordered_map<std::string, llama_adapter_lora_weight> ab_map;
    float alpha;
    std::vector<llama_token> alora_invocation_tokens;
    llama_adapter_lora_weight * get_weight(ggml_tensor * w);
};

using llama_adapter_loras = std::unordered_map<llama_adapter_lora *, float>;
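
The name-keyed lookup behind get_weight can be illustrated without ggml. In this sketch the types weight_sketch and lora_sketch, the integer ids standing in for tensor pointers, and the tensor name used in the usage note are all hypothetical; the only idea taken from the signature above is a map from tensor name to an A/B pair with a lookup that returns nullptr when nothing matches.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical placeholder for an A/B pair; ints stand in for tensor pointers.
struct weight_sketch {
    int a_id = -1;
    int b_id = -1;
};

// Hypothetical stand-in for llama_adapter_lora's lookup table.
struct lora_sketch {
    std::unordered_map<std::string, weight_sketch> ab_map;

    // Returns the pair registered under tensor_name, or nullptr if the
    // adapter does not modify that tensor.
    weight_sketch * get_weight(const std::string & tensor_name) {
        auto it = ab_map.find(tensor_name);
        return it != ab_map.end() ? &it->second : nullptr;
    }
};
```

A caller would register pairs under names such as "blk.0.attn_q.weight" (an illustrative name) and treat a nullptr result as "leave this tensor's output untouched".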

Import

#include "llama-adapter.h"

I/O Contract

Inputs

Name | Type | Required | Description
il | int | Yes | Layer index for tensor lookup
w | ggml_tensor * | Yes | Base model tensor to find LoRA weights for
alpha | float | Yes | LoRA alpha scaling parameter

Outputs

Name | Type | Description
tensor | ggml_tensor * | Control vector tensor for the given layer
weight | llama_adapter_lora_weight * | LoRA A/B weight pair (or nullptr if not found)
scale | float | Computed LoRA scale based on rank and alpha

Usage Examples

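#include "llama-adapter.h"

// layer_idx (int) and model_tensor (ggml_tensor *) are assumed to be
// supplied by the surrounding code.

// Control vector: fetch the tensor for one layer.
// cvec is assumed to have been populated beforehand, e.g. via apply().
llama_adapter_cvec cvec;
ggml_tensor * cvec_tensor = cvec.tensor_for(layer_idx);

// LoRA weight lookup: get_weight returns nullptr when the adapter
// has no A/B pair for this tensor.
llama_adapter_lora lora;
llama_adapter_lora_weight * weight = lora.get_weight(model_tensor);
if (weight) {
    // Second argument is the adapter scale; 1.0f applies the adapter at full strength.
    float scale = weight->get_scale(lora.alpha, 1.0f);
    // Apply: output = base + scale * (B @ A @ input)
}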
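
The "output = base + scale * (B @ A @ input)" comment above can be worked through numerically. The following is a minimal sketch using row-major std::vector matrices rather than ggml tensors; matvec and lora_forward_sketch are hypothetical helper names, and the scale is passed in directly instead of being obtained from get_scale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using matrix = std::vector<std::vector<float>>; // row-major

// Plain matrix-vector product: y = m * x.
std::vector<float> matvec(const matrix & m, const std::vector<float> & x) {
    std::vector<float> y(m.size(), 0.0f);
    for (std::size_t r = 0; r < m.size(); ++r) {
        assert(m[r].size() == x.size());
        for (std::size_t c = 0; c < x.size(); ++c) {
            y[r] += m[r][c] * x[c];
        }
    }
    return y;
}

// Hypothetical helper: base output plus the scaled low-rank update.
// w is (out x in), a is (rank x in), b is (out x rank).
std::vector<float> lora_forward_sketch(const matrix & w, const matrix & a,
                                       const matrix & b, float scale,
                                       const std::vector<float> & x) {
    std::vector<float> out = matvec(w, x);                    // base = W x
    const std::vector<float> delta = matvec(b, matvec(a, x)); // B (A x)
    assert(delta.size() == out.size());
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] += scale * delta[i];
    }
    return out;
}
```

For example, with W the 2x2 identity, A = [[1, 1]], B = [[1], [2]], scale = 0.5, and input [2, 4], the base output is [2, 4], A x is [6], B (A x) is [6, 12], and the result is [5, 10]. Computing A x first keeps the intermediate at rank size, which is the reason the update is stored as two small matrices rather than one full-size delta.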
