Implementation:Ollama Ollama Convert Llama Adapter

From Leeroopedia
Knowledge Sources
Domains: Model Conversion, GGUF Format
Last Updated: 2025-02-15 00:00 GMT

Overview

Implements the GGUF adapter converter for Llama LoRA adapters. It handles detection of tensors that need transposition, interleaved head reordering of Q/K attention weights, and the LoRA A/B tensor naming conventions.

Description

The llamaAdapter struct implements the AdapterConverter interface for converting Llama LoRA (Low-Rank Adaptation) fine-tuning adapters to GGUF format. Unlike model converters, it receives base model KV configuration (head counts) via the KV method. The Tensors method detects tensors that need transposition by comparing shape dimensions and applies either repack (head reordering only) or repackAndTranspose (head reordering plus transposition) depending on the tensor layout. Both repack functions perform interleaved head dimension reordering for Q/K attention weight LoRA tensors.

Usage

Invoked when converting LoRA adapter weights for Llama-family models. The adapter converter reads base model configuration to determine attention head counts for correct Q/K repacking.

Code Reference

Source Location

  • Repository: Ollama
  • File: convert/convert_llama_adapter.go
  • Lines: 1-170

Signature

type llamaAdapter struct {
    AdapterParameters
    NumAttentionHeads uint32 `json:"num_attention_heads"`
    NumKeyValueHeads  uint32 `json:"num_key_value_heads"`
}

func (p *llamaAdapter) KV(baseKV fs.Config) KV
func (p *llamaAdapter) Tensors(ts []Tensor) []*ggml.Tensor
func (p *llamaAdapter) Replacements() []string
func (p *llamaAdapter) repack(name string, data []float32, shape []uint64) ([]float32, error)
func (p *llamaAdapter) repackAndTranspose(name string, data []float32, shape []uint64) ([]float32, error)

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name | Type | Required | Description
baseKV | fs.Config | Yes | Base model configuration for reading head counts
ts | []Tensor | Yes | LoRA adapter tensors (lora_A and lora_B weights)

Outputs

Name | Type | Description
KV | KV | GGUF adapter metadata with llama architecture and head counts
Tensors | []*ggml.Tensor | Converted LoRA tensors with correct shapes and repacked Q/K weights

Usage Examples

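// Sketch of how the Llama LoRA adapter converter is driven from inside the convert package.
// llamaAdapter is unexported, so code outside the package cannot construct it directly.
// a := &llamaAdapter{}
// kv := a.KV(baseModelConfig)          // baseModelConfig: fs.Config of the base model
// tensors := a.Tensors(adapterTensors) // adapterTensors: []Tensor with lora_A / lora_B weights
// Q/K LoRA_A weights get interleaved head reordering during conversion.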
