Implementation:Ollama Ollama Convert Llama Adapter
| Knowledge Sources | |
|---|---|
| Domains | Model Conversion, GGUF Format |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the GGUF adapter converter for Llama LoRA adapters, handling tensor transposition detection, Q/K weight interleaved head reordering, and LoRA A/B tensor naming conventions.
Description
The llamaAdapter struct implements the AdapterConverter interface for converting Llama LoRA (Low-Rank Adaptation) fine-tuning adapters to GGUF format. Unlike a model converter, its KV method takes the base model's configuration as an argument, from which it reads the attention head counts. The Tensors method compares each tensor's shape dimensions to detect tensors that need transposition, then applies either repack (head reordering only) or repackAndTranspose (head reordering plus transposition) according to the tensor layout. Both repack functions perform interleaved head-dimension reordering for Q/K attention weight LoRA tensors.
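As a rough illustration of the shape comparison described above, the sketch below flags a tensor for transposition and transposes a row-major matrix. The helper names `needsTranspose` and `transpose`, the `lora_a`/`lora_b` name suffixes, and the rule itself (the LoRA rank is taken to be the smaller dimension, leading for A and trailing for B) are assumptions made for illustration, not the converter's confirmed logic.

```go
package main

import (
	"fmt"
	"strings"
)

// needsTranspose reports whether a 2-D LoRA tensor appears to be stored in the
// opposite orientation. Assumed (hypothetical) convention: lora_a is
// [rank, in] and lora_b is [out, rank], with rank the smaller dimension.
func needsTranspose(name string, shape []uint64) bool {
	if len(shape) != 2 {
		return false
	}
	switch {
	case strings.HasSuffix(name, "lora_a"):
		return shape[0] > shape[1]
	case strings.HasSuffix(name, "lora_b"):
		return shape[0] < shape[1]
	}
	return false
}

// transpose returns the transpose of a row-major rows x cols matrix.
func transpose(data []float32, rows, cols int) []float32 {
	out := make([]float32, len(data))
	for r := 0; r < rows; r++ {
		for c := 0; c < cols; c++ {
			out[c*rows+r] = data[r*cols+c]
		}
	}
	return out
}

func main() {
	name, shape := "blk.0.attn_q.weight.lora_a", []uint64{3, 2}
	data := []float32{1, 2, 3, 4, 5, 6} // 3 rows x 2 cols

	if needsTranspose(name, shape) {
		data = transpose(data, int(shape[0]), int(shape[1]))
		shape[0], shape[1] = shape[1], shape[0] // keep the shape in sync with the data
	}
	fmt.Println(shape, data)
}
```

Swapping the shape alongside the data matters: a converter that transposes values but reports the old shape would emit a tensor whose dimensions no longer describe its contents.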
Usage
Invoked when converting LoRA adapter weights for Llama-family models. The adapter converter reads base model configuration to determine attention head counts for correct Q/K repacking.
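To make the flow of head counts concrete, here is a minimal sketch that copies them from a base model configuration into adapter metadata. The `baseConfig` map type, the `adapterKV` helper, the metadata key names, and the fallback to the attention head count are stand-ins assumed for illustration; the real `fs.Config` and `KV` types are not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// baseConfig is a hypothetical stand-in for the base model configuration.
type baseConfig map[string]any

// adapterKV builds adapter metadata from the base model's head counts.
// The key names below are assumptions for this sketch.
func adapterKV(base baseConfig) (map[string]any, error) {
	heads, ok := base["llama.attention.head_count"].(uint32)
	if !ok {
		return nil, errors.New("base config is missing llama.attention.head_count")
	}

	// Assumption: without a separate KV head count, reuse the attention head count.
	kvHeads, ok := base["llama.attention.head_count_kv"].(uint32)
	if !ok {
		kvHeads = heads
	}

	return map[string]any{
		"general.architecture":          "llama",
		"llama.attention.head_count":    heads,
		"llama.attention.head_count_kv": kvHeads,
	}, nil
}

func main() {
	kv, err := adapterKV(baseConfig{
		"llama.attention.head_count":    uint32(32),
		"llama.attention.head_count_kv": uint32(8),
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(kv["general.architecture"], kv["llama.attention.head_count"], kv["llama.attention.head_count_kv"])
}
```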
Code Reference
Source Location
- Repository: Ollama
- File: convert/convert_llama_adapter.go
- Lines: 1-170
Signature
type llamaAdapter struct {
    AdapterParameters
    NumAttentionHeads uint32 `json:"num_attention_heads"`
    NumKeyValueHeads  uint32 `json:"num_key_value_heads"`
}
func (p *llamaAdapter) KV(baseKV fs.Config) KV
func (p *llamaAdapter) Tensors(ts []Tensor) []*ggml.Tensor
func (p *llamaAdapter) Replacements() []string
func (p *llamaAdapter) repack(name string, data []float32, shape []uint64) ([]float32, error)
func (p *llamaAdapter) repackAndTranspose(name string, data []float32, shape []uint64) ([]float32, error)
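A flat `[]string` such as the one returned by `Replacements` can be read as old/new pairs. The sketch below feeds such pairs to the standard library's `strings.NewReplacer`; the specific pairs, the `replacements` and `rename` helpers, and the input tensor names are illustrative assumptions, not the converter's actual list.

```go
package main

import (
	"fmt"
	"strings"
)

// replacements returns hypothetical old/new pairs that map adapter tensor
// names onto GGUF-style names. The real list may differ.
func replacements() []string {
	return []string{
		"base_model.model.", "",
		"model.layers", "blk",
		"self_attn.q_proj", "attn_q",
		"self_attn.k_proj", "attn_k",
		"lora_A.weight", "weight.lora_a",
		"lora_B.weight", "weight.lora_b",
	}
}

// rename applies every pair in a single left-to-right pass over the name.
func rename(name string) string {
	return strings.NewReplacer(replacements()...).Replace(name)
}

func main() {
	fmt.Println(rename("base_model.model.model.layers.0.self_attn.q_proj.lora_A.weight"))
}
```

`strings.NewReplacer` panics on an odd number of arguments, so a list of this kind must always be built from complete pairs.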
Import
import "github.com/ollama/ollama/convert"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| baseKV | fs.Config | Yes | Base model configuration for reading head counts |
| ts | []Tensor | Yes | LoRA adapter tensors (lora_A and lora_B weights) |
Outputs
| Name | Type | Description |
|---|---|---|
| KV | KV | GGUF adapter metadata with llama architecture and head counts |
| Tensors | []*ggml.Tensor | Converted LoRA tensors with correct shapes and repacked Q/K weights |
Usage Examples
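// Adapter converter for Llama LoRA.
// Sketch only: llamaAdapter is unexported, so this applies inside package convert.
// baseModelConfig (fs.Config) and adapterTensors ([]Tensor) are placeholders.
a := &llamaAdapter{}
kv := a.KV(baseModelConfig)          // adapter metadata, including base model head counts
tensors := a.Tensors(adapterTensors) // Q/K LoRA_A weights get interleaved head reordering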
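The interleaved head reordering mentioned above can be sketched without a tensor library. This version assumes the reordering follows the reshape-to-(heads, 2, headDim/2) and axis-swap pattern commonly used when converting Hugging Face Llama Q/K weights; whether the converter does exactly this is an assumption, and `permuteHeads` is a hypothetical helper.

```go
package main

import (
	"errors"
	"fmt"
)

// permuteHeads reorders the rows of a row-major rows x cols matrix.
// Within each head, the rows are viewed as two halves of headDim/2 rows each,
// and the halves are interleaved: the row at (half j, offset i) moves to 2*i + j.
func permuteHeads(data []float32, rows, cols, heads int) ([]float32, error) {
	if heads <= 0 || rows%(2*heads) != 0 {
		return nil, errors.New("rows must be a positive multiple of 2*heads")
	}
	if len(data) != rows*cols {
		return nil, errors.New("data length does not match rows*cols")
	}

	headDim := rows / heads
	half := headDim / 2
	out := make([]float32, len(data))

	for h := 0; h < heads; h++ {
		for j := 0; j < 2; j++ {
			for i := 0; i < half; i++ {
				src := h*headDim + j*half + i
				dst := h*headDim + i*2 + j
				copy(out[dst*cols:(dst+1)*cols], data[src*cols:(src+1)*cols])
			}
		}
	}
	return out, nil
}

func main() {
	// One head with four rows of two values each.
	data := []float32{0, 1, 2, 3, 4, 5, 6, 7}
	out, err := permuteHeads(data, 4, 2, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Copying whole rows keeps the sketch independent of the column count, which for a LoRA tensor would be either the rank or the model dimension depending on orientation.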