Implementation:Ollama Ollama Convert Phi3

Knowledge Sources	Ollama
Domains	Model Conversion, GGUF Format
Last Updated	2025-02-15 00:00 GMT

Overview

Implements the GGUF model converter for the Microsoft Phi-3 architecture, handling long/short RoPE scaling factors as additional tensor weights and computing attention scaling factors from context length ratios.

Description

The phi3Model struct implements ModelConverter for Phi-3 models and supports three RoPE scaling types: none, su/longrope, and yarn. The KV method emits phi3.* metadata, including an attention factor derived from scale, the ratio of max_position_embeddings to original_max_position_embeddings: sqrt(1 + log(scale) / log(original_max_position_embeddings)) for su/longrope, and 0.1 * log(scale) + 1.0 for yarn. The Tensors method injects two additional tensors, rope_factors_long.weight and rope_factors_short.weight, which hold the long_factor and short_factor arrays from the rope_scaling block of the model configuration; a sync.Once guard ensures they are added exactly once, immediately before the first layer's tensors. The ropeFactor type implements io.WriterTo so that the factors can be serialized in binary form.

Usage

Not called directly. The conversion pipeline selects this converter automatically when the model's declared architecture is Phi3ForCausalLM.

Code Reference

Source Location

Repository: Ollama
File: convert/convert_phi3.go
Lines: 1-122

Signature

type phi3Model struct {
    ModelParameters
    NumHiddenLayers uint32  `json:"num_hidden_layers"`
    HiddenSize      uint32  `json:"hidden_size"`
    RopeTheta       float32 `json:"rope_theta"`
    RopeScaling     struct {
        Type        string     `json:"type"`
        LongFactor  ropeFactor `json:"long_factor"`
        ShortFactor ropeFactor `json:"short_factor"`
    } `json:"rope_scaling"`
    MaxPositionEmbeddings         uint32 `json:"max_position_embeddings"`
    OriginalMaxPositionEmbeddings uint32 `json:"original_max_position_embeddings"`
}

type ropeFactor []float32

func (p *phi3Model) KV(t *Tokenizer) KV
func (p *phi3Model) Tensors(ts []Tensor) []*ggml.Tensor
func (p *phi3Model) Replacements() []string

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name	Type	Required	Description
t	*Tokenizer	Yes	Tokenizer data for GGUF metadata (argument to KV)
ts	[]Tensor	Yes	Source tensors to convert (argument to Tensors); the rope factor tensors are injected alongside them

Outputs

Name	Type	Description
KV	KV	GGUF metadata with phi3.* keys including computed RoPE attention factor
Tensors	[]*ggml.Tensor	Converted tensors plus the injected rope_factors_long.weight and rope_factors_short.weight tensors

Usage Examples

// Converter registered for Phi3ForCausalLM
// RoPE factors are injected as additional weight tensors:
// rope_factors_long.weight and rope_factors_short.weight
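// The attention factor is computed from the scaling type, with
// scale = max_position_embeddings / original_max_position_embeddings:
//   su/longrope: sqrt(1 + log(scale) / log(original_max_position_embeddings))
//   yarn:        0.1 * log(scale) + 1.0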

Related Pages

Principle:Ollama_Ollama_GGUF_Model_Conversion_Phi3
