Implementation:Ollama Ollama Convert DeepSeek2

Knowledge Sources	Ollama
Domains	Model Conversion, GGUF Format
Last Updated	2025-02-15 00:00 GMT

Overview

Implements the GGUF model converter for the DeepSeek V2/V3 architecture, handling Multi-head Latent Attention (MLA) parameters, MoE expert tensor merging, and YaRN RoPE scaling.

Description

The deepseek2Model struct implements ModelConverter. Its KV method emits metadata for four groups of parameters: MLA (QK nope/rope head dimensions, KV LoRA rank, Q LoRA rank, V head dimension), MoE (expert count, shared expert count, and a gating function that may be softmax or sigmoid), YaRN RoPE scaling with mscale, and the leading dense block count, which covers the dense-to-MoE transition. The Tensors method merges per-expert weight tensors (mlp.experts.*.gate_proj, etc.) into consolidated tensors (ffn_gate_exps, etc.) and skips layers beyond the hidden layer count (e.g., Multi-Token Prediction layers).
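The merge-and-skip behaviour described above can be sketched at the level of tensor names. This is a minimal illustration, not the code in convert_deepseek2.go: the model.layers.N source prefix, the blk.N output prefix, the ffn_up_exps and ffn_down_exps names, and the groupExperts helper are all assumptions made for the example. It only groups names; the real converter also has to combine the tensor data.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// expertPattern matches names such as
// "model.layers.3.mlp.experts.7.gate_proj.weight" (assumed naming scheme).
var expertPattern = regexp.MustCompile(
	`^model\.layers\.(\d+)\.mlp\.experts\.(\d+)\.(gate_proj|up_proj|down_proj)\.weight$`)

// mergedSuffix maps a per-expert projection to a consolidated tensor name.
// Only ffn_gate_exps is named in the text above; the other two are assumed.
var mergedSuffix = map[string]string{
	"gate_proj": "ffn_gate_exps",
	"up_proj":   "ffn_up_exps",
	"down_proj": "ffn_down_exps",
}

// groupExperts (hypothetical helper) returns, for each consolidated tensor
// name, the per-expert source names ordered by expert index. Tensors from
// layers at or beyond hiddenLayers are skipped.
func groupExperts(names []string, hiddenLayers int) map[string][]string {
	type member struct {
		expert int
		name   string
	}
	groups := map[string][]member{}
	for _, name := range names {
		m := expertPattern.FindStringSubmatch(name)
		if m == nil {
			continue // not a per-expert weight
		}
		layer, _ := strconv.Atoi(m[1])
		if layer >= hiddenLayers {
			continue // e.g. a Multi-Token Prediction layer
		}
		expert, _ := strconv.Atoi(m[2])
		key := fmt.Sprintf("blk.%d.%s.weight", layer, mergedSuffix[m[3]])
		groups[key] = append(groups[key], member{expert, name})
	}
	out := make(map[string][]string, len(groups))
	for key, ms := range groups {
		sort.Slice(ms, func(i, j int) bool { return ms[i].expert < ms[j].expert })
		for _, m := range ms {
			out[key] = append(out[key], m.name)
		}
	}
	return out
}

func main() {
	names := []string{
		"model.layers.1.mlp.experts.1.gate_proj.weight",
		"model.layers.1.mlp.experts.0.gate_proj.weight",
		"model.layers.2.mlp.experts.0.gate_proj.weight", // skipped when hiddenLayers is 2
	}
	for key, members := range groupExperts(names, 2) {
		fmt.Println(key, members)
	}
}
```

Sorting by expert index matters because the consolidated tensor is indexed by expert, so the source order must be deterministic regardless of how the input names arrive.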

Usage

Invoked automatically when the model's architecture matches DeepseekV2ForCausalLM or DeepseekV3ForCausalLM.
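As a rough illustration of that matching step, the sketch below reads the architectures list from a Hugging Face style config.json and checks it against the two names above. The archConfig type and usesDeepseek2Converter function are hypothetical names for this example; Ollama's own dispatch code may be structured differently.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// archConfig holds only the field needed to pick a converter.
type archConfig struct {
	Architectures []string `json:"architectures"`
}

// usesDeepseek2Converter (hypothetical) reports whether the first listed
// architecture is one of the two names this converter handles.
func usesDeepseek2Converter(configJSON []byte) (bool, error) {
	var c archConfig
	if err := json.Unmarshal(configJSON, &c); err != nil {
		return false, err
	}
	if len(c.Architectures) == 0 {
		return false, errors.New("config has no architectures entry")
	}
	switch c.Architectures[0] {
	case "DeepseekV2ForCausalLM", "DeepseekV3ForCausalLM":
		return true, nil
	}
	return false, nil
}

func main() {
	ok, err := usesDeepseek2Converter([]byte(`{"architectures": ["DeepseekV3ForCausalLM"]}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("deepseek2 converter selected:", ok)
}
```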

Code Reference

Source Location

Repository: Ollama
File: convert/convert_deepseek2.go
Lines: 1-173

Signature

type deepseek2Model struct {
    ModelParameters
    HiddenSize        uint32 `json:"hidden_size"`
    HiddenLayers      uint32 `json:"num_hidden_layers"`
    QKNopeHeadDim     uint32 `json:"qk_nope_head_dim"`
    QKRopeHeadDim     uint32 `json:"qk_rope_head_dim"`
    KVLoraRank        uint32 `json:"kv_lora_rank"`
    ExpertCount       uint32 `json:"n_routed_experts"`
    ExpertSharedCount uint32 `json:"n_shared_experts"`
    // Excerpt: fields for the other parameters named in the Description are not shown here.
}

func (p *deepseek2Model) KV(t *Tokenizer) KV
func (p *deepseek2Model) Replacements() []string
func (p *deepseek2Model) Tensors(s []Tensor) (out []*ggml.Tensor)
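To make the role of the JSON tags concrete, here is a small sketch that decodes a few of them from a config fragment. The mlaConfig type is a stand-in defined for this example (deepseek2Model itself is unexported and embeds ModelParameters), the qkHeadDim helper is hypothetical, and the numbers in the sample are illustrative rather than taken from any particular checkpoint. In MLA the per-head query/key width is the sum of the nope and rope head dimensions, which the helper computes.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mlaConfig mirrors a subset of the JSON tags from the signature above.
type mlaConfig struct {
	HiddenLayers  uint32 `json:"num_hidden_layers"`
	QKNopeHeadDim uint32 `json:"qk_nope_head_dim"`
	QKRopeHeadDim uint32 `json:"qk_rope_head_dim"`
	KVLoraRank    uint32 `json:"kv_lora_rank"`
	ExpertCount   uint32 `json:"n_routed_experts"`
}

// parseMLAConfig decodes the fields above; unknown keys are ignored.
func parseMLAConfig(data []byte) (mlaConfig, error) {
	var c mlaConfig
	err := json.Unmarshal(data, &c)
	return c, err
}

// qkHeadDim returns the per-head query/key width: the non-positional (nope)
// part plus the rotary (rope) part.
func (c mlaConfig) qkHeadDim() uint32 {
	return c.QKNopeHeadDim + c.QKRopeHeadDim
}

func main() {
	// Illustrative values only.
	sample := []byte(`{
		"num_hidden_layers": 27,
		"qk_nope_head_dim": 128,
		"qk_rope_head_dim": 64,
		"kv_lora_rank": 512,
		"n_routed_experts": 64
	}`)
	c, err := parseMLAConfig(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("layers:", c.HiddenLayers, "qk head dim:", c.qkHeadDim(), "kv lora rank:", c.KVLoraRank)
}
```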

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name	Type	Required	Description
t	*Tokenizer	Yes	Tokenizer data for GGUF metadata
s	[]Tensor	Yes	Source tensors including per-expert weights to merge

Outputs

Name	Type	Description
KV	KV	GGUF metadata with deepseek2.* keys for MLA, MoE, and RoPE parameters
out	[]*ggml.Tensor	Converted tensors with merged expert weights

Usage Examples

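// Converter registered for DeepSeek V2/V3 architectures.
// deepseek2Model is unexported, so this sketch applies inside package convert.
// m := &deepseek2Model{}
// if err := json.Unmarshal(configData, m); err != nil { return err }
// kv := m.KV(tokenizer)           // GGUF metadata
// tensors := m.Tensors(sourceTensors) // converted tensors with merged experts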

Related Pages

Principle:Ollama_Ollama_GGUF_Model_Conversion_DeepSeek2
