Implementation:Ollama Ollama Convert DeepSeek2
| Knowledge Sources | |
|---|---|
| Domains | Model Conversion, GGUF Format |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the GGUF model converter for the DeepSeek V2/V3 architecture, handling Multi-head Latent Attention (MLA) parameters, MoE expert tensor merging, and YaRN RoPE scaling.
Description
The deepseek2Model struct implements ModelConverter. Its KV metadata covers MLA-specific parameters (QK nope/rope head dimensions, KV LoRA rank, Q LoRA rank, V head dimension), MoE parameters (expert count, shared expert count, and a gating function that may be softmax or sigmoid), YaRN RoPE scaling with mscale, and the leading dense block count, which marks where the layer stack switches from dense to MoE feed-forward blocks. The Tensors method merges per-expert weight tensors (mlp.experts.*.gate_proj, etc.) into consolidated tensors (ffn_gate_exps, etc.) and skips layers whose index falls beyond the hidden layer count (e.g., Multi-Token Prediction layers).
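The merging and layer-skipping behaviour described above can be sketched as follows. This is a minimal, stand-alone illustration and not the code in convert_deepseek2.go: the function planMerge, the Hugging Face-style input names, the "blk.N." output prefix, and the ffn_up_exps/ffn_down_exps names (inferred from "ffn_gate_exps, etc.") are assumptions. It only plans which source tensors feed each merged tensor; it does not stack any tensor data.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strconv"
)

// expertRe matches names such as "model.layers.3.mlp.experts.7.gate_proj.weight".
// The naming scheme is an assumption made for this sketch.
var expertRe = regexp.MustCompile(`^model\.layers\.(\d+)\.mlp\.experts\.(\d+)\.(gate_proj|up_proj|down_proj)\.weight$`)

// layerRe extracts the layer index from any "model.layers.N." name.
var layerRe = regexp.MustCompile(`^model\.layers\.(\d+)\.`)

// mergedName maps a per-expert projection to its consolidated tensor name.
var mergedName = map[string]string{
	"gate_proj": "ffn_gate_exps",
	"up_proj":   "ffn_up_exps",
	"down_proj": "ffn_down_exps",
}

// expertRef pairs a source tensor name with its expert index for sorting.
type expertRef struct {
	index int
	name  string
}

// planMerge groups per-expert tensor names by their merged target, ordered by
// expert index, and returns all other in-range tensor names unchanged.
// Tensors in layers with index >= hiddenLayers are dropped entirely.
func planMerge(names []string, hiddenLayers int) (map[string][]string, []string) {
	groups := map[string][]expertRef{}
	var passthrough []string
	for _, name := range names {
		if m := layerRe.FindStringSubmatch(name); m != nil {
			// The regexp guarantees digits, so the error is ignored here.
			layer, _ := strconv.Atoi(m[1])
			if layer >= hiddenLayers {
				continue // e.g. Multi-Token Prediction layers
			}
		}
		m := expertRe.FindStringSubmatch(name)
		if m == nil {
			passthrough = append(passthrough, name)
			continue
		}
		expert, _ := strconv.Atoi(m[2])
		key := fmt.Sprintf("blk.%s.%s.weight", m[1], mergedName[m[3]])
		groups[key] = append(groups[key], expertRef{index: expert, name: name})
	}
	merged := map[string][]string{}
	for key, refs := range groups {
		sort.Slice(refs, func(i, j int) bool { return refs[i].index < refs[j].index })
		for _, r := range refs {
			merged[key] = append(merged[key], r.name)
		}
	}
	return merged, passthrough
}

func main() {
	names := []string{
		"model.layers.1.mlp.experts.1.gate_proj.weight",
		"model.layers.1.mlp.experts.0.gate_proj.weight",
		"model.layers.1.self_attn.o_proj.weight",
		"model.layers.2.mlp.experts.0.gate_proj.weight", // layer 2 is out of range below
	}
	merged, rest := planMerge(names, 2)
	fmt.Println(merged["blk.1.ffn_gate_exps.weight"])
	fmt.Println(rest)
}
```

Sorting by expert index before merging keeps the expert dimension of the consolidated tensor in a deterministic order regardless of the order in which source tensors are read.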
Usage
Invoked automatically when the model's architecture matches DeepseekV2ForCausalLM or DeepseekV3ForCausalLM.
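As a rough illustration of this dispatch, the sketch below selects a converter by architecture name. It assumes the architecture is read from the first entry of the "architectures" list in a Hugging Face config.json; the function converterFor and its string return value are hypothetical and stand in for whatever the convert package actually does.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// config mirrors only the "architectures" list of a Hugging Face config.json.
type config struct {
	Architectures []string `json:"architectures"`
}

// converterFor returns the name of the converter that would handle the model.
// It is a hypothetical helper for this sketch, not part of the Ollama API.
func converterFor(configJSON []byte) (string, error) {
	var c config
	if err := json.Unmarshal(configJSON, &c); err != nil {
		return "", err
	}
	if len(c.Architectures) == 0 {
		return "", errors.New("config lists no architectures")
	}
	switch c.Architectures[0] {
	case "DeepseekV2ForCausalLM", "DeepseekV3ForCausalLM":
		return "deepseek2Model", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", c.Architectures[0])
	}
}

func main() {
	name, err := converterFor([]byte(`{"architectures": ["DeepseekV3ForCausalLM"]}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```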
Code Reference
Source Location
- Repository: Ollama
- File: convert/convert_deepseek2.go
- Lines: 1-173
Signature
type deepseek2Model struct {
	ModelParameters
	HiddenSize        uint32 `json:"hidden_size"`
	HiddenLayers      uint32 `json:"num_hidden_layers"`
	QKNopeHeadDim     uint32 `json:"qk_nope_head_dim"`
	QKRopeHeadDim     uint32 `json:"qk_rope_head_dim"`
	KVLoraRank        uint32 `json:"kv_lora_rank"`
	ExpertCount       uint32 `json:"n_routed_experts"`
	ExpertSharedCount uint32 `json:"n_shared_experts"`
}
func (p *deepseek2Model) KV(t *Tokenizer) KV
func (p *deepseek2Model) Replacements() []string
func (p *deepseek2Model) Tensors(s []Tensor) (out []*ggml.Tensor)
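To show how the JSON tags in the signature above bind configuration values to struct fields, here is a small stand-alone sketch. The type deepseek2Config is a hypothetical copy of the listed fields with the embedded ModelParameters left out so that it compiles on its own, and the sample values are illustrative only. The qkHeadDim helper is also an addition for this example: it reflects the MLA convention that each query/key head combines a non-positional part and a RoPE part.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deepseek2Config copies the fields shown in the signature; ModelParameters
// is omitted so the sketch is self-contained.
type deepseek2Config struct {
	HiddenSize        uint32 `json:"hidden_size"`
	HiddenLayers      uint32 `json:"num_hidden_layers"`
	QKNopeHeadDim     uint32 `json:"qk_nope_head_dim"`
	QKRopeHeadDim     uint32 `json:"qk_rope_head_dim"`
	KVLoraRank        uint32 `json:"kv_lora_rank"`
	ExpertCount       uint32 `json:"n_routed_experts"`
	ExpertSharedCount uint32 `json:"n_shared_experts"`
}

// parseConfig decodes a config.json payload using the struct's JSON tags.
func parseConfig(data []byte) (deepseek2Config, error) {
	var c deepseek2Config
	err := json.Unmarshal(data, &c)
	return c, err
}

// qkHeadDim returns the combined query/key head dimension under MLA.
func (c deepseek2Config) qkHeadDim() uint32 {
	return c.QKNopeHeadDim + c.QKRopeHeadDim
}

func main() {
	// Illustrative values, not taken from any specific checkpoint.
	sample := []byte(`{
		"hidden_size": 2048,
		"num_hidden_layers": 27,
		"qk_nope_head_dim": 128,
		"qk_rope_head_dim": 64,
		"kv_lora_rank": 512,
		"n_routed_experts": 64,
		"n_shared_experts": 2
	}`)
	c, err := parseConfig(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("layers:", c.HiddenLayers, "experts:", c.ExpertCount, "qk head dim:", c.qkHeadDim())
}
```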
Import
import "github.com/ollama/ollama/convert"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| t | *Tokenizer | Yes | Tokenizer data for GGUF metadata (argument to KV) |
| s | []Tensor | Yes | Source tensors, including per-expert weights to merge (argument to Tensors) |
Outputs
| Name | Type | Description |
|---|---|---|
| KV | KV | GGUF metadata with deepseek2.* keys for MLA, MoE, and RoPE parameters |
| out | []*ggml.Tensor | Converted tensors with merged expert weights |
Usage Examples
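// Converter registered for DeepSeek V2/V3 architectures.
// deepseek2Model is unexported, so this sketch only applies inside the convert package.
// m := &deepseek2Model{}
// if err := json.Unmarshal(configData, m); err != nil {
//     return err
// }
// kv := m.KV(tokenizer)                 // GGUF metadata (deepseek2.* keys)
// tensors := m.Tensors(sourceTensors)   // tensors with merged expert weights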