Implementation:Ollama Ollama Convert Qwen3
| Knowledge Sources | |
|---|---|
| Domains | Model Conversion, GGUF Format |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the GGUF model converter for the Qwen3 architecture, supporting both dense and MoE variants with dynamic architecture naming, fused expert tensor splitting, and multiple RoPE scaling modes.
Description
The qwen3Model struct implements ModelConverter. Its architecture name is chosen dynamically: qwen3 for dense models and qwen3moe for MoE models, where a model counts as MoE when NumExperts > 0. KV metadata is written with unprefixed keys, which the accessor prefixes with the architecture name automatically. The converter supports QK normalization, head dimension configuration, and MoE parameters (expert count, experts per token, and norm top-k probability). The Tensors method handles fused gate_up_exps expert tensors by splitting them into separate gate and up tensors using splitDim with transposition, and it transposes the dimensions of down_exps. The yarn and mrope RoPE scaling modes are supported. qwen3Model also serves as the base struct for qwen3VLModel.
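The naming rule and the key prefixing described above can be sketched as follows. This is a minimal illustration, not the converter's code: `archName` and `prefixKeys` are hypothetical helpers, and the assumption that keys starting with `general.` or `tokenizer.` are left unprefixed is mine rather than something this page states.

```go
package main

import (
	"fmt"
	"strings"
)

// archName applies the rule from the description: a model is treated as
// MoE, and named "qwen3moe", when it has at least one expert.
func archName(numExperts uint32) string {
	if numExperts > 0 {
		return "qwen3moe"
	}
	return "qwen3"
}

// prefixKeys is a hypothetical stand-in for the accessor's auto-prefixing.
// Assumption: keys under "general." and "tokenizer." are global and stay
// as they are; every other key gets "<arch>." in front.
func prefixKeys(arch string, kv map[string]any) map[string]any {
	out := make(map[string]any, len(kv))
	for k, v := range kv {
		if strings.HasPrefix(k, "general.") || strings.HasPrefix(k, "tokenizer.") {
			out[k] = v
			continue
		}
		out[arch+"."+k] = v
	}
	return out
}

func main() {
	arch := archName(128) // illustrative expert count
	kv := prefixKeys(arch, map[string]any{
		"general.architecture": arch,
		"block_count":          uint32(48), // illustrative value
	})
	fmt.Println(arch, kv)
}
```

Writing keys without a prefix lets one KV method serve both variants, since the prefix follows from whichever architecture name was selected.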
Usage
Invoked automatically when the model's architecture matches Qwen3ForCausalLM or Qwen3MoeForCausalLM.
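As a rough sketch of what "matches" means here: Hugging Face checkpoints list their model class under the `architectures` field of config.json, and a converter can be chosen by switching on that string. The `hfConfig` type and `converterFor` function below are hypothetical names for illustration; the actual dispatch code in the repository is not reproduced on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hfConfig holds only the field needed for dispatch (hypothetical type).
type hfConfig struct {
	Architectures []string `json:"architectures"`
}

// converterFor returns the name of the converter that would handle the
// given config.json contents, using the two architecture names listed above.
func converterFor(configJSON []byte) (string, error) {
	var c hfConfig
	if err := json.Unmarshal(configJSON, &c); err != nil {
		return "", err
	}
	if len(c.Architectures) == 0 {
		return "", fmt.Errorf("config has no architectures entry")
	}
	switch c.Architectures[0] {
	case "Qwen3ForCausalLM", "Qwen3MoeForCausalLM":
		return "qwen3Model", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", c.Architectures[0])
	}
}

func main() {
	name, err := converterFor([]byte(`{"architectures": ["Qwen3MoeForCausalLM"]}`))
	fmt.Println(name, err)
}
```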
Code Reference
Source Location
- Repository: Ollama
- File: convert/convert_qwen3.go
- Lines: 1-157
Signature
type qwen3Model struct {
	ModelParameters
	MaxPositionEmbeddings uint32  `json:"max_position_embeddings"`
	HiddenSize            uint32  `json:"hidden_size"`
	HiddenLayers          uint32  `json:"num_hidden_layers"`
	HeadDim               uint32  `json:"head_dim"`
	NumExperts            uint32  `json:"num_experts"`
	NumExpertsPerToken    uint32  `json:"num_experts_per_tok"`
	RopeTheta             float32 `json:"rope_theta"`
	RMSNormEPS            float32 `json:"rms_norm_eps"`
}
func (q *qwen3Model) KV(t *Tokenizer) KV
func (q *qwen3Model) Tensors(ts []Tensor) []*ggml.Tensor
func (q *qwen3Model) Replacements() []string
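The JSON tags in the struct above determine how fields are read from a checkpoint's config.json. The sketch below decodes into a standalone mirror struct, `qwen3Config`, which is a hypothetical name covering only some of the fields shown; the config values are made up for illustration. It also shows why a dense model ends up with NumExperts == 0: a field missing from the JSON keeps Go's zero value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// qwen3Config mirrors a subset of the fields listed in the signature.
// It is a hypothetical stand-in, not the converter's own type.
type qwen3Config struct {
	HiddenSize         uint32  `json:"hidden_size"`
	HiddenLayers       uint32  `json:"num_hidden_layers"`
	HeadDim            uint32  `json:"head_dim"`
	NumExperts         uint32  `json:"num_experts"`
	NumExpertsPerToken uint32  `json:"num_experts_per_tok"`
	RopeTheta          float32 `json:"rope_theta"`
	RMSNormEPS         float32 `json:"rms_norm_eps"`
}

// parseConfig decodes config.json bytes into qwen3Config.
func parseConfig(data []byte) (qwen3Config, error) {
	var c qwen3Config
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	// Illustrative dense config: num_experts is absent, so it stays 0.
	dense, err := parseConfig([]byte(`{"hidden_size": 1024, "num_hidden_layers": 28, "head_dim": 128}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("dense experts:", dense.NumExperts)

	// Illustrative MoE config.
	moe, err := parseConfig([]byte(`{"hidden_size": 2048, "num_experts": 128, "num_experts_per_tok": 8}`))
	if err != nil {
		panic(err)
	}
	fmt.Println("moe experts:", moe.NumExperts, "per token:", moe.NumExpertsPerToken)
}
```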
Import
import "github.com/ollama/ollama/convert"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| t | *Tokenizer | Yes | Tokenizer data for GGUF metadata |
| ts | []Tensor | Yes | Source tensors including fused gate-up expert tensors |
Outputs
| Name | Type | Description |
|---|---|---|
| KV | KV | GGUF metadata with qwen3/qwen3moe architecture and unprefixed keys |
| Tensors | []*ggml.Tensor | Converted tensors with split gate/up experts and transposed down experts |
Usage Examples
// Converter registered for Qwen3 (dense and MoE)
// Architecture is "qwen3" for dense, "qwen3moe" for MoE
// ffn_gate_up_exps is split into ffn_gate_exps + ffn_up_exps with transpose
// ffn_down_exps dimensions are transposed [E, I, H] -> [E, H, I]
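To make the two tensor transforms concrete, here is a minimal sketch over flat row-major slices. It does not use the converter's splitDim helper. It assumes the fused tensor has shape [E, H, 2*I] with the gate half occupying the first I columns of the last axis; that layout, along with the names `splitGateUp` and `transposeLast2`, is an assumption for illustration only.

```go
package main

import "fmt"

// splitGateUp takes a fused row-major tensor of assumed shape [E, H, 2*I]
// and returns gate and up tensors, each transposed to shape [E, I, H].
func splitGateUp(fused []float32, e, h, i int) (gate, up []float32) {
	gate = make([]float32, e*i*h)
	up = make([]float32, e*i*h)
	for x := 0; x < e; x++ {
		for r := 0; r < h; r++ {
			for c := 0; c < i; c++ {
				src := (x*h+r)*2*i + c // position of the gate value in the fused tensor
				dst := (x*i+c)*h + r   // transposed position in the output
				gate[dst] = fused[src]
				up[dst] = fused[src+i] // the up half starts I columns later
			}
		}
	}
	return gate, up
}

// transposeLast2 swaps the last two axes of a row-major [E, R, C] tensor,
// producing [E, C, R]. For down experts this maps [E, I, H] to [E, H, I].
func transposeLast2(data []float32, e, r, c int) []float32 {
	out := make([]float32, len(data))
	for x := 0; x < e; x++ {
		for rr := 0; rr < r; rr++ {
			for cc := 0; cc < c; cc++ {
				out[(x*c+cc)*r+rr] = data[(x*r+rr)*c+cc]
			}
		}
	}
	return out
}

func main() {
	// One expert, H=2, I=2: each row holds [gate0 gate1 up0 up1].
	fused := []float32{1, 2, 3, 4, 5, 6, 7, 8}
	gate, up := splitGateUp(fused, 1, 2, 2)
	fmt.Println(gate, up)

	// One expert, down tensor of shape [1, 2, 3].
	down := []float32{1, 2, 3, 4, 5, 6}
	fmt.Println(transposeLast2(down, 1, 2, 3))
}
```

A real converter would usually apply such transforms lazily while writing the tensor data rather than materializing copies; the eager version here only keeps the index arithmetic easy to follow.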