Implementation:Ollama Ollama Convert GlmOcr
| Knowledge Sources | |
|---|---|
| Domains | Model Conversion, GGUF Format |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the GGUF model converter for the GLM-OCR multimodal architecture, handling tensor name mapping, Q/K weight reordering for M-RoPE compatibility, and vision encoder configuration.
Description
The glmOcrModel struct implements the ModelConverter and moreParser interfaces to convert GLM-OCR models from HuggingFace SafeTensors format to GGUF. The helper normalToNeoXRepacker takes the head count, head dimension, and partial rotary factor and returns a repacker function that permutes the rotary dimensions of Q/K weights from interleaved (LLaMA-style) ordering to NeoX ordering, so that the weights are compatible with GGML's M-RoPE kernel. The converter handles both the language model (including MoE expert merging) and the SigLIP-based vision encoder, mapping text, vision, and multimodal projector tensor names to their GGUF equivalents. The parseMore method reads additional vision configuration from preprocessor_config.json.
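The interleaved-to-NeoX permutation described above can be sketched as follows. This is a minimal illustration, not the code in convert_glmocr.go: the function name interleavedToNeoX, the row-major [nHeads*headDim, cols] layout, and the explicit rotaryDim parameter (standing in for headDim scaled by the partial rotary factor) are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// interleavedToNeoX is a hypothetical helper. It reorders the rows of a
// row-major [nHeads*headDim, cols] matrix. Within each head, the first
// rotaryDim rows are rearranged so that even-indexed rows come first and
// odd-indexed rows follow; rows past rotaryDim are copied unchanged.
func interleavedToNeoX(data []float32, nHeads, headDim, rotaryDim, cols int) ([]float32, error) {
	if rotaryDim%2 != 0 || rotaryDim > headDim {
		return nil, errors.New("rotaryDim must be even and no larger than headDim")
	}
	if len(data) != nHeads*headDim*cols {
		return nil, errors.New("data length does not match the given shape")
	}

	out := make([]float32, len(data))
	half := rotaryDim / 2
	for h := 0; h < nHeads; h++ {
		base := h * headDim
		for r := 0; r < headDim; r++ {
			src := r // non-rotary rows stay in place
			if r < half {
				src = 2 * r // first half takes the even rows
			} else if r < rotaryDim {
				src = 2*(r-half) + 1 // second half takes the odd rows
			}
			copy(out[(base+r)*cols:(base+r+1)*cols], data[(base+src)*cols:(base+src+1)*cols])
		}
	}
	return out, nil
}

func main() {
	// One head, headDim 6, rotaryDim 4, one column; each row holds its own index.
	in := []float32{0, 1, 2, 3, 4, 5}
	out, err := interleavedToNeoX(in, 1, 6, 4, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints [0 2 1 3 4 5]
}
```

After this reordering, the interleaved rotary pair (2j, 2j+1) sits at positions (j, j+rotaryDim/2), which is the pairing a NeoX-style kernel rotates. Applying the same permutation to both Q and K leaves their dot product unchanged.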
Usage
Invoked automatically by the conversion pipeline when the model's architectures field in config.json matches GlmOcrForConditionalGeneration.
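The architecture-based selection can be pictured with the sketch below. It is an illustration only: converterFor is a hypothetical function that returns a label rather than a real converter, and the actual dispatch code in the convert package may be organised differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// converterFor is a hypothetical stand-in for the pipeline's dispatch step.
// It reads the "architectures" field of config.json and names the converter
// that would handle the model.
func converterFor(configJSON []byte) (string, error) {
	var cfg struct {
		Architectures []string `json:"architectures"`
	}
	if err := json.Unmarshal(configJSON, &cfg); err != nil {
		return "", err
	}
	if len(cfg.Architectures) == 0 {
		return "", fmt.Errorf("config.json lists no architectures")
	}

	switch cfg.Architectures[0] {
	case "GlmOcrForConditionalGeneration":
		return "glmOcrModel", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", cfg.Architectures[0])
	}
}

func main() {
	name, err := converterFor([]byte(`{"architectures": ["GlmOcrForConditionalGeneration"]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // prints glmOcrModel
}
```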
Code Reference
Source Location
- Repository: Ollama
- File: convert/convert_glmocr.go
- Lines: 1-455
Signature
type glmOcrModel struct {
    ModelParameters
    TextModel   struct { ... } `json:"text_config"`
    VisionModel struct { ... } `json:"vision_config"`
}
func normalToNeoXRepacker(nHeads, headDim int, partialRotaryFactor float32) func(string, []float32, []uint64) ([]float32, error)
func (m *glmOcrModel) KV(t *Tokenizer) KV
func (m *glmOcrModel) Tensors(ts []Tensor) []*ggml.Tensor
func (m *glmOcrModel) Replacements() []string
func (m *glmOcrModel) parseMore(fsys fs.FS) error
Import
import "github.com/ollama/ollama/convert"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| t | *Tokenizer | Yes | Tokenizer data for embedding in GGUF metadata |
| ts | []Tensor | Yes | Source model tensors to convert |
| fsys | fs.FS | Yes | Filesystem for reading additional config files |
Outputs
| Name | Type | Description |
|---|---|---|
| KV | KV | GGUF key-value metadata including text, vision, and MoE parameters |
| Tensors | []*ggml.Tensor | Converted GGUF tensors with merged experts and repacked Q/K weights |
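Tensor names in the output follow GGUF conventions. One plausible way to apply a flat list of old/new pairs, such as the []string returned by Replacements, is strings.NewReplacer. The sketch below assumes that usage, and the specific name pairs are hypothetical examples rather than the converter's actual mapping.

```go
package main

import (
	"fmt"
	"strings"
)

// renameTensor applies a flat list of old/new pairs to a tensor name.
// pairs must have even length, as strings.NewReplacer requires.
func renameTensor(name string, pairs []string) string {
	return strings.NewReplacer(pairs...).Replace(name)
}

func main() {
	// Hypothetical pairs for illustration only.
	pairs := []string{
		"model.language_model.layers", "blk",
		"self_attn.q_proj", "attn_q",
	}
	fmt.Println(renameTensor("model.language_model.layers.0.self_attn.q_proj.weight", pairs))
	// prints blk.0.attn_q.weight
}
```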
Usage Examples
// Invoked automatically during model conversion; the converter is registered
// for the "GlmOcrForConditionalGeneration" architecture.
// A call such as convert.ConvertModel(fsys, outputFile) roughly triggers:
//   m := &glmOcrModel{}
//   json.Unmarshal(configData, m)        // populate from config.json
//   m.parseMore(fsys)                    // read preprocessor_config.json
//   kv := m.KV(tokenizer)                // build GGUF metadata
//   tensors := m.Tensors(sourceTensors)  // convert and repack tensors