Implementation:Ollama Ollama Convert Mistral

From Leeroopedia
Knowledge Sources
Domains: Model Conversion, GGUF Format
Last Updated: 2025-02-15 00:00 GMT

Overview

Implements the GGUF model converter for the Mistral 3 multimodal (conditional generation) architecture. It covers the text model, the vision encoder, the multimodal projector, and advanced RoPE scaling parameters.

Description

The mistral3Model struct implements ModelConverter. Its KV method emits metadata for three groups of settings:

  • Text configuration, including advanced RoPE parameters (mscale, mscale_all_dim, beta_fast, beta_slow, llama4_scaling_beta) and YaRN-style scaling.
  • Vision configuration: per-head dimension, RoPE theta, patch and image sizes, and number of channels.
  • Multimodal configuration: image token index, spatial merge size, projector bias, and projector hidden activation.

The Tensors method repacks Q/K weights (interleaved head reordering) for attention tensors outside the vision encoder. The Replacements method strips the language_model.model.* namespace prefix and maps the vision and multimodal projector tensor paths.
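The interleaved head reordering applied to Q/K weights can be illustrated with a minimal sketch. It assumes the common per-head permutation in which the first and second halves of each head's rows are interleaved. The helper name repackRows and its parameters are hypothetical and are not taken from convert_mistral.go; the actual repack method may differ in details such as an additional transpose.

```go
package main

import (
	"errors"
	"fmt"
)

// repackRows is a hypothetical helper, not the Ollama implementation.
// It treats data as a row-major [rows x cols] matrix whose rows are grouped
// into `heads` equal blocks. Within each block, the first and second halves
// of the rows are interleaved: output row 2*i+j takes input row j*half+i.
func repackRows(data []float32, rows, cols, heads int) ([]float32, error) {
	if rows <= 0 || cols <= 0 || heads <= 0 {
		return nil, errors.New("rows, cols, and heads must be positive")
	}
	if len(data) != rows*cols {
		return nil, fmt.Errorf("got %d values, want %d", len(data), rows*cols)
	}
	if rows%(2*heads) != 0 {
		return nil, fmt.Errorf("rows (%d) must be divisible by 2*heads (%d)", rows, 2*heads)
	}

	perHead := rows / heads
	half := perHead / 2
	out := make([]float32, len(data))
	for h := 0; h < heads; h++ {
		for i := 0; i < half; i++ {
			for j := 0; j < 2; j++ {
				src := h*perHead + j*half + i // row in the half-split layout
				dst := h*perHead + 2*i + j    // row in the interleaved layout
				copy(out[dst*cols:(dst+1)*cols], data[src*cols:(src+1)*cols])
			}
		}
	}
	return out, nil
}

func main() {
	// 8 rows, 1 column, 2 heads: each head owns 4 consecutive rows.
	in := []float32{0, 1, 2, 3, 4, 5, 6, 7}
	out, err := repackRows(in, 8, 1, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // [0 2 1 3 4 6 5 7]
}
```

Each head is permuted independently, so rows never move between heads. In this sketch a caller would skip the function entirely for vision tensors, mirroring the rule that only non-vision attention tensors are repacked.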

Usage

The converter is invoked automatically when the model's architecture matches Mistral3ForConditionalGeneration.

Code Reference

Source Location

  • Repository: Ollama
  • File: convert/convert_mistral.go
  • Lines: 1-221

Signature

type mistral3Model struct {
    ModelParameters
    ImageTokenIndex    uint32 `json:"image_token_index"`
    SpatialMergeSize   uint32 `json:"spatial_merge_size"`
    TextModel          struct {
        NumHiddenLayers   uint32  `json:"num_hidden_layers"`
        HiddenSize        uint32  `json:"hidden_size"`
        NumAttentionHeads uint32  `json:"num_attention_heads"`
        RopeParameters    struct { ... } `json:"rope_parameters"`
    } `json:"text_config"`
    VisionModel struct { ... } `json:"vision_config"`
}

func (p *mistral3Model) KV(t *Tokenizer) KV
func (p *mistral3Model) Tensors(ts []Tensor) []*ggml.Tensor
func (p *mistral3Model) Replacements() []string
func (p *mistral3Model) repack(name string, data []float32, shape []uint64) ([]float32, error)
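The JSON tags in the signature show which configuration fields the converter reads. As a rough sketch, the following decodes a config payload into a trimmed stand-in struct. The type configSketch, the function parseConfig, and all sample values are hypothetical; the elided rope_parameters and vision_config structs are deliberately left out rather than guessed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// configSketch is a hypothetical, trimmed stand-in for mistral3Model. It keeps
// only the fields whose JSON tags appear in the signature; the elided
// rope_parameters and vision_config structs are omitted.
type configSketch struct {
	ImageTokenIndex  uint32 `json:"image_token_index"`
	SpatialMergeSize uint32 `json:"spatial_merge_size"`
	TextModel        struct {
		NumHiddenLayers   uint32 `json:"num_hidden_layers"`
		HiddenSize        uint32 `json:"hidden_size"`
		NumAttentionHeads uint32 `json:"num_attention_heads"`
	} `json:"text_config"`
}

// parseConfig decodes a JSON configuration payload into configSketch.
// Fields that are absent from the payload keep their zero values, and
// unknown fields are ignored by encoding/json.
func parseConfig(raw []byte) (configSketch, error) {
	var c configSketch
	err := json.Unmarshal(raw, &c)
	return c, err
}

func main() {
	// Illustrative values only; not taken from a real checkpoint.
	raw := []byte(`{
		"image_token_index": 10,
		"spatial_merge_size": 2,
		"text_config": {
			"num_hidden_layers": 4,
			"hidden_size": 64,
			"num_attention_heads": 8
		}
	}`)
	c, err := parseConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(c.ImageTokenIndex, c.TextModel.NumHiddenLayers, c.TextModel.HiddenSize)
}
```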

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name | Type | Required | Description
t | *Tokenizer | Yes | Tokenizer data for GGUF metadata
ts | []Tensor | Yes | Source tensors from the text model, vision encoder, and multimodal projector

Outputs

Name | Type | Description
KV | KV | GGUF metadata with mistral3.* keys for text, vision, and multimodal config
Tensors | []*ggml.Tensor | Converted tensors with repacked Q/K attention weights

Usage Examples

// Converter registered for Mistral3ForConditionalGeneration
// Q/K weights in non-vision layers are repacked with interleaved head reordering
// Vision tensors pass through unchanged
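The prefix stripping performed by Replacements can be sketched as follows, assuming the returned []string is a flat list of old/new pairs fed to strings.NewReplacer. Only the language_model.model. prefix comes from the description; the function names and every other pair below are made-up placeholders, not the converter's real mapping.

```go
package main

import (
	"fmt"
	"strings"
)

// sketchReplacements returns old/new pairs in a flat slice. Only the
// "language_model.model." prefix is named in the description; the remaining
// pairs are hypothetical placeholders for illustration.
func sketchReplacements() []string {
	return []string{
		"language_model.model.", "",
		"self_attn.q_proj", "attn_q",
		"self_attn.k_proj", "attn_k",
	}
}

// renameTensor applies all pairs in a single pass over the tensor name.
func renameTensor(name string) string {
	return strings.NewReplacer(sketchReplacements()...).Replace(name)
}

func main() {
	fmt.Println(renameTensor("language_model.model.layers.0.self_attn.q_proj.weight"))
	// prints: layers.0.attn_q.weight
}
```

Names that match none of the pairs are returned unchanged, so in this sketch a tensor outside the language model namespace keeps its original path unless a pair is added for it.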
