Implementation:Ollama Ollama Convert Mistral Causal

Knowledge Sources	Ollama
Domains	Model Conversion, GGUF Format
Last Updated	2025-02-15 00:00 GMT

Overview

Implements the GGUF model converter for the Mistral 3 causal-only (text-only) architecture, handling the same advanced RoPE scaling as the multimodal variant but without vision encoder configuration.

Description

The mistral3CausalModel struct implements ModelConverter for the text-only variant of Mistral 3 (Mistral3ForCausalLM). It emits the same KV metadata structure as the multimodal mistral3Model, but reads its parameters from the top level of the model configuration rather than from a nested text_config object. It supports the same advanced RoPE parameters (mscale, mscale_all_dim, beta_fast, beta_slow, llama4_scaling_beta). The Tensors method applies the same Q/K weight repacking (interleaved head-dimension reordering) as the multimodal converter. The Replacements method uses a simpler tensor-name mapping because the source tensor names carry no language_model prefix.
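The difference between the top-level and nested layouts can be illustrated with a minimal sketch using encoding/json. The type and function names here (TextParams, causalConfig, multimodalConfig, decodeCausal, decodeMultimodal) are hypothetical and only reuse a few JSON tags from the struct signature; this is not the converter's actual parsing code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TextParams is a hypothetical subset of the text-model fields.
type TextParams struct {
	NumHiddenLayers uint32 `json:"num_hidden_layers"`
	HiddenSize      uint32 `json:"hidden_size"`
	HeadDim         uint32 `json:"head_dim"`
}

// causalConfig models the text-only layout: fields sit at the top level.
type causalConfig struct {
	TextParams
}

// multimodalConfig models the nested layout: fields sit under text_config.
type multimodalConfig struct {
	TextConfig TextParams `json:"text_config"`
}

// decodeCausal reads the parameters from the top level of the JSON document.
func decodeCausal(data []byte) (TextParams, error) {
	var c causalConfig
	err := json.Unmarshal(data, &c)
	return c.TextParams, err
}

// decodeMultimodal reads the same parameters from the text_config object.
func decodeMultimodal(data []byte) (TextParams, error) {
	var c multimodalConfig
	err := json.Unmarshal(data, &c)
	return c.TextConfig, err
}

func main() {
	flat := []byte(`{"num_hidden_layers": 40, "hidden_size": 5120, "head_dim": 128}`)
	nested := []byte(`{"text_config": {"num_hidden_layers": 40, "hidden_size": 5120, "head_dim": 128}}`)

	a, _ := decodeCausal(flat)
	b, _ := decodeMultimodal(nested)
	fmt.Println(a == b) // both layouts yield the same parameter values
}
```

The sample values are made up for illustration; only the shape of the two documents matters.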

Usage

Invoked automatically when the model's architecture matches Mistral3ForCausalLM.
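As a rough sketch of what architecture-based selection can look like, the following switches on the architecture string. The converter interface, the causalConverter type, and the converterFor function are all hypothetical stand-ins; the real dispatch lives in the convert package and may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// converter is a hypothetical stand-in for the ModelConverter interface.
type converter interface {
	Describe() string
}

// causalConverter is a hypothetical stand-in for mistral3CausalModel.
type causalConverter struct{}

func (causalConverter) Describe() string { return "Mistral 3 text-only converter" }

// converterFor picks a converter from the first declared architecture name.
// Only the architecture named in this page is handled in this sketch.
func converterFor(architectures []string) (converter, error) {
	if len(architectures) == 0 {
		return nil, errors.New("no architecture declared")
	}
	switch architectures[0] {
	case "Mistral3ForCausalLM":
		return causalConverter{}, nil
	default:
		return nil, fmt.Errorf("unsupported architecture %q", architectures[0])
	}
}

func main() {
	c, err := converterFor([]string{"Mistral3ForCausalLM"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.Describe())
}
```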

Code Reference

Source Location

Repository: Ollama
File: convert/convert_mistral_causal.go
Lines: 1-181

Signature

type mistral3CausalModel struct {
    ModelParameters
    NumHiddenLayers       uint32  `json:"num_hidden_layers"`
    MaxPositionEmbeddings uint32  `json:"max_position_embeddings"`
    HiddenSize            uint32  `json:"hidden_size"`
    NumAttentionHeads     uint32  `json:"num_attention_heads"`
    NumKeyValueHeads      uint32  `json:"num_key_value_heads"`
    HeadDim               uint32  `json:"head_dim"`
    RopeParameters        struct { ... } `json:"rope_parameters"`
}

func (p *mistral3CausalModel) KV(t *Tokenizer) KV
func (p *mistral3CausalModel) Tensors(ts []Tensor) []*ggml.Tensor
func (p *mistral3CausalModel) Replacements() []string
func (p *mistral3CausalModel) repack(name string, data []float32, shape []uint64) ([]float32, error)

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name	Type	Required	Description
t	*Tokenizer	Yes	Tokenizer data for GGUF metadata
ts	[]Tensor	Yes	Source tensors from the text-only Mistral model

Outputs

Name	Type	Description
KV	KV	GGUF metadata with mistral3.* keys for text model and RoPE scaling
Tensors	[]*ggml.Tensor	Converted tensors with repacked Q/K attention weights
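To make the Q/K repacking concrete, here is a minimal sketch of one common form of head-dimension reordering on a row-major weight matrix. It assumes each head's rows are stored as two contiguous halves and should end up interleaved. The function name repackInterleaved and its parameters are hypothetical, and the layout and direction used by the actual repack method are defined in the source file, not here.

```go
package main

import (
	"errors"
	"fmt"
)

// repackInterleaved reorders the rows of a row-major [rows x cols] matrix.
// Within each head, row i*half+j (i in {0,1}) moves to row j*2+i, so the two
// halves of the head end up interleaved. This mirrors a reshape to
// (heads, 2, half, cols) followed by swapping the two middle axes.
func repackInterleaved(data []float32, rows, cols, heads int) ([]float32, error) {
	if heads <= 0 || rows%heads != 0 {
		return nil, errors.New("rows must be a positive multiple of heads")
	}
	headDim := rows / heads
	if headDim%2 != 0 {
		return nil, errors.New("head dimension must be even")
	}
	if len(data) != rows*cols {
		return nil, errors.New("data length does not match shape")
	}

	half := headDim / 2
	out := make([]float32, len(data))
	for h := 0; h < heads; h++ {
		for i := 0; i < 2; i++ {
			for j := 0; j < half; j++ {
				src := h*headDim + i*half + j
				dst := h*headDim + j*2 + i
				copy(out[dst*cols:(dst+1)*cols], data[src*cols:(src+1)*cols])
			}
		}
	}
	return out, nil
}

func main() {
	// One head, four rows, one column: halves [0 1] and [2 3].
	out, err := repackInterleaved([]float32{0, 1, 2, 3}, 4, 1, 1)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // [0 2 1 3]
}
```

Only the attention query and key weights need this treatment, since they are the tensors the rotary position embedding is applied to.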

Usage Examples

// Converter registered for Mistral3ForCausalLM (text-only)
// Same Q/K repacking as the multimodal variant
// No vision encoder or multimodal projector config
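The flat []string returned by Replacements can be read as old/new pairs, which is the form strings.NewReplacer from the Go standard library accepts. The sketch below shows that pattern; the specific pairs and the renameTensor helper are illustrative assumptions, not the converter's actual mapping.

```go
package main

import (
	"fmt"
	"strings"
)

// exampleReplacements returns hypothetical old/new name pairs in the flat
// form used by strings.NewReplacer. The real list is in the source file.
func exampleReplacements() []string {
	return []string{
		"model.layers", "blk",
		"self_attn.q_proj", "attn_q",
		"self_attn.k_proj", "attn_k",
	}
}

// renameTensor applies the pairs to a source tensor name.
func renameTensor(name string) string {
	return strings.NewReplacer(exampleReplacements()...).Replace(name)
}

func main() {
	fmt.Println(renameTensor("model.layers.0.self_attn.q_proj.weight"))
	// blk.0.attn_q.weight
}
```

Because the text-only checkpoint has no language_model prefix, a mapping of this kind does not need an extra pair to strip it.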

Related Pages

Principle:Ollama_Ollama_GGUF_Model_Conversion_Mistral_Causal
