Implementation:Ollama Ollama Convert Mistral Causal
| Knowledge Sources | |
|---|---|
| Domains | Model Conversion, GGUF Format |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the GGUF model converter for the Mistral 3 causal-only (text-only) architecture, handling the same advanced RoPE scaling as the multimodal variant but without vision encoder configuration.
Description
The mistral3CausalModel struct implements ModelConverter for the text-only variant of Mistral 3 (Mistral3ForCausalLM). It emits the same KV metadata structure as the multimodal mistral3Model, but reads its parameters from the top level of the model configuration rather than from a nested text_config object. It supports the same advanced RoPE parameters (mscale, mscale_all_dim, beta_fast, beta_slow, llama4_scaling_beta). The Tensors method applies the same Q/K weight repacking, which reorders the head dimension of the attention query and key weights into an interleaved layout. The Replacements method uses a simpler tensor-name mapping because the source tensor names carry no language_model prefix.
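To make the Q/K repacking concrete, here is a minimal sketch of the row permutation it implies. It assumes the common pattern of viewing the weight as [heads, 2, head_dim/2, cols] and swapping the two inner axes; the helper name repackRows, its parameters, and the use of a plain row-major slice are hypothetical and are not taken from the converter's source.

```go
package main

import (
	"errors"
	"fmt"
)

// repackRows is a hypothetical helper (not the converter's own method).
// It treats data as a row-major [rows x cols] matrix split into `heads`
// blocks and moves row (h, half, i) to row (h, i, half), which interleaves
// the two halves of each head's rows.
func repackRows(data []float32, rows, cols, heads int) ([]float32, error) {
	if heads <= 0 || cols <= 0 || rows%(2*heads) != 0 || len(data) != rows*cols {
		return nil, errors.New("repackRows: shape does not split into heads with an even head dimension")
	}
	headDim := rows / heads
	half := headDim / 2
	out := make([]float32, len(data))
	for h := 0; h < heads; h++ {
		for s := 0; s < 2; s++ { // s selects the first or second half of the head
			for i := 0; i < half; i++ {
				src := h*headDim + s*half + i
				dst := h*headDim + i*2 + s
				copy(out[dst*cols:(dst+1)*cols], data[src*cols:(src+1)*cols])
			}
		}
	}
	return out, nil
}

func main() {
	// One head, four rows, one column: rows 0,1 are the first half, 2,3 the second.
	out, err := repackRows([]float32{0, 1, 2, 3}, 4, 1, 1)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // [0 2 1 3]
}
```

For the query weight the head count would be the number of attention heads, and for the key weight the number of key/value heads; that pairing is also an assumption of this sketch.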
Usage
Invoked automatically when the model's architecture matches Mistral3ForCausalLM.
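The architecture check can be sketched as follows. This assumes the architecture name is read from the architectures array of a Hugging Face style config.json; the type archConfig and the function isMistral3Causal are hypothetical names for illustration, and the repository's actual dispatch code may be organized differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// archConfig holds only the field needed to choose a converter.
// Hypothetical type for illustration.
type archConfig struct {
	Architectures []string `json:"architectures"`
}

// isMistral3Causal reports whether the first declared architecture
// is the text-only Mistral 3 variant.
func isMistral3Causal(configJSON []byte) (bool, error) {
	var c archConfig
	if err := json.Unmarshal(configJSON, &c); err != nil {
		return false, err
	}
	return len(c.Architectures) > 0 && c.Architectures[0] == "Mistral3ForCausalLM", nil
}

func main() {
	ok, err := isMistral3Causal([]byte(`{"architectures": ["Mistral3ForCausalLM"]}`))
	fmt.Println(ok, err) // true <nil>
}
```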
Code Reference
Source Location
- Repository: Ollama
- File: convert/convert_mistral_causal.go
- Lines: 1-181
Signature
type mistral3CausalModel struct {
	ModelParameters
	NumHiddenLayers       uint32 `json:"num_hidden_layers"`
	MaxPositionEmbeddings uint32 `json:"max_position_embeddings"`
	HiddenSize            uint32 `json:"hidden_size"`
	NumAttentionHeads     uint32 `json:"num_attention_heads"`
	NumKeyValueHeads      uint32 `json:"num_key_value_heads"`
	HeadDim               uint32 `json:"head_dim"`
	RopeParameters        struct { ... } `json:"rope_parameters"`
}
func (p *mistral3CausalModel) KV(t *Tokenizer) KV
func (p *mistral3CausalModel) Tensors(ts []Tensor) []*ggml.Tensor
func (p *mistral3CausalModel) Replacements() []string
func (p *mistral3CausalModel) repack(name string, data []float32, shape []uint64) ([]float32, error)
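As a rough illustration of what "parameters at the top level" means for decoding, the sketch below unmarshals a configuration into a struct that mirrors a subset of the fields in the signature above. The type causalConfig, the function parseCausalConfig, and the numeric values are made up for the example; the elided RopeParameters fields are deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// causalConfig mirrors a subset of the top-level fields from the signature.
// Hypothetical type for illustration only.
type causalConfig struct {
	NumHiddenLayers   uint32 `json:"num_hidden_layers"`
	HiddenSize        uint32 `json:"hidden_size"`
	NumAttentionHeads uint32 `json:"num_attention_heads"`
	NumKeyValueHeads  uint32 `json:"num_key_value_heads"`
	HeadDim           uint32 `json:"head_dim"`
}

// parseCausalConfig decodes fields found at the top level of the JSON document.
// Values nested under another object (such as text_config) are not picked up.
func parseCausalConfig(b []byte) (causalConfig, error) {
	var cfg causalConfig
	err := json.Unmarshal(b, &cfg)
	return cfg, err
}

func main() {
	// Made-up values, not taken from any real model.
	raw := []byte(`{"num_hidden_layers": 2, "hidden_size": 64,
		"num_attention_heads": 4, "num_key_value_heads": 2, "head_dim": 16}`)
	cfg, err := parseCausalConfig(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.NumAttentionHeads, cfg.NumKeyValueHeads, cfg.HeadDim) // 4 2 16
}
```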
Import
import "github.com/ollama/ollama/convert"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| t | *Tokenizer | Yes | Tokenizer data for GGUF metadata |
| ts | []Tensor | Yes | Source tensors from the text-only Mistral model |
Outputs
| Name | Type | Description |
|---|---|---|
| KV | KV | GGUF metadata with mistral3.* keys for text model and RoPE scaling |
| Tensors | []*ggml.Tensor | Converted tensors with repacked Q/K attention weights |
Usage Examples
// Converter registered for Mistral3ForCausalLM (text-only)
// Same Q/K repacking as the multimodal variant
// No vision encoder or multimodal projector config
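Since Replacements returns a flat []string, one plausible way to consume it is as old/new pairs passed to strings.NewReplacer. The sketch below shows that idea; the pairs and the function names exampleReplacements and renameTensor are illustrative assumptions, not the converter's actual mapping.

```go
package main

import (
	"fmt"
	"strings"
)

// exampleReplacements returns old/new pairs in the flat form that
// strings.NewReplacer expects. The pairs are illustrative only.
func exampleReplacements() []string {
	return []string{
		"model.layers", "blk",
		"self_attn.q_proj", "attn_q",
		"self_attn.k_proj", "attn_k",
	}
}

// renameTensor applies the pairs to a source tensor name.
func renameTensor(name string) string {
	return strings.NewReplacer(exampleReplacements()...).Replace(name)
}

func main() {
	fmt.Println(renameTensor("model.layers.0.self_attn.q_proj.weight")) // blk.0.attn_q.weight
}
```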