
Implementation:Ollama Ollama Convert Mistral Causal

From Leeroopedia
Knowledge Sources
Domains Model Conversion, GGUF Format
Last Updated 2025-02-15 00:00 GMT

Overview

Implements the GGUF model converter for the text-only (causal LM) variant of the Mistral 3 architecture. It handles the same RoPE scaling parameters as the multimodal variant but carries no vision encoder configuration.

Description

The mistral3CausalModel struct implements ModelConverter for the text-only variant of Mistral 3 (Mistral3ForCausalLM). It emits the same KV metadata structure as the multimodal mistral3Model, but reads its parameters from the top level of the configuration rather than from a nested text_config object. It supports the same RoPE scaling parameters: mscale, mscale_all_dim, beta_fast, beta_slow, and llama4_scaling_beta. The Tensors method applies the same Q/K weight repacking as the multimodal converter, reordering rows within each attention head into an interleaved layout. The Replacements method uses a simpler tensor-name mapping because the source tensor names carry no language_model prefix.

Usage

Invoked automatically when the model's architecture matches Mistral3ForCausalLM.
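As a rough sketch of what "matches" could mean here, the snippet below selects a converter by the first entry of an architectures array. The function converterFor, the returned label, and the assumption that the architecture name is read from an "architectures" key (the usual Hugging Face config.json convention) are all illustrative and not confirmed by this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// converterFor is a hypothetical dispatcher: it reads the architecture name
// from a config document and returns the name of the converter to use.
func converterFor(configJSON []byte) (string, error) {
	var cfg struct {
		Architectures []string `json:"architectures"`
	}
	if err := json.Unmarshal(configJSON, &cfg); err != nil {
		return "", err
	}
	if len(cfg.Architectures) == 0 {
		return "", fmt.Errorf("config has no architectures entry")
	}
	switch arch := cfg.Architectures[0]; arch {
	case "Mistral3ForCausalLM":
		return "mistral3CausalModel", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", arch)
	}
}

func main() {
	name, err := converterFor([]byte(`{"architectures": ["Mistral3ForCausalLM"]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // mistral3CausalModel
}
```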

Code Reference

Source Location

  • Repository: Ollama
  • File: convert/convert_mistral_causal.go
  • Lines: 1-181

Signature

type mistral3CausalModel struct {
    ModelParameters
    NumHiddenLayers       uint32  `json:"num_hidden_layers"`
    MaxPositionEmbeddings uint32  `json:"max_position_embeddings"`
    HiddenSize            uint32  `json:"hidden_size"`
    NumAttentionHeads     uint32  `json:"num_attention_heads"`
    NumKeyValueHeads      uint32  `json:"num_key_value_heads"`
    HeadDim               uint32  `json:"head_dim"`
    RopeParameters        struct { ... } `json:"rope_parameters"`
}

func (p *mistral3CausalModel) KV(t *Tokenizer) KV
func (p *mistral3CausalModel) Tensors(ts []Tensor) []*ggml.Tensor
func (p *mistral3CausalModel) Replacements() []string
func (p *mistral3CausalModel) repack(name string, data []float32, shape []uint64) ([]float32, error)

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name | Type | Required | Description
t | *Tokenizer | Yes | Tokenizer data for GGUF metadata
ts | []Tensor | Yes | Source tensors from the text-only Mistral model

Outputs

Name | Type | Description
KV | KV | GGUF metadata with mistral3.* keys for text model and RoPE scaling
[]*ggml.Tensor | slice | Converted tensors with repacked Q/K attention weights

Usage Examples

// Converter registered for Mistral3ForCausalLM (text-only)
// Same Q/K repacking as the multimodal variant
// No vision encoder or multimodal projector config
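To make the name mapping concrete, here is a minimal sketch of how a []string such as the one returned by Replacements could be applied. It assumes the slice holds alternating old/new pairs, which is the argument form strings.NewReplacer takes. The specific pairs and the helper names exampleReplacements and rename are illustrative, not the converter's actual list.

```go
package main

import (
	"fmt"
	"strings"
)

// exampleReplacements returns alternating old/new substrings.
// These pairs are illustrative stand-ins, not the converter's real mapping.
func exampleReplacements() []string {
	return []string{
		"model.layers", "blk",
		"self_attn.q_proj", "attn_q",
		"self_attn.k_proj", "attn_k",
	}
}

// rename applies every pair to a source tensor name in a single pass.
func rename(name string) string {
	return strings.NewReplacer(exampleReplacements()...).Replace(name)
}

func main() {
	fmt.Println(rename("model.layers.0.self_attn.q_proj.weight")) // blk.0.attn_q.weight
}
```

Because the text-only checkpoint has no language_model prefix, a mapping like this needs no extra pair to strip one.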
