Implementation:Ollama Ollama Imagegen Gemma3
| Knowledge Sources | |
|---|---|
| Domains | Image Generation, LLM Inference |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the Gemma 3 text and multimodal model architecture for MLX-based inference.
Description
The gemma3.go file implements the full Gemma 3 text model, including Q/K normalization, sliding window attention, and Gemma-style RMSNorm, which scales by (1 + weight) rather than by the weight alone. It defines TextConfig with parameters for hidden size, attention heads, key-value heads for grouped-query attention (GQA), and the sliding window pattern. The TextModel struct holds the token embedding, decoder layers with pre- and post-norms around both the attention and feed-forward blocks, and an output projection tied to the embedding weights. The Attention struct selects global or sliding window attention based on the layer index and precomputes its norm scaling factors for efficiency. The model loads from safetensors files alongside companion tokenizer and config files.
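The (1 + weight) scaling mentioned above can be illustrated without MLX. The following is a minimal sketch on plain float32 slices, not the code from gemma3.go: the function name `gemmaRMSNorm`, the `eps` parameter, and the sample values are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// gemmaRMSNorm divides x by its root-mean-square and then scales element i
// by (1 + weight[i]). A zero weight therefore leaves the normalized value
// unchanged. eps is a small stabilizing constant chosen by the caller
// (hypothetical parameter, not taken from the source file).
func gemmaRMSNorm(x, weight []float32, eps float32) []float32 {
	var sumSq float64
	for _, v := range x {
		sumSq += float64(v) * float64(v)
	}
	inv := 1.0 / math.Sqrt(sumSq/float64(len(x))+float64(eps))

	out := make([]float32, len(x))
	for i, v := range x {
		out[i] = float32(float64(v) * inv * (1.0 + float64(weight[i])))
	}
	return out
}

func main() {
	x := []float32{3, 4}
	w := []float32{0, 1} // second element is scaled by 1 + 1 = 2
	fmt.Println(gemmaRMSNorm(x, w, 0))
}
```

Because (1 + weight) does not depend on the input, it can be computed once at load time instead of on every call, which is one way to read the "precomputed norm scaling factors" mentioned above.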
Usage
Used for text generation with Gemma 3 models in the standalone MLX engine, supporting both text-only and multimodal (with SigLIP vision tower) configurations.
Code Reference
Source Location
- Repository: Ollama
- File: x/imagegen/models/gemma3/gemma3.go
- Lines: 1-614
Signature
type TextConfig struct {
	HiddenSize           int32 `json:"hidden_size"`
	NumHiddenLayers      int32 `json:"num_hidden_layers"`
	NumAttentionHeads    int32 `json:"num_attention_heads"`
	NumKeyValueHeads     int32 `json:"num_key_value_heads"`
	HeadDim              int32 `json:"head_dim"`
	SlidingWindow        int32 `json:"sliding_window"`
	SlidingWindowPattern int32 `json:"sliding_window_pattern"`
}
type TextModel struct {
	EmbedTokens *nn.Embedding   `weight:"model.embed_tokens"`
	Layers      []*DecoderLayer `weight:"model.layers"`
	Norm        *nn.RMSNorm     `weight:"model.norm"`
	Output      *nn.Linear      `weight:"-"` // Tied to EmbedTokens
}
func LoadText(modelPath string) (*TextModel, error)
func (m *TextModel) Forward(tokens *mlx.Array, caches []cache.Cache) *mlx.Array
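To make the role of `sliding_window_pattern` concrete, the sketch below decodes a few of the JSON fields listed in TextConfig and classifies each layer as global or sliding. It is an illustration under stated assumptions, not the logic from gemma3.go: the mirror struct `textConfig`, the helpers `parseTextConfig` and `isSlidingLayer`, the rule that every pattern-th layer is global, and the config values are all hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// textConfig mirrors a subset of the TextConfig fields (hypothetical local copy).
type textConfig struct {
	NumHiddenLayers      int32 `json:"num_hidden_layers"`
	SlidingWindow        int32 `json:"sliding_window"`
	SlidingWindowPattern int32 `json:"sliding_window_pattern"`
}

// parseTextConfig decodes the JSON config bytes into textConfig.
func parseTextConfig(data []byte) (textConfig, error) {
	var cfg textConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

// isSlidingLayer reports whether the layer at zero-based index i uses
// sliding window attention.
// ASSUMPTION: every pattern-th layer (counting from 1) is global and all
// other layers are sliding. A non-positive pattern means all layers are global.
func isSlidingLayer(i, pattern int32) bool {
	if pattern <= 0 {
		return false
	}
	return (i+1)%pattern != 0
}

func main() {
	// Hypothetical values for illustration only.
	raw := []byte(`{"num_hidden_layers": 12, "sliding_window": 512, "sliding_window_pattern": 6}`)
	cfg, err := parseTextConfig(raw)
	if err != nil {
		panic(err)
	}
	for i := int32(0); i < cfg.NumHiddenLayers; i++ {
		kind := "global"
		if isSlidingLayer(i, cfg.SlidingWindowPattern) {
			kind = "sliding"
		}
		fmt.Printf("layer %d: %s\n", i, kind)
	}
}
```

Under the assumed rule and a pattern of 6, layer indices 5 and 11 would be global and the rest sliding. The actual selection rule should be checked against the source file.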
Import
import "github.com/ollama/ollama/x/imagegen/models/gemma3"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| modelPath | string | Yes | Directory path containing model files |
| tokens | *mlx.Array | Yes | Input token IDs [B, L] |
| caches | []cache.Cache | Yes | KV caches for each layer |
Outputs
| Name | Type | Description |
|---|---|---|
| model | *TextModel | Loaded model returned by LoadText, ready for inference |
| error | error | Non-nil if LoadText fails |
| logits | *mlx.Array | Logits returned by Forward, shape [B, L, vocab_size] |
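A caller typically turns the [B, L, vocab_size] logits into next-token choices. The sketch below shows greedy selection at the last sequence position, assuming the logits have already been copied into a row-major host `[]float32`. That copy step, the function name `greedyNextTokens`, and the sample values are assumptions for illustration; no MLX calls are shown.

```go
package main

import "fmt"

// greedyNextTokens returns, for each batch row, the index of the largest
// logit at the last sequence position. logits is a row-major flat view of
// a [b, l, v] tensor (hypothetical host-side copy).
func greedyNextTokens(logits []float32, b, l, v int) ([]int, error) {
	if b <= 0 || l <= 0 || v <= 0 || len(logits) != b*l*v {
		return nil, fmt.Errorf("logits length %d does not match shape [%d, %d, %d]", len(logits), b, l, v)
	}
	next := make([]int, b)
	for i := 0; i < b; i++ {
		// Offset of the last position's vocabulary row for batch item i.
		start := (i*l + (l - 1)) * v
		row := logits[start : start+v]
		best := 0
		for j := 1; j < v; j++ {
			if row[j] > row[best] {
				best = j
			}
		}
		next[i] = best
	}
	return next, nil
}

func main() {
	// B=1, L=2, V=3: only the second row (last position) is considered.
	logits := []float32{0.1, 0.9, 0.0, 0.2, 0.1, 0.7}
	next, err := greedyNextTokens(logits, 1, 2, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(next) // prints [2]
}
```

Greedy selection is only one decoding strategy. Sampling with temperature or top-k would replace the argmax loop while keeping the same indexing.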
Usage Examples
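// Load the text model from a directory containing weights, tokenizer, and config.
model, err := gemma3.LoadText("/path/to/gemma3")
if err != nil {
	return err
}
// Allocate the KV caches that Forward expects.
caches := model.NewCache(0)
// tokens is an *mlx.Array of token IDs with shape [B, L], prepared by the caller.
logits := model.Forward(tokens, caches) // logits: [B, L, vocab_size]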