Implementation:Ollama Ollama Convert Gemma3n
| Knowledge Sources | |
|---|---|
| Domains | Model Conversion, GGUF Format |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements the GGUF model converter for the Google Gemma 3n architecture, handling AltUp (Alternating Updates) parameters, activation sparsity quantile computation, and coefficient clipping.
Description
The gemma3nModel struct implements ModelConverter for the Gemma 3n architecture, which uses the AltUp mechanism for efficient inference. The KV method computes activation sparsity scales by converting each value in the sparsity pattern to a standard normal quantile using gonum's statistical library, and emits AltUp-specific metadata (active index, correction scale, learning-rate multiplier, number of inputs). It also emits metadata for shared KV layers, per-layer embedding dimensions, and sliding window attention patterns. The Tensors method merges AltUp projection tensors, clips AltUp coefficients via tensor.Clamp, and skips audio and vision tower tensors, which are not yet supported.
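The quantile conversion described above can be illustrated with the standard library alone. This is a minimal sketch, not the converter's code: it substitutes the closed-form expression sqrt(2) * erfinv(2p - 1) for gonum's normal distribution quantile, and the names normalQuantile and sparsityScales as well as the sample pattern are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// normalQuantile returns the standard normal quantile (inverse CDF) for p in [0, 1].
// Assumption: this stdlib formula stands in for the gonum call used by the converter.
func normalQuantile(p float64) float64 {
	return math.Sqrt2 * math.Erfinv(2*p-1)
}

// sparsityScales maps each per-layer sparsity value to its normal quantile.
func sparsityScales(pattern []float32) []float32 {
	scales := make([]float32, len(pattern))
	for i, p := range pattern {
		scales[i] = float32(normalQuantile(float64(p)))
	}
	return scales
}

func main() {
	// Hypothetical pattern: two sparse layers followed by one dense layer.
	pattern := []float32{0.95, 0.95, 0.0}
	fmt.Println(sparsityScales(pattern))
}
```

Under this formula a sparsity of 0.5 maps to a scale of 0, and a sparsity of 0 maps to negative infinity, so any consumer of the scales has to treat that value as "no sparsity" rather than as a usable threshold.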
Usage
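Invoked automatically during conversion when the architecture listed in the model's configuration is Gemma3nForConditionalGeneration. The type is unexported, so it is not called directly from outside the convert package.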
Code Reference
Source Location
- Repository: Ollama
- File: convert/convert_gemma3n.go
- Lines: 1-165
Signature
type gemma3nModel struct {
	ModelParameters
	TextModel struct {
		ActivationSparsityPattern []float32 `json:"activation_sparsity_pattern"`
		AltupActiveIdx            uint32    `json:"altup_active_idx"`
		AltupCoefClip             float32   `json:"altup_coef_clip"`
		AltupNumInputs            uint32    `json:"altup_num_inputs"`
		NumKVSharedLayers         uint32    `json:"num_kv_shared_layers"`
		HiddenSizePerLayerInput   uint32    `json:"hidden_size_per_layer_input"`
		// ...
	} `json:"text_config"`
}
func (m *gemma3nModel) KV(t *Tokenizer) KV
func (m *gemma3nModel) Tensors(ts []Tensor) []*ggml.Tensor
func (m *gemma3nModel) Replacements() []string
Import
import "github.com/ollama/ollama/convert"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| t | *Tokenizer | Yes | Tokenizer data for GGUF metadata |
| ts | []Tensor | Yes | Source tensors including AltUp projections and coefficient tensors |
Outputs
| Name | Type | Description |
|---|---|---|
| KV | KV | GGUF metadata with gemma3n.* keys, including AltUp and sparsity parameters |
| Tensors | []*ggml.Tensor | Converted tensors with merged AltUp projections and clipped coefficients |
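The way Tensors drops audio and vision tower tensors can be pictured as a name-based filter. The sketch below is an illustration under stated assumptions: the name fragments, the sample tensor names, and the helpers isTextTensor and filterTextTensors are hypothetical and are not taken from convert_gemma3n.go.

```go
package main

import (
	"fmt"
	"strings"
)

// skippedFragments lists name fragments treated as audio or vision tensors.
// Assumption: these fragments are illustrative, not copied from the converter.
var skippedFragments = []string{"audio_tower", "vision_tower"}

// isTextTensor reports whether a tensor name belongs to the text model.
func isTextTensor(name string) bool {
	for _, frag := range skippedFragments {
		if strings.Contains(name, frag) {
			return false
		}
	}
	return true
}

// filterTextTensors keeps only the names that pass isTextTensor.
func filterTextTensors(names []string) []string {
	var kept []string
	for _, n := range names {
		if isTextTensor(n) {
			kept = append(kept, n)
		}
	}
	return kept
}

func main() {
	// Hypothetical tensor names for demonstration only.
	names := []string{
		"model.language_model.layers.0.mlp.weight",
		"model.vision_tower.blocks.0.weight",
		"model.audio_tower.conv.weight",
	}
	fmt.Println(filterTextTensors(names))
}
```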
Usage Examples
// Converter registered for the Gemma 3n architecture.
// Sketch only: gemma3nModel is unexported, so these calls live inside package convert.
// m := &gemma3nModel{}
// if err := json.Unmarshal(configData, m); err != nil {
//     return err
// }
// kv := m.KV(tokenizer)
// Activation sparsity scales are computed from normal distribution quantiles
// tensors := m.Tensors(sourceTensors)
// AltUp coefficient tensors are clamped to [-clip, clip]
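
The clamping to [-clip, clip] can be shown on a plain slice. This is a minimal sketch of the arithmetic only: the converter is described as using tensor.Clamp, while the helper clampCoefficients and the sample values here are hypothetical.

```go
package main

import "fmt"

// clampCoefficients returns a copy of values with every element limited to [-clip, clip].
// Assumption: clip is positive; the input slice is left unmodified.
func clampCoefficients(values []float32, clip float32) []float32 {
	out := make([]float32, len(values))
	for i, v := range values {
		switch {
		case v > clip:
			out[i] = clip
		case v < -clip:
			out[i] = -clip
		default:
			out[i] = v
		}
	}
	return out
}

func main() {
	// Hypothetical coefficients and clip value for demonstration only.
	coefs := []float32{-200, -1, 0, 150}
	fmt.Println(clampCoefficients(coefs, 120))
}
```

Values already inside the interval pass through unchanged, so clipping only affects outlier coefficients.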