Implementation:Ollama Ollama Convert Gemma3n

From Leeroopedia
Knowledge Sources
Domains Model Conversion, GGUF Format
Last Updated 2025-02-15 00:00 GMT

Overview

Implements the GGUF model converter for the Google Gemma 3n architecture, handling AltUp (Alternating Updates) parameters, activation sparsity quantile computation, and coefficient clipping.

Description

The gemma3nModel struct implements ModelConverter for the Gemma 3n architecture, which features the AltUp mechanism for efficient inference. The KV method computes activation sparsity scales by converting sparsity pattern values to normal distribution quantiles using gonum's statistical library, and emits AltUp-specific metadata (active index, correction scale, LR multiplier, number of inputs). It also handles shared KV layers, per-layer embedding dimensions, and sliding window attention patterns. The Tensors method merges AltUp projection tensors, applies coefficient clipping via tensor.Clamp, and filters out audio/vision tower tensors (not yet supported).

Usage

The converter is not called directly. It is selected automatically during conversion when the model's architecture is Gemma3nForConditionalGeneration.
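As a rough illustration of architecture-based selection, the sketch below reads an architectures list from a Hugging Face style config.json and picks a converter by name. The modelConfig type, the pickConverter function, and the assumed config layout are hypothetical and do not reproduce Ollama's actual dispatch code. Only the architecture string Gemma3nForConditionalGeneration and the struct name gemma3nModel come from this page.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// modelConfig holds the single field needed for dispatch
// (assumed config.json layout).
type modelConfig struct {
	Architectures []string `json:"architectures"`
}

// pickConverter is a hypothetical helper that returns the name of the
// converter matching the first declared architecture.
func pickConverter(configJSON []byte) (string, error) {
	var cfg modelConfig
	if err := json.Unmarshal(configJSON, &cfg); err != nil {
		return "", err
	}
	if len(cfg.Architectures) == 0 {
		return "", errors.New("no architecture declared")
	}
	switch cfg.Architectures[0] {
	case "Gemma3nForConditionalGeneration":
		return "gemma3nModel", nil
	default:
		return "", fmt.Errorf("unsupported architecture %q", cfg.Architectures[0])
	}
}

func main() {
	name, err := pickConverter([]byte(`{"architectures":["Gemma3nForConditionalGeneration"]}`))
	fmt.Println(name, err)
}
```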

Code Reference

Source Location

  • Repository: Ollama
  • File: convert/convert_gemma3n.go
  • Lines: 1-165

Signature

type gemma3nModel struct {
    ModelParameters
    TextModel struct {
        ActivationSparsityPattern []float32 `json:"activation_sparsity_pattern"`
        AltupActiveIdx            uint32    `json:"altup_active_idx"`
        AltupCoefClip             float32   `json:"altup_coef_clip"`
        AltupNumInputs            uint32    `json:"altup_num_inputs"`
        NumKVSharedLayers         uint32    `json:"num_kv_shared_layers"`
        HiddenSizePerLayerInput   uint32    `json:"hidden_size_per_layer_input"`
        // ...
    } `json:"text_config"`
}

func (m *gemma3nModel) KV(t *Tokenizer) KV
func (m *gemma3nModel) Tensors(ts []Tensor) []*ggml.Tensor
func (m *gemma3nModel) Replacements() []string

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name  Type        Required  Description
t     *Tokenizer  Yes       Tokenizer data for GGUF metadata
ts    []Tensor    Yes       Source tensors including AltUp projections and coefficient tensors

Outputs

Name            Type   Description
KV              KV     GGUF metadata with gemma3n.* keys, including AltUp and sparsity parameters
[]*ggml.Tensor  slice  Converted tensors with merged AltUp projections and clipped coefficients
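Two of the tensor-side steps, coefficient clipping and tower filtering, can be sketched on plain slices and strings. This is not the converter's code: the page states that clipping goes through tensor.Clamp, whereas the clamp function here is a stand-in that works on a []float32 and assumes clip is positive. The isTowerTensor helper and its name prefixes are hypothetical, chosen only to show the idea of skipping audio and vision tensors.

```go
package main

import (
	"fmt"
	"strings"
)

// clamp returns a copy of values with every element limited to
// [-clip, clip]. It assumes clip > 0.
func clamp(values []float32, clip float32) []float32 {
	out := make([]float32, len(values))
	for i, v := range values {
		switch {
		case v < -clip:
			out[i] = -clip
		case v > clip:
			out[i] = clip
		default:
			out[i] = v
		}
	}
	return out
}

// isTowerTensor reports whether a tensor name belongs to the audio or
// vision tower. The prefixes are assumptions for illustration.
func isTowerTensor(name string) bool {
	return strings.HasPrefix(name, "model.audio_tower.") ||
		strings.HasPrefix(name, "model.vision_tower.")
}

func main() {
	fmt.Println(clamp([]float32{-200, 0.5, 300}, 120))
	fmt.Println(isTowerTensor("model.vision_tower.patch_embed.weight"))
}
```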

Usage Examples

// Sketch only: gemma3nModel is unexported, so this fragment runs inside package convert.
// configData, tokenizer, and sourceTensors are placeholders supplied by the caller.
m := &gemma3nModel{}
if err := json.Unmarshal(configData, m); err != nil {
    return err
}
// Activation sparsity scales are computed from normal distribution quantiles.
kv := m.KV(tokenizer)
// AltUp coefficient tensors are clamped to [-clip, clip].
tensors := m.Tensors(sourceTensors)
