Implementation:Ollama Ollama Convert GptOss

From Leeroopedia
Knowledge Sources
Domains: Model Conversion, GGUF Format
Last Updated: 2025-02-15 00:00 GMT

Overview

Implements the GGUF model converter for the GPT-OSS MoE architecture, handling MXFP4 quantized tensor conversion, interleaved gate-up expert splitting, and dual replacement strategies for HuggingFace vs. native tensor layouts.

Description

The gptossModel struct implements ModelConverter for GPT-OSS models, which use sliding window attention and MoE experts. Its distinguishing feature is the mxfp4 type, which handles MXFP4 (Microscaling FP4) quantized tensors by combining each blocks tensor with its scales tensor and applying a nibble reordering transform. The Tensors method handles three tensor categories: MXFP4 quantized expert tensors (blocks and scales combined, with nibble reordering), interleaved gate_up_exps tensors (split into separate gate and up expert tensors using stride-2 slicing), and regular tensors. The Replacements method returns one of two replacement sets, depending on whether the model uses HuggingFace or native tensor naming conventions.
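The three-way categorization described above can be sketched as a name-based classifier. This is a minimal illustration, not the converter's actual code: the plan type, the classify function, and the tensor names in main are hypothetical, and it assumes MXFP4 components are recognizable by .blocks and .scales name suffixes.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// plan records how each source tensor would be handled.
// The type and its fields are hypothetical, for illustration only.
type plan struct {
	mxfp4   []string // base names that have both a .blocks and a .scales part
	split   []string // interleaved gate_up_exps tensors
	regular []string // everything else
}

// classify sorts tensor names into the three categories.
// Bases with only one of the two MXFP4 parts are dropped in this sketch.
func classify(names []string) plan {
	parts := map[string]int{} // base name -> bitmask: 1 = blocks seen, 2 = scales seen
	var p plan
	for _, n := range names {
		switch {
		case strings.HasSuffix(n, ".blocks"):
			parts[strings.TrimSuffix(n, ".blocks")] |= 1
		case strings.HasSuffix(n, ".scales"):
			parts[strings.TrimSuffix(n, ".scales")] |= 2
		case strings.Contains(n, "gate_up_exps"):
			p.split = append(p.split, n)
		default:
			p.regular = append(p.regular, n)
		}
	}
	for base, mask := range parts {
		if mask == 3 {
			p.mxfp4 = append(p.mxfp4, base)
		}
	}
	sort.Strings(p.mxfp4) // map iteration order is not deterministic
	return p
}

func main() {
	// Hypothetical tensor names.
	p := classify([]string{
		"blk.0.ffn_down_exps.blocks",
		"blk.0.ffn_down_exps.scales",
		"blk.0.ffn_gate_up_exps.weight",
		"blk.0.attn_norm.weight",
	})
	fmt.Printf("%+v\n", p)
}
```

Collecting the two MXFP4 parts under a shared base name first makes it possible to emit one combined tensor per pair.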

Usage

Invoked automatically when the model's architecture matches GptOssForCausalLM or similar GPT-OSS architecture identifiers.

Code Reference

Source Location

  • Repository: Ollama
  • File: convert/convert_gptoss.go
  • Lines: 1-269

Signature

type gptossModel struct {
    ModelParameters
    HiddenLayers      uint32  `json:"num_hidden_layers"`
    Experts           uint32  `json:"num_experts"`
    ExpertsPerToken   uint32  `json:"experts_per_token"`
    SlidingWindow     uint32  `json:"sliding_window"`
    RopeScalingFactor float32 `json:"rope_scaling_factor"`
}

type mxfp4 struct {
    blocks, scales Tensor
}

func (m *gptossModel) KV(t *Tokenizer) KV
func (m *gptossModel) Tensors(ts []Tensor) []*ggml.Tensor
func (m *gptossModel) Replacements() []string
func (m *mxfp4) WriteTo(w io.Writer) (int64, error)
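Replacements returns a flat []string. One way such a slice can be consumed is as alternating old/new pairs passed to the standard library's strings.NewReplacer. The sketch below assumes that usage and shows how two naming conventions could map onto one output scheme. The replacements and rename functions, the hfNaming flag, and every pair listed are hypothetical examples, not the converter's actual table or detection logic.

```go
package main

import (
	"fmt"
	"strings"
)

// replacements returns old/new pairs as a flat slice.
// The pairs are hypothetical examples chosen for illustration.
func replacements(hfNaming bool) []string {
	if hfNaming {
		return []string{
			"model.layers", "blk",
			"self_attn.q_proj", "attn_q",
		}
	}
	return []string{
		"block", "blk",
		"attn.norm", "attn_norm",
	}
}

// rename applies the selected pair set to a tensor name.
func rename(name string, hfNaming bool) string {
	return strings.NewReplacer(replacements(hfNaming)...).Replace(name)
}

func main() {
	fmt.Println(rename("model.layers.0.self_attn.q_proj.weight", true))
	fmt.Println(rename("block.0.attn.norm.scale", false))
}
```

Keeping each convention in its own pair set avoids accidental matches between the two vocabularies.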

Import

import "github.com/ollama/ollama/convert"

I/O Contract

Inputs

Name | Type | Required | Description
t | *Tokenizer | Yes | Tokenizer data for GGUF metadata
ts | []Tensor | Yes | Source tensors including MXFP4 blocks/scales and interleaved expert weights

Outputs

Name | Type | Description
KV | KV | GGUF metadata with gptoss.* keys including custom EOS token IDs
[]*ggml.Tensor | slice | Converted tensors with MXFP4 encoding and split gate/up experts
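The gate/up split in the converted output relies on stride-2 slicing of an interleaved tensor. A minimal sketch of that idea on a flat buffer follows. It assumes the interleaved dimension is the innermost one and that even offsets hold gate values while odd offsets hold up values. Both are assumptions for illustration, and splitGateUp is a hypothetical helper rather than a function from the converter.

```go
package main

import "fmt"

// splitGateUp de-interleaves a flat buffer whose elements alternate
// gate, up, gate, up, ... Even offsets go to gate and odd offsets to up.
func splitGateUp(interleaved []float32) (gate, up []float32) {
	gate = make([]float32, 0, (len(interleaved)+1)/2)
	up = make([]float32, 0, len(interleaved)/2)
	for i, v := range interleaved {
		if i%2 == 0 {
			gate = append(gate, v)
		} else {
			up = append(up, v)
		}
	}
	return gate, up
}

func main() {
	gate, up := splitGateUp([]float32{1, 10, 2, 20, 3, 30})
	fmt.Println(gate, up) // prints: [1 2 3] [10 20 30]
}
```

In slice notation this corresponds to taking elements [0::2] for one tensor and [1::2] for the other along the interleaved dimension.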

Usage Examples

// Converter registered for GPT-OSS architecture
// MXFP4 tensors are assembled from .blocks and .scales components
// with nibble reordering: a1b2c3...x7y8z9 -> 71xa82yb93zc
// gate_up_exps are split via stride-2 slicing
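The nibble reordering noted in the comments above can be read as follows: byte j in the first half of a block is paired with byte j in the second half, their low nibbles form one output byte, and their high nibbles form the next. The sketch below implements that reading. The reorderNibbles helper and its blockSize parameter are hypothetical, and the pairing of block halves is an interpretation of the a1b2c3...x7y8z9 -> 71xa82yb93zc notation rather than confirmed converter code. For MXFP4, a block of 32 four-bit values packs into 16 bytes, so 16 would be the natural block size. The demo uses a 6-byte block to mirror the notation.

```go
package main

import "fmt"

// reorderNibbles rewrites buf in place, one block at a time.
// Within a block, byte j of the first half (a) is paired with byte j of the
// second half (b): output byte 2j holds b's low nibble (high) and a's low
// nibble (low); output byte 2j+1 holds b's high nibble and a's high nibble.
// blockSize must be even; trailing bytes that do not fill a block are left as-is.
func reorderNibbles(buf []byte, blockSize int) {
	half := blockSize / 2
	tmp := make([]byte, blockSize)
	for i := 0; i+blockSize <= len(buf); i += blockSize {
		for j := 0; j < half; j++ {
			a, b := buf[i+j], buf[i+j+half]
			tmp[2*j] = (a & 0x0F) | (b << 4)
			tmp[2*j+1] = (a >> 4) | (b & 0xF0)
		}
		copy(buf[i:i+blockSize], tmp)
	}
}

func main() {
	// Concrete stand-ins for a1 b2 c3 ... x7 y8 z9.
	buf := []byte{0xa1, 0xb2, 0xc3, 0xd7, 0xe8, 0xf9}
	reorderNibbles(buf, len(buf))
	fmt.Printf("%x\n", buf) // prints: 71da82eb93fc
}
```

Using a temporary buffer per block keeps the transform safe to run in place, since each output byte depends on two input bytes at different offsets.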
