Implementation:Ollama Ollama Imagegen GPT OSS

From Leeroopedia
Knowledge Sources
Domains Image Generation, LLM Inference
Last Updated 2025-02-15 00:00 GMT

Overview

Implements the GPT-OSS model architecture for MLX inference with custom SwiGLU activation, YaRN RoPE scaling, and optional Mixture of Experts.

Description

The gpt_oss.go file implements the GPT-OSS transformer. Its custom SwiGLU activation clips both inputs (gate to [0, limit], up to [-limit, limit]) and scales the sigmoid argument by a fixed alpha=1.702. The SwiGLU function is compiled once and held as a singleton CompiledFunc; because it is compiled shapeless, every layer can reuse it regardless of input shape. The model supports YaRN RoPE frequency scaling for extended context (via yarn_find_correction_dim/range), attention sinks for sliding-window models, and optional MoE layers (specified via the layer_types config). The Config exposes sliding_window, num_local_experts, and a per-layer type list for hybrid dense/MoE architectures.

Usage

Used for text generation with GPT-OSS models in the MLX engine, supporting YaRN extended context and hybrid MoE architectures.

Code Reference

Source Location

  • Repository: Ollama
  • File: x/imagegen/models/gpt_oss/gpt_oss.go
  • Lines: 1-487

Signature

type Config struct {
	HiddenSize       int32        `json:"hidden_size"`
	NumHiddenLayers  int32        `json:"num_hidden_layers"`
	NumLocalExperts  int32        `json:"num_local_experts"`
	NumExpertsPerTok int32        `json:"num_experts_per_tok"`
	LayerTypes       []string     `json:"layer_types"`
	SwiGLULimit      float32      `json:"swiglu_limit"`
	RopeScaling      *RopeScaling `json:"rope_scaling"`
}

func swiGLU(gate, up *mlx.Array, alpha, limit float32) *mlx.Array
func ComputeYarnFreqs(dims int32, base, scalingFactor float32, origMaxPos int32, betaFast, betaSlow float32) (*mlx.Array, float32)
func getCompiledSwiGLU() *mlx.CompiledFunc
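
The relationship between swiGLU and getCompiledSwiGLU can be sketched on scalars with the standard library only. The sketch assumes the published GPT-OSS formulation, out = g * sigmoid(alpha * g) * (u + 1), with the gate clamped from above only; the Description gives the gate range as [0, limit], so the lower bound used in gpt_oss.go should be confirmed against the source. swiGLUScalar, compiledSwiGLU, and buildCount are hypothetical names, and sync.Once stands in for whatever mechanism the file uses to compile the function a single time.

```go
package main

import (
	"fmt"
	"math"
	"sync"
)

// activation has the same parameter order as swiGLU(gate, up, alpha, limit),
// but operates on scalars instead of *mlx.Array values.
type activation func(gate, up, alpha, limit float64) float64

func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

// swiGLUScalar clips the inputs, then computes g * sigmoid(alpha*g) * (u + 1).
// Assumption: the gate is clipped from above only.
func swiGLUScalar(gate, up, alpha, limit float64) float64 {
	g := math.Min(gate, limit)
	u := math.Max(-limit, math.Min(up, limit))
	return g * sigmoid(alpha*g) * (u + 1)
}

var (
	buildOnce  sync.Once
	built      activation
	buildCount int // counts how often the "compile" step ran
)

// compiledSwiGLU builds the activation on first use and returns the same
// function value on every later call, so all layers share one instance.
func compiledSwiGLU() activation {
	buildOnce.Do(func() {
		buildCount++ // stands in for an expensive compile step
		built = swiGLUScalar
	})
	return built
}

func main() {
	for layer := 0; layer < 3; layer++ {
		act := compiledSwiGLU()
		fmt.Println(layer, act(10, 10, 1.702, 7))
	}
	fmt.Println("builds:", buildCount)
}
```

Inputs beyond the limit produce the same output as inputs exactly at the limit, which bounds the activation regardless of how large the projections grow.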

Import

import "github.com/ollama/ollama/x/imagegen/models/gpt_oss"

I/O Contract

Inputs

Name Type Required Description
modelPath string Yes Directory with model weights and config
tokens *mlx.Array Yes Input token IDs [B, L]
caches []cache.Cache Yes KV caches per layer
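
Reading the config found under modelPath can be sketched as ordinary JSON decoding into the fields listed in the Signature section. This is a sketch under stated assumptions: parseConfig and its two sanity checks are hypothetical, the RopeScaling struct models a single assumed field, and the JSON values (including the layer_types strings) are illustrative rather than taken from this repository.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RopeScaling is a stand-in. Only Factor is modeled, and its JSON name is an
// assumption rather than something this page documents.
type RopeScaling struct {
	Factor float32 `json:"factor"`
}

// Config repeats the fields and JSON tags listed in the Signature section.
type Config struct {
	HiddenSize       int32        `json:"hidden_size"`
	NumHiddenLayers  int32        `json:"num_hidden_layers"`
	NumLocalExperts  int32        `json:"num_local_experts"`
	NumExpertsPerTok int32        `json:"num_experts_per_tok"`
	LayerTypes       []string     `json:"layer_types"`
	SwiGLULimit      float32      `json:"swiglu_limit"`
	RopeScaling      *RopeScaling `json:"rope_scaling"`
}

// parseConfig decodes the JSON and applies two hypothetical sanity checks.
func parseConfig(data []byte) (*Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("decode config: %w", err)
	}
	if n := int32(len(cfg.LayerTypes)); n != 0 && n != cfg.NumHiddenLayers {
		return nil, fmt.Errorf("layer_types has %d entries, want %d", n, cfg.NumHiddenLayers)
	}
	if cfg.NumExpertsPerTok > cfg.NumLocalExperts {
		return nil, fmt.Errorf("num_experts_per_tok exceeds num_local_experts")
	}
	return &cfg, nil
}

func main() {
	raw := []byte(`{
		"hidden_size": 2880,
		"num_hidden_layers": 2,
		"num_local_experts": 32,
		"num_experts_per_tok": 4,
		"layer_types": ["sliding_attention", "full_attention"],
		"swiglu_limit": 7.0
	}`)
	cfg, err := parseConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// RopeScaling is a pointer, so it stays nil when the key is absent.
	fmt.Println(cfg.NumHiddenLayers, cfg.LayerTypes, cfg.RopeScaling == nil)
}
```

Keeping RopeScaling as a pointer lets the loader distinguish a config without rope_scaling from one that sets it to zero values.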

Outputs

Name Type Description
logits *mlx.Array Logits [B, L, vocab_size]
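
To make the [B, L, vocab_size] layout concrete, the following sketch picks the highest-scoring token at the last sequence position of each batch row. It assumes the logits have already been copied into a flat, row-major Go slice; how an *mlx.Array is read out is not covered here, and lastTokenArgmax is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// lastTokenArgmax treats logits as a row-major [b, l, v] tensor and returns,
// for each batch row, the vocabulary index with the largest logit at the
// final sequence position.
func lastTokenArgmax(logits []float32, b, l, v int) ([]int, error) {
	if b <= 0 || l <= 0 || v <= 0 || len(logits) != b*l*v {
		return nil, fmt.Errorf("logits length %d does not match shape [%d, %d, %d]", len(logits), b, l, v)
	}
	out := make([]int, b)
	for i := 0; i < b; i++ {
		start := (i*l + (l - 1)) * v // offset of the last position in row i
		row := logits[start : start+v]
		best := 0
		for j, x := range row {
			if x > row[best] {
				best = j
			}
		}
		out[i] = best
	}
	return out, nil
}

func main() {
	// Shape [2, 2, 3]: two batch rows, two positions, three vocabulary entries.
	logits := []float32{
		0, 1, 2, 5, 9, 1,
		3, 2, 1, 0, 0, 7,
	}
	ids, err := lastTokenArgmax(logits, 2, 2, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ids)
}
```

Greedy selection is only one way to consume the logits; sampling strategies read the same last-position slice.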

Usage Examples

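// Load the model weights and config from a local directory.
model, err := gpt_oss.Load("/path/to/gpt-oss-model")
if err != nil {
    return err
}

// Create the per-layer KV caches, then run a forward pass.
// tokens is an *mlx.Array of input token IDs with shape [B, L].
caches := model.NewCache(0)
logits := model.Forward(tokens, caches) // logits has shape [B, L, vocab_size]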
