Implementation:Ollama Ollama Imagegen Flux2 RoPE

From Leeroopedia
Knowledge Sources
Domains: Image Generation, Positional Encoding
Last Updated: 2025-02-15 00:00 GMT

Overview

Implements 4D Rotary Position Embeddings (RoPE) for the FLUX.2 Klein diffusion transformer with separate axes for time, height, width, and sequence.

Description

The rope.go file computes 4D RoPE embeddings for the FLUX.2 architecture, which uses four positional axes (T, H, W, L) with a configurable number of dimensions per axis (typically [32, 32, 32, 32], which sums to 128-dimensional heads). It provides functions that create position IDs for three kinds of tokens: text tokens (T=0, H=0, W=0, L=sequence index), latent image tokens (T=0, H=row, W=col, L=0), and reference image tokens (T=scale*(i+1) for the i-th image, which keeps the reference images apart along the T axis). The ComputeRoPE function computes sinusoidal frequencies for each axis, expands them with repeat_interleave, and concatenates the results across all four axes. A RoPECache struct stores precomputed cos/sin values so they can be reused.
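The position ID layouts described above can be illustrated with plain Go slices. This is a minimal sketch, not the code in rope.go: the real functions return *mlx.Array values, and the names posID, textIDs, latentIDs, and referenceT are hypothetical. Row-major ordering of the latent grid and the scale value of 10 used in main are assumptions made only for illustration.

```go
package main

import "fmt"

// posID is one (T, H, W, L) coordinate row. The type name is hypothetical.
type posID [4]int32

// textIDs follows the text-token layout: only the L coordinate varies.
func textIDs(seqLen int32) []posID {
	ids := make([]posID, 0, seqLen)
	for l := int32(0); l < seqLen; l++ {
		ids = append(ids, posID{0, 0, 0, l})
	}
	return ids
}

// latentIDs follows the latent-token layout: H is the row, W is the column.
// Row-major ordering is an assumption of this sketch.
func latentIDs(height, width int32) []posID {
	ids := make([]posID, 0, height*width)
	for h := int32(0); h < height; h++ {
		for w := int32(0); w < width; w++ {
			ids = append(ids, posID{0, h, w, 0})
		}
	}
	return ids
}

// referenceT returns the T coordinate of the i-th reference image (0-based),
// so that each reference image gets its own offset along the T axis.
func referenceT(i, scale int32) int32 {
	return scale * (i + 1)
}

func main() {
	fmt.Println(textIDs(3))
	fmt.Println(latentIDs(2, 2))
	// The scale value 10 is illustrative only.
	fmt.Println(referenceT(0, 10), referenceT(1, 10))
}
```

Because text tokens vary only along L and latent tokens only along H and W, the two token types never share a non-zero coordinate on the same axis.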

Usage

The FLUX.2 transformer uses these embeddings during the forward pass to apply positional encodings to the query and key tensors in both the dual-stream and single-stream attention blocks.

Code Reference

Source Location

  • Repository: Ollama
  • File: x/imagegen/models/flux2/rope.go
  • Lines: 1-224

Signature

type RoPEConfig struct {
	Theta    int32
	AxesDims []int32
}

type RoPECache struct {
	Cos      *mlx.Array
	Sin      *mlx.Array
	TextLen  int32
	ImageLen int32
}

func PrepareTextIDs(seqLen int32) *mlx.Array
func PrepareLatentIDs(height, width int32) *mlx.Array
func PrepareImageIDs(imageHeights, imageWidths []int32, scale int32) *mlx.Array
func ComputeRoPE(ids *mlx.Array, axesDims []int32, theta int32) (*mlx.Array, *mlx.Array)

Import

import "github.com/ollama/ollama/x/imagegen/models/flux2"

I/O Contract

Inputs

Name | Type | Required | Description
ids | *mlx.Array | Yes | Position IDs [L, 4] with (T, H, W, L) coordinates
axesDims | []int32 | Yes | Dimensions per axis (e.g., [32, 32, 32, 32])
theta | int32 | Yes | Base frequency (2000 for Klein)

Outputs

Name | Type | Description
cos | *mlx.Array | Cosine embeddings [1, L, 1, head_dim]
sin | *mlx.Array | Sine embeddings [1, L, 1, head_dim]

Usage Examples

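// Position IDs for 77 text tokens followed by a 64x64 latent grid.
textIDs := flux2.PrepareTextIDs(77)
latentIDs := flux2.PrepareLatentIDs(64, 64)
allIDs := mlx.Concatenate([]*mlx.Array{textIDs, latentIDs}, 0)

// Four axes of 32 dimensions each (128-dimensional heads), theta = 2000.
cos, sin := flux2.ComputeRoPE(allIDs, []int32{32, 32, 32, 32}, 2000)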
