Implementation:Ollama Ollama Imagegen Flux2 RoPE
| Knowledge Sources | |
|---|---|
| Domains | Image Generation, Positional Encoding |
| Last Updated | 2025-02-15 00:00 GMT |
Overview
Implements 4D Rotary Position Embeddings (RoPE) for the FLUX.2 Klein diffusion transformer with separate axes for time, height, width, and sequence.
Description
The rope.go file computes 4D RoPE embeddings specific to the FLUX.2 architecture, which uses four positional axes (T, H, W, L) with a configurable number of head dimensions per axis (typically [32, 32, 32, 32], which sums to the 128-dimensional attention head). It provides functions that create position IDs for three kinds of token: text tokens (T=0, H=0, W=0, L=sequence index), latent image tokens (T=0, H=row, W=col, L=0), and reference image tokens, whose T coordinate is scale*(i+1) for the i-th reference image so that each reference image is separated from the others and from the latents at T=0. The ComputeRoPE function computes sinusoidal frequencies for each axis, expands them to the full per-axis dimension with repeat_interleave, and concatenates the results across all four axes. A RoPECache struct stores the precomputed cos/sin values so they can be reused instead of recomputed.
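The three position ID layouts described above can be illustrated with plain Go slices. This is a minimal sketch, not the code in rope.go: the `PositionID` type and the `textIDs`, `latentIDs`, and `referenceIDs` helpers are hypothetical names, the real functions return `*mlx.Array` values, and the row-major token order and the H=row, W=col, L=0 layout for reference images are assumptions.

```go
package main

// PositionID holds one token's (T, H, W, L) coordinates.
// Hypothetical type for illustration; rope.go works with *mlx.Array.
type PositionID [4]int32

// textIDs gives text tokens T=0, H=0, W=0 and L equal to the sequence index.
func textIDs(seqLen int32) []PositionID {
	ids := make([]PositionID, 0, seqLen)
	for i := int32(0); i < seqLen; i++ {
		ids = append(ids, PositionID{0, 0, 0, i})
	}
	return ids
}

// latentIDs gives latent image tokens T=0, H=row, W=col, L=0.
// Row-major ordering is an assumption.
func latentIDs(height, width int32) []PositionID {
	ids := make([]PositionID, 0, height*width)
	for r := int32(0); r < height; r++ {
		for c := int32(0); c < width; c++ {
			ids = append(ids, PositionID{0, r, c, 0})
		}
	}
	return ids
}

// referenceIDs gives the i-th reference image T=scale*(i+1).
// Using H=row, W=col, L=0 for these tokens is an assumption.
func referenceIDs(heights, widths []int32, scale int32) []PositionID {
	var ids []PositionID
	for i := range heights {
		t := scale * int32(i+1)
		for r := int32(0); r < heights[i]; r++ {
			for c := int32(0); c < widths[i]; c++ {
				ids = append(ids, PositionID{t, r, c, 0})
			}
		}
	}
	return ids
}
```

For example, `latentIDs(2, 3)` yields six IDs, one per grid cell, and every ID from `referenceIDs` has a nonzero T whenever scale is nonzero, which keeps reference tokens apart from the latents at T=0.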
Usage
Used by the FLUX.2 transformer during the forward pass to apply positional encodings to query and key tensors in both dual-stream and single-stream attention blocks.
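How the cos/sin tables act on a query or key vector can be sketched for a single head and a single token. This example assumes the interleaved-pair convention, in which adjacent elements (x[2i], x[2i+1]) are rotated together; that convention is consistent with tables built by repeat_interleave, but whether rope.go rotates pairs in exactly this way is not stated in the source. `applyRoPE` is a hypothetical helper operating on plain slices rather than `*mlx.Array`.

```go
package main

// applyRoPE rotates each adjacent pair (x[2i], x[2i+1]) by the angle
// encoded in cos and sin. All three slices have length head_dim, and
// cos/sin hold each value twice in adjacent slots (repeat_interleave).
func applyRoPE(x, cos, sin []float64) []float64 {
	out := make([]float64, len(x))
	for i := 0; i+1 < len(x); i += 2 {
		a, b := x[i], x[i+1]
		out[i] = a*cos[i] - b*sin[i]       // real part of the rotation
		out[i+1] = b*cos[i+1] + a*sin[i+1] // imaginary part of the rotation
	}
	return out
}
```

Because the rotation depends only on the token's position, the same function would be applied to both the query and the key before the attention scores are computed.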
Code Reference
Source Location
- Repository: Ollama
- File: x/imagegen/models/flux2/rope.go
- Lines: 1-224
Signature
type RoPEConfig struct {
	Theta    int32
	AxesDims []int32
}
type RoPECache struct {
	Cos      *mlx.Array
	Sin      *mlx.Array
	TextLen  int32
	ImageLen int32
}
func PrepareTextIDs(seqLen int32) *mlx.Array
func PrepareLatentIDs(height, width int32) *mlx.Array
func PrepareImageIDs(imageHeights, imageWidths []int32, scale int32) *mlx.Array
func ComputeRoPE(ids *mlx.Array, axesDims []int32, theta int32) (*mlx.Array, *mlx.Array)
Import
import "github.com/ollama/ollama/x/imagegen/models/flux2"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
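| ids | *mlx.Array | Yes | Position IDs of shape [L, 4], where L is the total token count; each row holds one token's (T, H, W, L) coordinates, the last being the sequence index |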
| axesDims | []int32 | Yes | Dimensions per axis (e.g., [32, 32, 32, 32]) |
| theta | int32 | Yes | Base frequency (2000 for Klein) |
Outputs
| Name | Type | Description |
|---|---|---|
| cos | *mlx.Array | Cosine embeddings [1, L, 1, head_dim] |
| sin | *mlx.Array | Sine embeddings [1, L, 1, head_dim] |
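The contract above can be made concrete with a plain-Go sketch of the per-axis frequency computation. It assumes the usual RoPE frequency schedule, theta^(-2k/d) for k = 0 .. d/2-1 on an axis of dimension d, and drops the two singleton dimensions of the output shape. `ropeTables` is a hypothetical name, and the exact numerics in rope.go may differ.

```go
package main

import "math"

// ropeTables returns cos and sin tables of shape [len(ids)][sum(axesDims)].
// Assumes len(axesDims) == 4 and that every axis dimension is even.
func ropeTables(ids [][4]int32, axesDims []int32, theta float64) (cos, sin [][]float64) {
	cos = make([][]float64, len(ids))
	sin = make([][]float64, len(ids))
	for n, id := range ids {
		for axis, dim := range axesDims {
			half := int(dim) / 2
			for k := 0; k < half; k++ {
				// Assumed frequency schedule: theta^(-2k/dim).
				freq := math.Pow(theta, -float64(2*k)/float64(dim))
				angle := float64(id[axis]) * freq
				c, s := math.Cos(angle), math.Sin(angle)
				// repeat_interleave(2): each value fills two adjacent slots,
				// so an axis of dimension dim contributes dim entries.
				cos[n] = append(cos[n], c, c)
				sin[n] = append(sin[n], s, s)
			}
		}
	}
	return cos, sin
}
```

With axesDims of [32, 32, 32, 32], each row has 128 entries, matching head_dim in the outputs table; entries 0-31 come from T, 32-63 from H, 64-95 from W, and 96-127 from L.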
Usage Examples
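// Position IDs for 77 text tokens and a 64x64 latent grid (4096 tokens).
textIDs := flux2.PrepareTextIDs(77)
latentIDs := flux2.PrepareLatentIDs(64, 64)
// Concatenate along the token axis: text first, then latents.
allIDs := mlx.Concatenate([]*mlx.Array{textIDs, latentIDs}, 0)
// Per the I/O contract, cos and sin each have shape [1, 4173, 1, 128].
cos, sin := flux2.ComputeRoPE(allIDs, []int32{32, 32, 32, 32}, 2000)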