Implementation:Ollama Ollama Imagegen VAE Tiling

From Leeroopedia
Knowledge Sources
Domains Image Generation, VAE
Last Updated 2025-02-15 00:00 GMT

Overview

Implements tiled VAE decoding with overlap blending to reduce memory usage when generating large images.

Description

The tiling.go file provides DecodeTiled, which runs a large latent tensor through the VAE decoder in overlapping tiles, following the tiling implementation in the diffusers library. The algorithm has four phases:

  (1) Extract overlapping tiles from the latent tensor and decode each one independently with the supplied decoder function.
  (2) Blend adjacent decoded tiles by linear interpolation, vertically (blendV) and horizontally (blendH).
  (3) Compute the crop dimensions of each tile's non-overlapping region.
  (4) Assemble the final image by copying the cropped pixel data into the output buffer.

TilingConfig sets the tile size (default 64 latent pixels) and the overlap (default 16 latent pixels, 25% of the tile). The assembled result is converted from NHWC to NCHW layout and clamped to [0, 1].
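The blending phase can be illustrated with a minimal sketch on plain Go slices. It assumes the weighting used by the diffusers blend helpers (the weight on the incoming tile ramps linearly from 0 across the overlap); blendRow is a hypothetical one-dimensional stand-in and is not the blendH or blendV code in tiling.go, which operates on mlx arrays.

```go
package main

import "fmt"

// blendRow cross-fades the trailing `extent` values of prev into the leading
// `extent` values of next, in place. The weight on next ramps linearly from
// 0 to (extent-1)/extent, so the seam starts as pure prev and ends as mostly
// next. Hypothetical helper for illustration only.
func blendRow(prev, next []float32, extent int) {
	// Never blend more samples than either row actually holds.
	if extent > len(prev) {
		extent = len(prev)
	}
	if extent > len(next) {
		extent = len(next)
	}
	for x := 0; x < extent; x++ {
		w := float32(x) / float32(extent)
		next[x] = prev[len(prev)-extent+x]*(1-w) + next[x]*w
	}
}

func main() {
	prev := []float32{1, 1, 1, 1} // right edge of the left tile
	next := []float32{0, 0, 0, 0} // left edge of the right tile
	blendRow(prev, next, 4)
	fmt.Println(next) // prints [1 0.75 0.5 0.25]
}
```

Applying the same cross-fade along rows and then along columns removes the visible seam that independent tile decoding would otherwise leave.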

Usage

The Z-Image and FLUX.2 pipelines use tiled decoding when the requested image is larger than a single tile (512x512 pixels, which is the 64-latent tile at the 8x decoder scale), so that peak decoder memory stays bounded.
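The tile arithmetic behind these numbers can be sketched as follows. The sketch assumes tiles advance by a stride of tile size minus overlap, as in the diffusers convention, and takes the 8x latent-to-pixel scale from the output shape in the I/O contract. tileStarts is a hypothetical helper, not an identifier from tiling.go.

```go
package main

import "fmt"

// tileStarts returns the start offsets of tiles along one axis of length n,
// assuming each tile begins stride = tileSize - overlap after the previous
// one. Hypothetical helper for illustration only.
func tileStarts(n, tileSize, overlap int32) []int32 {
	stride := tileSize - overlap
	if stride <= 0 {
		return nil // overlap must be smaller than the tile
	}
	var starts []int32
	for s := int32(0); s < n; s += stride {
		starts = append(starts, s)
	}
	return starts
}

func main() {
	const scale = 8 // latent-to-pixel factor implied by the H*8, W*8 output

	// A 1024x1024 image corresponds to a 128x128 latent grid.
	fmt.Println(tileStarts(128, 64, 16)) // prints [0 48 96]

	// Tile, overlap, and stride expressed in output pixels.
	fmt.Println(64*scale, 16*scale, (64-16)*scale) // prints 512 128 384
}
```

Under these assumptions a 1024x1024 image needs three tile origins per axis, so nine decoder calls, each on a latent tile no larger than 64x64.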

Code Reference

Source Location

  • Repository: Ollama
  • File: x/imagegen/vae/tiling.go
  • Lines: 1-215

Signature

type TilingConfig struct {
	TileSize int32 // Tile size in latent space (default 64)
	Overlap  int32 // Overlap in latent space (default 16 = 25%)
}

func DefaultTilingConfig() *TilingConfig

func DecodeTiled(
	latents *mlx.Array,
	cfg *TilingConfig,
	decoder func(*mlx.Array) *mlx.Array,
) *mlx.Array

Import

import "github.com/ollama/ollama/x/imagegen/vae"

I/O Contract

Inputs

Name    | Type                        | Required | Description
latents | *mlx.Array                  | Yes      | Latent tensor [1, H, W, C] in NHWC format
cfg     | *TilingConfig               | Yes      | Tile size and overlap configuration
decoder | func(*mlx.Array) *mlx.Array | Yes      | Single-tile decoder function

Outputs

Name       | Type       | Description
*mlx.Array | *mlx.Array | Decoded image [1, 3, H*8, W*8] in NCHW format, clamped to [0, 1]
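The layout change and clamp in the output contract can be shown on a flat buffer. This is a sketch on plain Go slices under the assumption of a batch size of 1; nhwcToNCHWClamped is a hypothetical name, and the actual implementation works on mlx arrays rather than slices.

```go
package main

import "fmt"

// nhwcToNCHWClamped reorders a flat [1, H, W, C] buffer into [1, C, H, W]
// order and clamps every value to [0, 1]. Hypothetical helper for
// illustration only.
func nhwcToNCHWClamped(src []float32, h, w, c int) []float32 {
	dst := make([]float32, len(src))
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			for ch := 0; ch < c; ch++ {
				v := src[(y*w+x)*c+ch] // NHWC: channels are innermost
				if v < 0 {
					v = 0
				} else if v > 1 {
					v = 1
				}
				dst[(ch*h+y)*w+x] = v // NCHW: width is innermost
			}
		}
	}
	return dst
}

func main() {
	// A 1x2 image with 3 channels: pixel 0 = (-0.5, 0.25, 2), pixel 1 = (0.5, 0.75, 1).
	src := []float32{-0.5, 0.25, 2, 0.5, 0.75, 1}
	fmt.Println(nhwcToNCHWClamped(src, 1, 2, 3)) // prints [0 0.5 0.25 0.75 1 1]
}
```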

Usage Examples

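// Default configuration: 64-latent tiles with a 16-latent overlap.
cfg := vae.DefaultTilingConfig()

// latents is the [1, H, W, C] latent tensor from the denoising loop.
// myVAE.DecodeTile stands in for whatever decodes a single latent tile.
image := vae.DecodeTiled(latents, cfg, func(tile *mlx.Array) *mlx.Array {
	return myVAE.DecodeTile(tile)
})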
