Principle:Zai org CogVideo Tiled Image Upscaling

From Leeroopedia


Knowledge Sources
Domains: Image_Super_Resolution, Video_Generation
Last Updated: 2026-02-10 00:00 GMT

Overview

Tiled image upscaling divides a large input image into overlapping patches, runs each patch independently through a super-resolution model, and reassembles the upscaled patches with feathered blending, producing seamless output within a fixed memory budget.

Description

Super-resolution models trained on fixed-size patches (typically 64x64 to 512x512 pixels) cannot directly process arbitrarily large images due to GPU memory constraints. Tiled upscaling solves this by breaking the input into manageable tiles with controlled overlap regions, running each tile through the upscaling model independently, and then blending the results back together.

The key challenge is avoiding visible seams at tile boundaries. This is addressed through feathered blending: each tile's contribution is weighted by a mask that ramps linearly from a small positive weight at the tile edge to 1 in the interior, with the ramp spanning the overlap region. When the weighted tiles are summed and the sum is divided by the accumulated mask weights, overlapping tiles cross-fade into each other and no visible discontinuity remains.

The approach generalizes to arbitrary spatial dimensions, making it applicable to both 2D images and 3D volumetric data. The tile size, overlap size, and upscaling factor are all configurable parameters that allow trading off between processing speed (larger tiles), memory usage (smaller tiles), and blending quality (larger overlap).
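The trade-off between tile size and overlap can be made concrete with a little arithmetic. The sketch below is illustrative only: `tile_overhead` is a hypothetical helper, it assumes a stride of `tile - overlap` along each axis, and it counts every tile at full size, so the reported pixel ratio is an upper bound on the redundant work.

```python
import math


def tile_overhead(height, width, tile, overlap):
    """Return (tile_count, pixel_ratio) for one tiling configuration.

    pixel_ratio is the number of pixels sent through the model divided by
    the number of pixels in the image. Tiles are counted at full size, so
    this is an upper bound (border tiles are usually cropped).
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap
    rows = math.ceil(height / stride)
    cols = math.ceil(width / stride)
    count = rows * cols
    pixel_ratio = count * tile * tile / (height * width)
    return count, pixel_ratio


# Compare a few hypothetical configurations on a 1080x1920 frame.
for tile, overlap in [(512, 32), (256, 32), (256, 8)]:
    count, ratio = tile_overhead(1080, 1920, tile, overlap)
    print(f"tile={tile} overlap={overlap}: {count} tiles, {ratio:.2f}x pixels")
```

Smaller tiles lower the peak memory per model call but raise the tile count, and a larger overlap increases the share of pixels that are processed more than once.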

Usage

Use tiled upscaling whenever applying a super-resolution model to images larger than the model's native training resolution, or when GPU memory is insufficient to process the full image at once. This is standard practice in production super-resolution pipelines and is particularly important for video processing where each frame must be upscaled individually.

Theoretical Basis

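For a 2D image of size $H \times W$ with tile size $T$ and overlap $O$, tiles are placed on a grid with stride $T - O$. Along the height axis, the tile start positions are:

$$p_i = i\,(T - O), \qquad i = 0, 1, 2, \ldots, \left\lceil \frac{H}{T - O} \right\rceil - 1$$

The positions along the width axis follow the same formula with $W$ in place of $H$.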
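As a minimal sketch of the grid formula, the start positions along one axis can be generated with a strided `range`. The helper names are hypothetical, and cropping the last tile at the image border is an assumption made here for illustration; the source does not specify how border tiles are handled.

```python
def tile_positions(length, tile, overlap):
    """Start offsets p_i = i * (tile - overlap) along one axis."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    return list(range(0, length, tile - overlap))


def tile_extents(length, tile, overlap):
    """(start, size) pairs; tiles running past the end are cropped (assumption)."""
    return [(p, min(tile, length - p)) for p in tile_positions(length, tile, overlap)]


# One axis of length 200 with 64-pixel tiles and an 8-pixel overlap.
print(tile_positions(200, 64, 8))
print(tile_extents(200, 64, 8))
```

For a 2D image the same function is applied once per axis, and the tile grid is the Cartesian product of the two position lists.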

For each tile at position $(p_y, p_x)$, the feathering mask $M$ is constructed as:

$$M(y, x) = \alpha(y)\,\alpha(x)$$

where the ramp function along each dimension is:

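$$\alpha(t) = \begin{cases} \dfrac{t + 1}{O'} & \text{if } t < O' \\[4pt] \dfrac{S - t}{O'} & \text{if } t \ge S - O' \\[4pt] 1 & \text{otherwise} \end{cases}$$

Here $O' = O \cdot s$ is the overlap scaled by the upscale factor $s$, and $S$ is the tile's output size.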
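The ramp and the separable mask translate almost directly into code. This is a sketch under the definitions above, using plain Python lists; `ramp` and `feather_mask` are hypothetical names, `feather` stands for the scaled overlap O', and `t` is taken to be the pixel offset inside the output tile.

```python
def ramp(t, size, feather):
    """alpha(t) for pixel offset t inside an output tile of length `size`."""
    if t < feather:
        return (t + 1) / feather          # rising edge
    if t >= size - feather:
        return (size - t) / feather       # falling edge
    return 1.0                            # interior


def feather_mask(height, width, feather):
    """Separable 2D mask M(y, x) = alpha(y) * alpha(x)."""
    rows = [ramp(y, height, feather) for y in range(height)]
    cols = [ramp(x, width, feather) for x in range(width)]
    return [[a * b for b in cols] for a in rows]


# An 8x8 output tile with a 2-pixel feather.
for row in feather_mask(8, 8, 2):
    print(" ".join(f"{v:.2f}" for v in row))
```

The weight at the outermost pixel is 1/O' rather than 0, so every pixel keeps a strictly positive weight and the later normalization never divides by zero.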

$$I(y, x) = \frac{\sum_k M_k(y, x)\, T_k(y, x)}{\sum_k M_k(y, x)}$$

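where the sums are over all tiles $k$ that cover position $(y, x)$, and $T_k$ denotes the upscaled output of tile $k$ (not the tile size $T$). Because each tile's weight falls off linearly toward its edges, neighboring tiles cross-fade through the overlap region instead of meeting at a hard seam.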
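Putting the pieces together, the whole procedure can be sketched as an accumulate-then-normalize loop. This is an illustrative sketch, not the project's implementation: `nearest_upscale` is a stand-in for a real super-resolution model, all function names are hypothetical, the scale factor is assumed to be an integer, and border tiles are simply cropped.

```python
def ramp(t, size, feather):
    """Feathering weight alpha(t) along one axis of an output tile."""
    if t < feather:
        return (t + 1) / feather
    if t >= size - feather:
        return (size - t) / feather
    return 1.0


def nearest_upscale(tile, scale):
    """Stand-in for a super-resolution model: nearest-neighbour enlargement."""
    return [[v for v in row for _ in range(scale)]
            for row in tile for _ in range(scale)]


def tiled_upscale(image, upscale_fn, tile=4, overlap=2, scale=2):
    """Upscale `image` (a list of rows) tile by tile with feathered blending."""
    height, width = len(image), len(image[0])
    out_h, out_w = height * scale, width * scale
    acc = [[0.0] * out_w for _ in range(out_h)]     # sum of M_k * T_k
    weight = [[0.0] * out_w for _ in range(out_h)]  # sum of M_k
    feather = overlap * scale                       # O' = O * s
    stride = tile - overlap
    for py in range(0, height, stride):
        for px in range(0, width, stride):
            # Crop the tile at the image border (assumption).
            patch = [row[px:px + tile] for row in image[py:py + tile]]
            up = upscale_fn(patch, scale)
            th, tw = len(up), len(up[0])
            for y in range(th):
                wy = ramp(y, th, feather)
                for x in range(tw):
                    w = wy * ramp(x, tw, feather)
                    acc[py * scale + y][px * scale + x] += w * up[y][x]
                    weight[py * scale + y][px * scale + x] += w
    # Normalize by the accumulated mask weights.
    return [[a / w for a, w in zip(acc_row, w_row)]
            for acc_row, w_row in zip(acc, weight)]


# Sanity check: with a purely local upscaler, tiling should change nothing.
image = [[y * 10 + x for x in range(10)] for y in range(10)]
tiled = tiled_upscale(image, nearest_upscale, tile=4, overlap=2, scale=2)
full = nearest_upscale(image, 2)
max_err = max(abs(a - b) for tr, fr in zip(tiled, full) for a, b in zip(tr, fr))
print("max difference vs. untiled upscale:", max_err)
```

With a real model, tiles disagree slightly in the overlap region because each sees different context, and the feathered weights are what turn that disagreement into a gradual transition.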
