Principle:LaurentMazare Tch rs Transposed Convolution

From Leeroopedia
Revision as of 17:16, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/LaurentMazare_Tch_rs_Transposed_Convolution.md)

Knowledge Sources
Domains: Deep Learning, Computer Vision, Signal Processing
Last Updated: 2026-02-08 00:00 GMT

Overview

Transposed convolution maps feature maps from lower to higher spatial resolution by applying the transpose of a convolution operation, serving as the learnable upsampling component in encoder-decoder architectures.

Description

Transposed convolution (sometimes imprecisely called "deconvolution") computes the same linear map as the gradient of a standard convolution with respect to its input, that is, the backward pass of that convolution. While a standard convolution with stride greater than 1 reduces spatial dimensions, a transposed convolution with the same stride increases them. This makes it the natural choice for the decoder or upsampling portion of architectures that need to produce outputs at higher resolution than their intermediate representations.

Conceptually, the operation inserts s - 1 zeros between adjacent input elements (for stride s > 1), pads the borders with zeros, and then applies a stride-1 convolution with the spatially flipped kernel (with the input and output channel roles swapped). The kernel weights are learnable parameters, distinguishing transposed convolution from fixed upsampling methods like bilinear interpolation. This allows the network to learn task-specific upsampling behavior.
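
The zero-insertion view can be checked against the direct "scatter" view on a small 1D case. The following is a minimal sketch in plain Rust, not the tch-rs implementation; it assumes a single channel, no padding, and a non-empty input, and the function names are hypothetical.

```rust
/// Direct form: each input element adds a scaled copy of the kernel to the
/// output, with consecutive copies spaced `stride` positions apart.
/// Assumes no padding and non-empty `input` and `kernel`.
fn conv_transpose1d_scatter(input: &[f64], kernel: &[f64], stride: usize) -> Vec<f64> {
    assert!(!input.is_empty() && !kernel.is_empty() && stride >= 1);
    let out_len = (input.len() - 1) * stride + kernel.len();
    let mut out = vec![0.0; out_len];
    for (n, &x) in input.iter().enumerate() {
        for (j, &w) in kernel.iter().enumerate() {
            out[n * stride + j] += x * w;
        }
    }
    out
}

/// Equivalent form: insert `stride - 1` zeros between input elements, pad
/// both ends with `k - 1` zeros, then slide the reversed kernel with stride 1.
fn conv_transpose1d_zero_insert(input: &[f64], kernel: &[f64], stride: usize) -> Vec<f64> {
    assert!(!input.is_empty() && !kernel.is_empty() && stride >= 1);
    let k = kernel.len();
    let dilated_len = (input.len() - 1) * stride + 1;
    let mut padded = vec![0.0; dilated_len + 2 * (k - 1)];
    for (n, &x) in input.iter().enumerate() {
        padded[(k - 1) + n * stride] = x;
    }
    let flipped: Vec<f64> = kernel.iter().rev().copied().collect();
    // Stride-1 sliding dot product over every window of length k.
    padded
        .windows(k)
        .map(|w| w.iter().zip(&flipped).map(|(a, b)| a * b).sum::<f64>())
        .collect()
}
```

For an input of length 3, a kernel of length 3, and stride 2, both functions return the same 7-element output, which matches (i - 1) × s + k with no padding.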

Key parameters that control the output spatial dimensions include:

  • Kernel size -- The spatial extent of the learnable filter
  • Stride -- Determines the upsampling factor; a stride of 2 approximately doubles spatial dimensions
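  • Padding -- Trims p elements from each border of the output, so larger padding yields a smaller output (the opposite of its effect in standard convolution)
  • Output padding -- Adds extra size to one side of the output, resolving the ambiguity that arises because a strided standard convolution maps several input sizes to the same output size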

Transposed convolutions are widely used in image segmentation (mapping from features back to pixel-level predictions), generative models (producing images from latent vectors), and super-resolution (increasing image resolution).

Usage

Apply transposed convolution when:

  • Upsampling feature maps in decoder networks or generative architectures
  • The upsampling operation should be learnable rather than fixed
  • Building symmetric encoder-decoder architectures where each downsampling convolution has a corresponding upsampling layer
  • Producing dense per-pixel outputs from spatially reduced feature representations

Theoretical Basis

Relationship to Standard Convolution

If a standard convolution is represented as the matrix multiplication $y = Cx$, where $C$ is the convolution matrix, then the transposed convolution computes:

$x = C^{T} y$

This is not the inverse of convolution (it does not recover the original input), but rather the transpose of the linear mapping.
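
To make the matrix view concrete, the sketch below builds C for a small 1D strided convolution and applies both C and its transpose. It is an illustrative example in plain Rust with hypothetical helper names, assuming a single channel and no padding; it is not how tch-rs or libtorch compute the operation.

```rust
/// Build the (out_len x in_len) matrix C of a 1D convolution with the given
/// stride and no padding. Row r holds the kernel starting at column r * stride.
fn conv_matrix(in_len: usize, kernel: &[f64], stride: usize) -> Vec<Vec<f64>> {
    assert!(stride >= 1 && !kernel.is_empty() && in_len >= kernel.len());
    let out_len = (in_len - kernel.len()) / stride + 1;
    let mut c = vec![vec![0.0; in_len]; out_len];
    for (r, row) in c.iter_mut().enumerate() {
        for (j, &w) in kernel.iter().enumerate() {
            row[r * stride + j] = w;
        }
    }
    c
}

/// y = C x (standard convolution as a matrix product).
fn mat_vec(c: &[Vec<f64>], x: &[f64]) -> Vec<f64> {
    c.iter()
        .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f64>())
        .collect()
}

/// C^T y (transposed convolution as a matrix product).
fn mat_t_vec(c: &[Vec<f64>], y: &[f64]) -> Vec<f64> {
    let cols = c.first().map_or(0, |row| row.len());
    let mut out = vec![0.0; cols];
    for (row, &yr) in c.iter().zip(y) {
        for (o, &w) in out.iter_mut().zip(row) {
            *o += w * yr;
        }
    }
    out
}
```

With in_len = 5, kernel [1, 2, 1], and stride 2, the input [1, 2, 3, 4, 5] maps to [8, 16], and applying the transpose to [8, 16] gives [8, 16, 24, 32, 16]: the original length is restored, but the original values are not.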

Output Size Calculation

For a transposed convolution with input size i, kernel size k, stride s, padding p, and output padding op:

$o = (i - 1) \times s - 2p + k + op$

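This is the dual of the standard convolution output size formula, in which a convolution with the same kernel size, stride, and padding maps an input of size o back to an output of size i:

$i = \left\lfloor \frac{o + 2p - k}{s} \right\rfloor + 1$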
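
The two size formulas can be written as small helper functions. This is a minimal sketch in plain Rust with hypothetical function names; it assumes the arguments describe a valid configuration (i ≥ 1, and sizes that do not underflow).

```rust
/// Output length of a transposed convolution along one dimension:
/// o = (i - 1) * s - 2p + k + op. Assumes i >= 1 and a non-negative result.
fn conv_transpose_out_size(i: usize, k: usize, s: usize, p: usize, op: usize) -> usize {
    (i - 1) * s + k + op - 2 * p
}

/// Output length of the matching standard convolution applied to an input
/// of length `o`: floor((o + 2p - k) / s) + 1. Assumes o + 2p >= k.
fn conv_out_size(o: usize, k: usize, s: usize, p: usize) -> usize {
    // Integer division on usize performs the floor.
    (o + 2 * p - k) / s + 1
}
```

With k = 3, s = 2, p = 1, a standard convolution maps inputs of length 7 and length 8 to the same output length 4. Going the other way, conv_transpose_out_size(4, 3, 2, 1, 0) returns 7 and conv_transpose_out_size(4, 3, 2, 1, 1) returns 8, which is the ambiguity that output padding resolves.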

Checkerboard Artifacts

When the kernel size is not evenly divisible by the stride, transposed convolutions can produce checkerboard artifacts -- regular patterns in the output caused by uneven overlap of the kernel at different output positions. This can be mitigated by choosing kernel sizes that are multiples of the stride, or by using resize-convolution (bilinear upsampling followed by standard convolution) as an alternative.
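
The uneven overlap can be seen by counting how many kernel taps land on each output position. The sketch below is an illustrative 1D example in plain Rust with a hypothetical function name, assuming no padding and a non-empty input.

```rust
/// For a 1D transposed convolution without padding, count how many
/// (input element, kernel tap) pairs contribute to each output position.
fn overlap_counts(in_len: usize, k: usize, stride: usize) -> Vec<usize> {
    assert!(in_len >= 1 && k >= 1 && stride >= 1);
    let mut counts = vec![0usize; (in_len - 1) * stride + k];
    for n in 0..in_len {
        for j in 0..k {
            counts[n * stride + j] += 1;
        }
    }
    counts
}
```

For in_len = 4 and stride 2, a kernel of size 3 gives counts [1, 1, 2, 1, 2, 1, 2, 1, 1], so interior positions alternate between one and two contributions. A kernel of size 4 gives [1, 1, 2, 2, 2, 2, 2, 2, 1, 1], which is uniform away from the borders.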

Multi-Dimensional Extension

Transposed convolution generalizes to 1D (temporal upsampling), 2D (spatial upsampling for images), and 3D (volumetric upsampling for video or medical imaging) in the same manner, with the output size formula applied independently along each dimension.
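
Applying the formula per dimension can be sketched as follows. This is a hypothetical helper in plain Rust, assuming one entry per spatial dimension in every slice and a valid configuration; batch and channel dimensions are left out.

```rust
/// Spatial output shape of an N-dimensional transposed convolution, applying
/// o = (i - 1) * s - 2p + k + op independently along each dimension.
fn conv_transpose_out_shape(
    input: &[usize],
    kernel: &[usize],
    stride: &[usize],
    padding: &[usize],
    output_padding: &[usize],
) -> Vec<usize> {
    let n = input.len();
    assert!(
        kernel.len() == n && stride.len() == n && padding.len() == n && output_padding.len() == n,
        "all parameter slices need one entry per spatial dimension"
    );
    (0..n)
        .map(|d| (input[d] - 1) * stride[d] + kernel[d] + output_padding[d] - 2 * padding[d])
        .collect()
}
```

For a 2D input of 16 × 16 with kernel 4, stride 2, and padding 1 in both dimensions, the result is 32 × 32. For a 3D input of 8 × 16 × 16 with kernel (3, 4, 4), stride (1, 2, 2), and padding 1, the result is 8 × 32 × 32, so only the two spatial axes are upsampled.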
