Principle:Junyanz Pytorch CycleGAN and pix2pix Image Pool Stabilization

From Leeroopedia


Metadata
Knowledge Sources: pytorch-CycleGAN-and-pix2pix; Learning from Simulated and Unsupervised Images through Adversarial Training
Domains: Image-to-Image Translation, GAN Training Stabilization, Generative Adversarial Networks
Last Updated: 2026-02-09

Overview

A training stabilization technique that maintains a buffer of previously generated images to reduce discriminator oscillation during GAN training.

Description

The image pool technique addresses a common instability in GAN training: when the discriminator is updated only on the most recently generated images, it can oscillate rapidly as the generator changes. By maintaining a fixed-size buffer of previously generated images (50 images by default in CycleGAN), the discriminator sees a more diverse set of generated samples across update steps.

The pool operates with a simple stochastic policy:

  • If the pool is not yet full, the current image is added to the pool and returned as-is.
  • If the pool is full, then with 50% probability a randomly chosen stored image is returned and the current image takes its place in the pool; otherwise the current image is returned directly and the pool is left unchanged.
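The policy above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's implementation: the class name `SketchImagePool`, the `rng` parameter, and the use of arbitrary Python objects in place of image tensors are all assumptions made for the example.

```python
import random


class SketchImagePool:
    """Hypothetical history buffer; items stand in for generated images."""

    def __init__(self, pool_size, rng=None):
        self.pool_size = pool_size
        self.items = []
        # An injectable random source makes the behaviour reproducible.
        self.rng = rng or random.Random()

    def query(self, batch):
        """Return one item per input item, possibly drawn from history."""
        if self.pool_size == 0:
            # A pool of size 0 disables the mechanism entirely.
            return list(batch)
        out = []
        for item in batch:
            if len(self.items) < self.pool_size:
                # Pool not yet full: store the item and return it as-is.
                self.items.append(item)
                out.append(item)
            elif self.rng.random() < 0.5:
                # Pool full, swap branch: return a stored item and put
                # the current item in its slot.
                idx = self.rng.randrange(self.pool_size)
                out.append(self.items[idx])
                self.items[idx] = item
            else:
                # Pool full, pass-through branch: pool is unchanged.
                out.append(item)
        return out
```

A stored item leaves the pool only by being returned, so no generated image is ever shown to the discriminator twice through this path, and the pool never grows beyond `pool_size`.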

This means the discriminator is sometimes trained on older generated images rather than always seeing the latest output of the generator. The intended effect is to damp abrupt changes in the discriminator's training signal and to keep it from overfitting to whatever the generator is producing at the current step.

Usage

The image pool is used only by CycleGAN, which operates in an unpaired setting where training is more prone to instability. In pix2pix, the pool size is set to 0, effectively disabling this mechanism, since the paired supervision provides sufficient training signal stability.

In CycleGAN, two separate image pools are maintained: one for fake_A images (generated by G_B) and one for fake_B images (generated by G_A). The pools are queried when computing discriminator losses.
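As a rough sketch of how two independent pools might sit in a training loop, the example below keeps one list per direction and queries each before the corresponding discriminator update. The helper `query_pool`, the tuple stand-ins for generator outputs, and the list names are hypothetical; real code would pass image tensors and compute losses on the returned values.

```python
import random

POOL_SIZE = 50  # pool size stated in the text


def query_pool(pool, item, rng):
    """Return the item a discriminator should see; may mutate `pool`."""
    if len(pool) < POOL_SIZE:
        pool.append(item)
        return item
    if rng.random() < 0.5:
        idx = rng.randrange(POOL_SIZE)
        old, pool[idx] = pool[idx], item
        return old
    return item


rng = random.Random(0)
fake_A_pool, fake_B_pool = [], []      # one pool per translation direction
d_input_fake_A, d_input_fake_B = [], []

for step in range(200):
    fake_B = ("fake_B", step)  # stand-in for an image generated by G_A
    fake_A = ("fake_A", step)  # stand-in for an image generated by G_B
    # Each discriminator loss would be computed on the queried image,
    # which may be the current one or an older one from the same pool.
    d_input_fake_B.append(query_pool(fake_B_pool, fake_B, rng))
    d_input_fake_A.append(query_pool(fake_A_pool, fake_A, rng))
```

Keeping the pools separate ensures that a discriminator for one domain is never fed historical images from the other domain.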

Theoretical Basis

This technique was introduced by Shrivastava et al. in "Learning from Simulated and Unsupervised Images through Adversarial Training" (arXiv:1612.07828). The original paper proposed maintaining a history of refined images to stabilize adversarial training when learning to refine synthetic images.

The motivation is closely analogous to experience replay in reinforcement learning, where training on a buffer of past experiences rather than only the most recent transition reduces correlation between consecutive updates and stabilizes learning.

Without an image pool, the discriminator can quickly overfit to the generator's current output distribution, providing poor gradient signal. The pool ensures the discriminator remains calibrated against a broader distribution of generated images, providing more informative gradients to the generator.

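The pool size of 50 follows the CycleGAN paper, which keeps a buffer of the 50 previously generated images, and the 50% swap probability is the policy described above. Both are practical defaults rather than theoretically derived values; they trade stability against responsiveness to the generator's improving quality.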
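As a back-of-the-envelope illustration of what these two numbers imply (a derivation under stated assumptions, not a result from either paper), let N = 50 be the pool size and p = 0.5 the swap probability, and assume the pool is full, the replaced slot is chosen uniformly, and draws are independent:

```latex
\[
P(\text{a given stored image is replaced on one query})
  = p \cdot \frac{1}{N} = \frac{0.5}{50} = 0.01
\]
\[
P(\text{it is still stored after } k \text{ further queries})
  = \left(1 - \frac{p}{N}\right)^{k},
\qquad
\mathbb{E}[\text{queries until it is replaced}] = \frac{N}{p} = 100
\]
```

Under these assumptions, about half of the images the discriminator sees come from history, and a stored image waits on average 100 incoming images before it is returned. A larger pool or a smaller swap probability lengthens that delay, which is one way to read the stability versus responsiveness trade-off.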
