Heuristic:Junyanz Pytorch CycleGAN and pix2pix Batch Size One Default

From Leeroopedia





Knowledge Sources
Domains: Optimization, GAN_Training
Last Updated: 2026-02-09 16:00 GMT

Overview

The CycleGAN and pix2pix papers use a batch size of 1 for all experiments, and the codebase keeps that value as its default. Increasing the batch size can change results, even with instance normalization.

Description

The original CycleGAN and pix2pix papers use a batch size of 1 for all experiments, and the codebase default reflects this (`--batch_size 1`). Larger batch sizes are possible when GPU memory permits, but they can change training dynamics even with instance normalization. The authors note that increasing `--crop_size` may be a good alternative to increasing the batch size when extra GPU memory is available. At test time, the batch size is hard-coded to 1.

Usage

Consider this heuristic when configuring training. If you have extra GPU memory, prefer increasing `--crop_size` over `--batch_size`. If you do increase the batch size for multi-GPU training, use instance normalization (`--norm instance`) or synchronized batch normalization (`--norm syncbatch`) rather than standard batch normalization.
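The guidance above can be expressed as a small pre-flight check. The sketch below is only an illustration: `check_norm_choice` is a hypothetical helper, not part of the repository, and `num_gpus` is a value you would supply yourself. It assumes only the `--batch_size` and `--norm` option names and the `batch`, `instance`, and `syncbatch` values mentioned on this page.

```python
def check_norm_choice(batch_size, norm, num_gpus):
    """Return advisory notes for a training configuration (hypothetical helper)."""
    notes = []
    if num_gpus > 1 and norm == "batch":
        # Per docs/tips.md, default batch norm does not work well on multiple GPUs.
        notes.append(
            "standard batch norm does not work well with multi-GPU training; "
            "use --norm instance or --norm syncbatch"
        )
    if batch_size > 1:
        # The paper setting is batch_size = 1; larger values can change results.
        notes.append(
            "batch_size > 1 can change results, even with instance norm; "
            "consider a larger --crop_size instead"
        )
    return notes


if __name__ == "__main__":
    for note in check_norm_choice(batch_size=4, norm="batch", num_gpus=2):
        print("note:", note)
```

The helper returns notes rather than raising, since a larger batch size is permitted; it simply moves the run away from the paper's setting.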

The Insight (Rule of Thumb)

  • Action: Keep `--batch_size 1` (default) for single-GPU training.
  • Value: `batch_size = 1` for training; hard-coded `batch_size = 1` for testing.
  • Trade-off: Larger batch sizes can improve GPU utilization but may alter training dynamics and produce different results. Standard batchnorm does not work well with multi-GPU training.
  • Alternative: Increase `--crop_size` instead of batch size to use available GPU memory more effectively.
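As a rough illustration of the alternative above: for a fully convolutional network, activation memory grows approximately with the number of pixels processed per step, which is `batch_size * crop_size**2` for square crops. This is a back-of-the-envelope assumption, not a measurement from the repository. `pixels_per_step` is a hypothetical helper, and the sizes 256 and 512 are example values.

```python
def pixels_per_step(batch_size, crop_size):
    """Pixels per training step for square crops (rough proxy for activation memory)."""
    return batch_size * crop_size * crop_size


baseline = pixels_per_step(batch_size=1, crop_size=256)
bigger_batch = pixels_per_step(batch_size=4, crop_size=256)
bigger_crop = pixels_per_step(batch_size=1, crop_size=512)

# Both options process 4x the pixels of the baseline, so they should land in a
# similar memory range, but only the second keeps the paper's batch size of 1.
print(bigger_batch // baseline, bigger_crop // baseline)  # prints "4 4"
```

Actual memory use also depends on the model, optimizer state, and framework overhead, so treat the ratio as a starting estimate only.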

Reasoning

GANs are sensitive to batch statistics. With batch normalization, the mean and variance computed over a single image can differ substantially from those computed over a larger batch, which changes the dynamics of both the discriminator and the generator. Instance normalization (the CycleGAN default) computes statistics per instance, so its forward pass does not depend on which other images share the batch. Even so, the authors observe that different batch sizes can lead to different results with instance normalization. The test script sets `batch_size = 1` because, per its own comment, the test code only supports that value.
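The difference between the two normalization schemes can be seen in a toy, standard-library-only sketch. It is not the repository's code and does not use PyTorch: each "image" is a flat list of floats standing in for one channel, and `instance_norm` and `batch_norm` are simplified illustrations without learned scale or shift.

```python
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability


def normalize(values, mean, var):
    return [(v - mean) / (var + EPS) ** 0.5 for v in values]


def instance_norm(batch):
    """Normalize each sample with its own mean and variance."""
    return [normalize(x, fmean(x), pvariance(x)) for x in batch]


def batch_norm(batch):
    """Normalize every sample with statistics pooled over the whole batch."""
    pooled = [v for x in batch for v in x]
    mean, var = fmean(pooled), pvariance(pooled)
    return [normalize(x, mean, var) for x in batch]


image_a = [0.0, 1.0, 2.0, 3.0]      # toy single-channel "image"
image_b = [10.0, 10.5, 11.0, 11.5]  # a batch mate with different statistics

# Instance norm: image_a's output is the same alone or alongside image_b.
assert instance_norm([image_a])[0] == instance_norm([image_a, image_b])[0]

# Batch norm: image_a's output changes once image_b joins the batch.
assert batch_norm([image_a])[0] != batch_norm([image_a, image_b])[0]
```

This covers the forward pass only. It does not account for the authors' observation that results can still vary with instance normalization. One plausible factor is that a larger batch size changes how many parameter updates are made per epoch and how gradients are averaged.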

From `docs/tips.md`:

"For all experiments in the paper, we set the batch size to be 1. If there is room for memory, you can use higher batch size with batch norm or instance norm. (Note that the default batchnorm does not work well with multi-GPU training.) But please be aware that it can impact the training. In particular, even with Instance Normalization, different batch sizes can lead to different results. Moreover, increasing --crop_size may be a good alternative to increasing the batch size."

Code evidence from `options/base_options.py:44`:

parser.add_argument("--batch_size", type=int, default=1,
    help="input batch size")

Code evidence from `test.py:50`:

opt.batch_size = 1  # test code only supports batch_size = 1
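The two snippets above interact in a simple way, which the following self-contained sketch reproduces with the standard `argparse` module. It mirrors the quoted `--batch_size` argument, but the surrounding script is illustrative and is not the repository's option-handling code.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--batch_size", type=int, default=1, help="input batch size")

# No flag given: the default of 1 applies.
train_opt = parser.parse_args([])
print(train_opt.batch_size)  # prints 1

# A user-supplied value is honoured at parse time...
test_opt = parser.parse_args(["--batch_size", "8"])
# ...but overwriting the attribute afterwards, as test.py does, forces it back to 1.
test_opt.batch_size = 1
print(test_opt.batch_size)  # prints 1
```

Because the override happens after parsing, passing `--batch_size` to a script that follows this pattern has no effect on the value it uses.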
