Heuristic:Junyanz Pytorch CycleGAN and pix2pix Batch Size One Default
| Knowledge Sources | |
|---|---|
| Domains | Optimization, GAN_Training |
| Last Updated | 2026-02-09 16:00 GMT |
Overview
The codebase defaults to a batch size of 1, matching all experiments in the CycleGAN and pix2pix papers; increasing the batch size can change results even with instance normalization.
Description
The original CycleGAN and pix2pix papers use a batch size of 1 for all experiments. This is reflected in the codebase default (`--batch_size 1`). While larger batch sizes are possible when GPU memory permits, they can lead to different training dynamics even when using instance normalization. The authors note that increasing `--crop_size` may be a better alternative to increasing batch size for utilizing additional GPU memory. At test time, batch size is hard-coded to 1.
Usage
Consider this heuristic when configuring training. If you have extra GPU memory, prefer increasing `--crop_size` over `--batch_size`. If you do increase batch size for multi-GPU training, use instance normalization (`--norm instance`) or synchronized batch normalization (`--norm syncbatch`) rather than standard batch normalization.
The Insight (Rule of Thumb)
- Action: Keep `--batch_size 1` (default) for single-GPU training.
- Value: `batch_size = 1` by default for training; hard-coded `batch_size = 1` for testing.
- Trade-off: Larger batch sizes can improve GPU utilization but may alter training dynamics and produce different results. Standard batchnorm does not work well with multi-GPU training.
- Alternative: Increase `--crop_size` instead of batch size to use available GPU memory more effectively.
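As a rough way to compare the two options, the sketch below estimates which crop size at batch size 1 covers about the same number of pixels per step as a larger batch of smaller crops. It assumes square crops and that activation memory grows roughly in proportion to pixel count, which is only an approximation for convolutional networks and not a measurement from this codebase. The helper names and the `multiple=4` rounding are hypothetical and do not come from the repository.

```python
import math


def pixels_per_step(batch_size: int, crop_size: int) -> int:
    """Pixels processed in one training step, assuming square crops."""
    return batch_size * crop_size * crop_size


def equivalent_crop_size(batch_size: int, crop_size: int, multiple: int = 4) -> int:
    """Largest crop size (rounded down to `multiple`) that, at batch size 1,
    covers no more pixels than `batch_size` crops of side `crop_size`.

    `multiple` is an illustrative assumption, not a documented constraint.
    """
    budget = pixels_per_step(batch_size, crop_size)
    side = math.isqrt(budget)  # integer square root of the pixel budget
    return (side // multiple) * multiple


if __name__ == "__main__":
    for bs in (1, 2, 4):
        print(f"batch_size={bs}, crop_size=256 -> crop_size={equivalent_crop_size(bs, 256)} at batch_size=1")
```

Actual memory use depends on the model and framework, so treat the result as a starting point and confirm it by running a short training job.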
Reasoning
GANs are sensitive to batch statistics. With batch normalization, the mean and variance computed over a single image differ significantly from those computed over a batch, affecting the discriminator and generator dynamics. Instance normalization (the CycleGAN default) normalizes each sample independently, so its activations do not depend on the other samples in the batch. However, the authors still observe that different batch sizes can lead to different results even with instance normalization. The test script hard-codes `batch_size = 1` because, as its own comment states, the test code only supports a batch size of 1.
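The difference between the two normalization schemes can be illustrated with a toy, pure-Python sketch. It is not the repository's implementation: it uses single-channel lists in place of image tensors, omits learnable scale and shift, and models only training-mode batch normalization. All names here are hypothetical.

```python
import math
from statistics import fmean, pvariance

EPS = 1e-5  # small constant for numerical stability


def _normalize(values, mean, var):
    return [(v - mean) / math.sqrt(var + EPS) for v in values]


def instance_norm(batch):
    """Normalize each sample with its own mean and variance."""
    return [_normalize(s, fmean(s), pvariance(s)) for s in batch]


def batch_norm(batch):
    """Normalize every sample with statistics pooled over the whole batch."""
    pooled = [v for s in batch for v in s]
    mean, var = fmean(pooled), pvariance(pooled)
    return [_normalize(s, mean, var) for s in batch]


image_a = [0.0, 1.0, 2.0, 3.0]      # toy single-channel "image"
image_b = [10.0, 10.0, 20.0, 20.0]  # a second sample with different statistics

# Instance norm: image_a's output is the same alone or in a batch of two.
in_alone = instance_norm([image_a])[0]
in_paired = instance_norm([image_a, image_b])[0]

# Batch norm: image_a's output changes once image_b joins the batch.
bn_alone = batch_norm([image_a])[0]
bn_paired = batch_norm([image_a, image_b])[0]

if __name__ == "__main__":
    print("instance norm unchanged by batch:", in_alone == in_paired)
    print("batch norm unchanged by batch:", bn_alone == bn_paired)
```

This covers only the forward normalization. It does not explain why results can still differ with instance normalization; one possible contributing factor, not stated in the source, is that larger batches average gradients over more samples and take fewer optimizer steps per epoch.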
From `docs/tips.md`:
"For all experiments in the paper, we set the batch size to be 1. If there is room for memory, you can use higher batch size with batch norm or instance norm. (Note that the default batchnorm does not work well with multi-GPU training.) But please be aware that it can impact the training. In particular, even with Instance Normalization, different batch sizes can lead to different results. Moreover, increasing --crop_size may be a good alternative to increasing the batch size."
Code evidence from `options/base_options.py:44`:
parser.add_argument("--batch_size", type=int, default=1,
help="input batch size")
Code evidence from `test.py:50`:
opt.batch_size = 1 # test code only supports batch_size = 1
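The two snippets above interact as follows: the parser supplies a default of 1, the user may override it for training, and the test path forces it back to 1 whatever was passed. The sketch below reproduces that behavior with `argparse` alone. The function names `build_parser`, `parse_for_training`, and `parse_for_testing` are hypothetical stand-ins and do not reflect the repository's actual option classes.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Minimal parser with the same --batch_size definition quoted above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", type=int, default=1,
                        help="input batch size")
    return parser


def parse_for_training(argv):
    """Training keeps whatever the user passed (default 1)."""
    return build_parser().parse_args(argv)


def parse_for_testing(argv):
    """Testing overrides the value, mirroring the line quoted from test.py."""
    opt = build_parser().parse_args(argv)
    opt.batch_size = 1  # test code only supports batch_size = 1
    return opt


if __name__ == "__main__":
    print(parse_for_training([]).batch_size)
    print(parse_for_training(["--batch_size", "4"]).batch_size)
    print(parse_for_testing(["--batch_size", "4"]).batch_size)
```

With this pattern, a `--batch_size` value passed at test time is silently replaced by 1 instead of raising an error.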
Related Pages
- Implementation:Junyanz_Pytorch_CycleGAN_and_pix2pix_TrainOptions_Parse
- Implementation:Junyanz_Pytorch_CycleGAN_and_pix2pix_CycleGANModel_Optimize_Parameters
- Implementation:Junyanz_Pytorch_CycleGAN_and_pix2pix_Pix2PixModel_Optimize_Parameters
- Principle:Junyanz_Pytorch_CycleGAN_and_pix2pix_Training_Options_Configuration