

Workflow:Junyanz Pytorch CycleGAN and pix2pix CycleGAN Training

From Leeroopedia


Knowledge Sources
Domains Computer_Vision, GANs, Image_Translation
Last Updated 2026-02-09 16:00 GMT

Overview

End-to-end process for training a CycleGAN model to perform unpaired image-to-image translation between two visual domains.

Description

This workflow covers the complete procedure for training a CycleGAN (Cycle-Consistent Adversarial Network) that learns bidirectional mappings between two image domains without requiring paired training examples. The model uses two generators (G_A: A to B, G_B: B to A) and two discriminators (D_A, D_B) trained with adversarial, cycle-consistency, and optional identity losses. The architecture defaults to a ResNet generator with 9 residual blocks and a 70x70 PatchGAN discriminator, trained with least-squares GAN loss (LSGAN). An image pool buffer of size 50 stabilizes discriminator training. The learning rate follows a linear decay schedule: constant for the first half of training, then linearly decaying to zero.
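The way the three loss families combine into a single generator objective can be sketched numerically. The following is a minimal pure-Python illustration that uses flat lists of floats in place of image tensors. The function names, the `batch` dictionary layout, and the default weights `lambda_a=10.0` and `lambda_b=10.0` are assumptions made for this sketch, as is the exact weighting of the identity terms; it is not an excerpt from the repository.

```python
def l1(a, b):
    """Mean absolute error between two equal-length sequences of floats."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def lsgan(pred, target):
    """Least-squares GAN loss: mean squared distance of D's outputs from a target label."""
    return sum((p - target) ** 2 for p in pred) / len(pred)


def generator_objective(batch, lambda_a=10.0, lambda_b=10.0, lambda_identity=0.5):
    """Combine adversarial, cycle-consistency, and identity terms for G_A and G_B.

    `batch` is a hypothetical dict of flat float lists standing in for tensors.
    """
    terms = {
        # Generators want the discriminators to label their fakes as real (1.0).
        "G_A": lsgan(batch["D_A(fake_B)"], 1.0),
        "G_B": lsgan(batch["D_B(fake_A)"], 1.0),
        # Cycle consistency: A -> B -> A and B -> A -> B should return the input.
        "cycle_A": lambda_a * l1(batch["rec_A"], batch["real_A"]),
        "cycle_B": lambda_b * l1(batch["rec_B"], batch["real_B"]),
        # Identity (optional): G_A applied to a B image should leave it unchanged.
        # The lambda_b / lambda_a pairing here is an assumption of this sketch.
        "idt_A": lambda_b * lambda_identity * l1(batch["idt_A"], batch["real_B"]),
        "idt_B": lambda_a * lambda_identity * l1(batch["idt_B"], batch["real_A"]),
    }
    terms["total"] = sum(terms.values())
    return terms
```

Setting `lambda_identity=0` drops the identity terms, which corresponds to treating that loss as optional.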

Usage

Use this workflow when you have two collections of images from different visual domains (e.g., horses and zebras, summer and winter landscapes, photographs and paintings) and want to learn a translation mapping between them. The images do not need to be paired or aligned. It assumes access to a GPU with at least 4GB of VRAM (enough for a batch size of 1) and is intended for training a custom domain translation model from scratch.

Execution Steps

Step 1: Environment Setup

Install the required dependencies by creating a Conda environment from the provided specification file or by installing PyTorch and its dependencies manually. The key dependencies are PyTorch (2.4+), torchvision, Pillow, visdom (for real-time visualization), and optionally wandb (for Weights & Biases logging). Verify that CUDA is available if GPU training is desired.

Key considerations:

  • Python 3.11 is the recommended version
  • Use the provided conda environment specification for reproducible setup
  • Docker images are also available as an alternative
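A quick way to confirm the setup is to check that the listed packages are importable and whether CUDA is visible. The sketch below uses only the standard library to probe for packages; the function name `check_environment` and the split into required and optional packages are assumptions for illustration.

```python
import importlib.util

REQUIRED = ("torch", "torchvision", "PIL")  # PIL is the import name of Pillow
OPTIONAL = ("visdom", "wandb")


def check_environment():
    """Return a dict mapping each package name (and 'cuda') to True or False."""
    report = {
        name: importlib.util.find_spec(name) is not None
        for name in REQUIRED + OPTIONAL
    }
    report["cuda"] = False
    if report["torch"]:
        try:
            import torch
            report["cuda"] = bool(torch.cuda.is_available())
        except Exception:
            # A broken install is reported the same way as "no CUDA".
            report["cuda"] = False
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

A `cuda: missing` line does not prevent training on CPU, but it does mean the GPU will not be used.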

Step 2: Dataset Acquisition

Download an unpaired image dataset using the provided download script, or organize your own custom dataset. The dataset must be structured into four subdirectories: trainA (training images from domain A), trainB (training images from domain B), testA (test images from domain A), and testB (test images from domain B). Available built-in datasets include apple2orange, horse2zebra, monet2photo, vangogh2photo, summer2winter_yosemite, and maps.

Key considerations:

  • Images in domain A and domain B do not need to correspond to each other
  • Standard datasets are downloaded from the Berkeley EECS server
  • Custom datasets must follow the trainA/trainB/testA/testB directory structure
  • The Cityscapes dataset requires separate download due to licensing
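For a custom dataset, the required directory layout can be verified before training starts. The following sketch checks for the four subdirectories and counts image files in each; the helper name `summarize_dataset` and the set of accepted file extensions are assumptions, not part of the repository.

```python
from pathlib import Path

EXPECTED_SPLITS = ("trainA", "trainB", "testA", "testB")
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed extensions for this sketch


def summarize_dataset(root):
    """Return image counts per split, raising if a required folder is missing."""
    root = Path(root)
    counts = {}
    for split in EXPECTED_SPLITS:
        folder = root / split
        if not folder.is_dir():
            raise FileNotFoundError(f"missing directory: {folder}")
        counts[split] = sum(
            1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts
```

Because the data is unpaired, the counts for trainA and trainB do not have to match.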

Step 3: Configure Training Options

Set the training hyperparameters through command-line arguments. The essential parameters are the data root path, experiment name, and model type (cycle_gan). Additional parameters control the generator architecture, discriminator architecture, normalization type, learning rate, number of training epochs, loss weights (lambda_A, lambda_B for cycle consistency, lambda_identity for identity loss), and image pool size.

Key considerations:

  • Default generator is resnet_9blocks with instance normalization
  • Default discriminator is a 70x70 PatchGAN (basic)
  • Default training runs 100 epochs at constant LR plus 100 epochs with linear decay
  • Identity loss (lambda_identity=0.5) helps preserve color composition for painting-to-photo tasks
  • For multi-GPU training, use torchrun and set --norm to syncbatch
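The default schedule of 100 constant epochs followed by 100 decaying epochs can be expressed as a multiplier applied to the base learning rate. The sketch below is one plausible form of that rule, of the kind that could be passed to a lambda-based scheduler. The function name, the parameter names, and the `+ 1` in the denominator are assumptions about the reference implementation rather than a quoted excerpt.

```python
def lr_multiplier(epoch, n_epochs=100, n_epochs_decay=100, epoch_count=1):
    """Factor applied to the base learning rate after `epoch` scheduler steps.

    Stays at 1.0 for the first `n_epochs` epochs, then decreases linearly
    over the following `n_epochs_decay` epochs.
    """
    overshoot = max(0, epoch + epoch_count - n_epochs)
    return 1.0 - overshoot / float(n_epochs_decay + 1)


if __name__ == "__main__":
    base_lr = 0.0002  # assumed base learning rate, for illustration only
    for epoch in (0, 99, 100, 150, 199):
        print(epoch, base_lr * lr_multiplier(epoch))
```

With this form, the multiplier at the last of the 200 epochs is small but not exactly zero; it would reach zero one step later.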

Step 4: Train the Model

Launch the training script, which orchestrates the full training loop. The script parses options, creates the dataset loader (in unaligned dataset mode), instantiates the CycleGAN model (two generators and two discriminators), initializes network weights, and sets up the learning rate schedulers. Each training iteration unpacks a batch, runs the forward pass through both generators, computes the adversarial, cycle-consistency, and (if enabled) identity losses, and updates the generator and discriminator weights. Checkpoints are saved periodically.
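Since all options are passed on the command line, a launch command can be assembled programmatically. The helper below is a hypothetical convenience wrapper, not part of the repository. It assumes the script is named `train.py` and that the flags `--dataroot`, `--name`, `--model`, `--norm`, `--lambda_identity`, and `--pool_size` are spelled as shown; the dataset path and experiment name are placeholders.

```python
import shlex


def build_train_command(dataroot, name, num_gpus=1, **overrides):
    """Return the training command as an argument list.

    With more than one GPU the command is launched through torchrun and,
    unless overridden, --norm is set to syncbatch.
    """
    if num_gpus > 1:
        cmd = ["torchrun", f"--nproc_per_node={num_gpus}", "train.py"]
        overrides.setdefault("norm", "syncbatch")
    else:
        cmd = ["python", "train.py"]
    cmd += ["--dataroot", str(dataroot), "--name", name, "--model", "cycle_gan"]
    for key, value in overrides.items():
        cmd += [f"--{key}", str(value)]
    return cmd


if __name__ == "__main__":
    cmd = build_train_command(
        "./datasets/horse2zebra",   # placeholder dataset path
        "horse2zebra_cyclegan",     # placeholder experiment name
        lambda_identity=0.5,
        pool_size=50,
    )
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting issues with paths that contain spaces.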

What happens each iteration:

  • Forward pass: Generate fake images in both directions and reconstruct originals
  • Generator update: Compute GAN loss, cycle-consistency loss, and identity loss; backpropagate and update G_A and G_B
  • Discriminator update: Sample from image pool, compute real/fake classification loss; update D_A and D_B
  • Periodically display results via Visdom or WandB and save checkpoints
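The image pool used in the discriminator update can be illustrated with a small history buffer. The sketch below works on arbitrary Python objects instead of tensors, and the 50% swap probability is an assumption about how such a buffer is typically implemented; the class and method names are illustrative.

```python
import random


class ImagePool:
    """History buffer that mixes current fakes with previously generated ones."""

    def __init__(self, pool_size=50, rng=None):
        self.pool_size = pool_size
        self.images = []
        self.rng = rng or random.Random()

    def query(self, images):
        """Return one item per input, drawn from the inputs or from the buffer."""
        if self.pool_size == 0:
            return list(images)  # buffer disabled: pass inputs through
        out = []
        for image in images:
            if len(self.images) < self.pool_size:
                # Buffer not yet full: store the new image and return it.
                self.images.append(image)
                out.append(image)
            elif self.rng.random() > 0.5:
                # Return an older image and replace it with the new one.
                idx = self.rng.randrange(self.pool_size)
                out.append(self.images[idx])
                self.images[idx] = image
            else:
                # Return the new image without storing it.
                out.append(image)
        return out
```

Showing the discriminator a mix of old and new fakes keeps it from adapting only to the generator's most recent outputs.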

Step 5: Monitor Training

Track training progress through loss values printed to the console, visual results saved to HTML pages, and optional real-time dashboards. The visualizer logs generator losses (G_A, G_B), discriminator losses (D_A, D_B), cycle-consistency losses (cycle_A, cycle_B), and identity losses (idt_A, idt_B). Sample images (real, fake, and reconstructed) are saved to an HTML gallery in the checkpoints directory.

Key considerations:

  • Visdom provides a real-time web dashboard for monitoring
  • WandB integration is available with the --use_wandb flag
  • HTML result pages are saved at checkpoints/{name}/web/index.html
  • Generator and discriminator losses should stay roughly balanced; a discriminator loss that collapses toward zero usually means the discriminator is overpowering the generators
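Console loss lines can be turned into numbers for plotting or for checking the balance between generators and discriminators. The parser below assumes each line consists of `name: value` pairs, such as the hypothetical line in the usage example; the actual log format may differ, so the regular expression should be adjusted to match what the training script prints.

```python
import re

# Matches "name: number" pairs such as "G_A: 0.402" or "epoch: 3".
PAIR = re.compile(r"(\w+):\s*(-?\d+(?:\.\d+)?)")


def parse_loss_line(line):
    """Return a dict of every 'name: value' pair found in one log line."""
    return {key: float(value) for key, value in PAIR.findall(line)}


if __name__ == "__main__":
    # Hypothetical log line, made up for illustration.
    sample = "(epoch: 3, iters: 400) D_A: 0.251 G_A: 0.402 cycle_A: 1.730 idt_A: 0.800"
    losses = parse_loss_line(sample)
    print(losses["G_A"], losses["D_A"])
```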

Step 6: Test and Evaluate

After training completes, run the test script to generate translated images on the test set. The script loads the trained model, sets evaluation mode, iterates through test images, and saves results as an HTML gallery. For CycleGAN, testing produces translations in both directions (A to B and B to A). Results include real input images, generated fake images, and reconstructed images.

Key considerations:

  • Test script automatically sets batch_size=1, no_flip, and serial_batches
  • The --model test option loads a single generator and produces results for one direction only, which is faster than running both directions
  • Results are saved to results/{name}/{phase}_{epoch}/index.html
  • Quantitative evaluation can be done with external metrics such as FID (Fréchet Inception Distance) or with the included Cityscapes FCN evaluation
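The two testing modes and the location of the resulting gallery can be sketched as small helpers. Both functions are hypothetical. They assume the script is named `test.py`, that one-direction testing also needs the `--no_dropout` flag, and that the default phase and epoch labels are `test` and `latest`.

```python
from pathlib import Path


def build_test_command(dataroot, name, single_direction=False):
    """Return the test command as an argument list.

    single_direction=True uses --model test, which translates one way only.
    """
    cmd = ["python", "test.py", "--dataroot", str(dataroot), "--name", name]
    if single_direction:
        # --no_dropout is an assumption: it should match the training setup.
        cmd += ["--model", "test", "--no_dropout"]
    else:
        cmd += ["--model", "cycle_gan"]
    return cmd


def results_index(name, phase="test", epoch="latest"):
    """Path of the HTML gallery, following results/{name}/{phase}_{epoch}/index.html."""
    return Path("results") / name / f"{phase}_{epoch}" / "index.html"
```

For one-direction testing, `dataroot` would typically point at a folder holding images from the source domain only, such as the testA directory.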
