
Principle:Junyanz Pytorch CycleGAN and pix2pix Training Visualization

From Leeroopedia


Field: Value
sources: pytorch-CycleGAN-and-pix2pix (https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix), Weights and Biases (https://wandb.ai)
domains: Training, Monitoring
last_updated: 2026-02-09 16:00 GMT

Overview

A monitoring pattern that tracks GAN training health through periodic loss logging, generated image display, and experiment visualization dashboards.

Description

GAN training is notoriously unstable. The generator and discriminator are locked in a minimax game, and if one network overwhelms the other, training collapses. Common failure modes include mode collapse, vanishing gradients, and oscillating losses. Monitoring the discriminator/generator loss balance, visual output quality, and learning rate progression is therefore essential for catching these failures early.

This principle covers four complementary channels for monitoring GAN training:

  1. Console loss logging -- Printing formatted loss values (G_GAN, G_L1, D_real, D_fake, etc.) to standard output at a configurable frequency (print_freq), annotated with rank information for distributed training. This gives immediate, real-time feedback in the terminal.
  2. File-based loss logs -- Appending the same loss strings to a persistent loss_log.txt file under the experiment checkpoint directory. This survives terminal disconnections and provides a permanent record of the full training trajectory.
  3. HTML image galleries -- Saving generated images to disk at each display interval and constructing an auto-refreshing HTML page (index.html) that presents every epoch's output in a table with labeled columns (e.g., real_A, fake_B, rec_A). The gallery is ordered newest-first so the latest results appear at the top, and the page auto-refreshes every second during training.
  4. WandB dashboard integration -- Optionally logging both loss scalars and generated images to Weights & Biases, enabling cloud-hosted experiment comparison, loss curve overlays, and image panels across runs.

All four channels are coordinated by a single Visualizer object that is created once at the start of training and called at configured intervals within the training loop.
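Channels 1 and 2 share one formatted message: it is printed to the console and appended to loss_log.txt. The following is a minimal sketch of that idea, not the repository's Visualizer code. The helper names format_losses and log_losses, the exact line layout, and the temporary directory standing in for the checkpoint directory are all assumptions made for illustration.

```python
import os
import tempfile
from collections import OrderedDict


def format_losses(epoch, iters, losses, t_comp, t_data, rank=0):
    """Build one log line; the layout is illustrative, not the repository's exact format."""
    parts = [f"[Rank {rank}] (epoch: {epoch}, iters: {iters}, "
             f"time: {t_comp:.3f}, data: {t_data:.3f})"]
    parts += [f"{name}: {value:.3f}" for name, value in losses.items()]
    return " ".join(parts)


def log_losses(log_path, message):
    """Print to the console and append the same line to a persistent log file."""
    print(message)
    with open(log_path, "a") as log_file:
        log_file.write(message + "\n")


# A temporary directory stands in for the experiment checkpoint directory.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "loss_log.txt")

losses = OrderedDict(G_GAN=0.912, G_L1=12.5, D_real=0.48, D_fake=0.51)
line = format_losses(epoch=1, iters=400, losses=losses, t_comp=0.085, t_data=0.002)
log_losses(log_path, line)
```

Opening the file in append mode means a resumed run keeps extending the same record instead of overwriting it.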

Usage

This principle is essential during any GAN training run. Apply it whenever you need to:

  • Monitor training progress in real time (console and WandB).
  • Compare experiments across different hyperparameter settings (WandB project dashboards).
  • Create visual records of training epochs for offline browsing (HTML galleries).
  • Debug training instabilities by inspecting the loss balance between generator and discriminator.
  • Share training results with collaborators who do not have terminal access (HTML pages, WandB links).

The principle is activated by default for all training runs. WandB integration is opt-in via the --use_wandb flag. HTML galleries are enabled by default during training and can be disabled with --no_html.

Theoretical Basis

1. Loss Tracking -- G vs D Balance

In adversarial training the generator loss (G_GAN) and discriminator losses (D_real, D_fake) should remain in rough equilibrium. A healthy training run typically shows:

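  • D_real and D_fake hovering near the loss produced by an undecided discriminator that outputs 0.5 for every input: about 0.69 (ln 2) under a binary cross-entropy formulation, or 0.25 under a least-squares formulation. Values near this level indicate the discriminator finds it difficult to distinguish real from fake.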
  • G_GAN decreasing gradually, indicating the generator is learning to fool the discriminator.

If both D_real and D_fake drop toward 0, the discriminator classifies real and generated images almost perfectly: it has won, and the generator may receive little useful gradient. Conversely, if G_GAN drops to 0 quickly, the discriminator may be too weak, and the generator can satisfy it with trivial solutions.

Logging these values every print_freq iterations allows early detection of such imbalances.
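As a point of reference when reading the logged numbers, the loss of an undecided discriminator (output 0.5 for every input) depends on the GAN objective. The sketch below computes that reference level for binary cross-entropy and for a least-squares objective, and adds a simple check for a dominating discriminator. The function discriminator_dominates and its threshold are hypothetical heuristics introduced here, not part of the repository.

```python
import math


def bce_loss(prediction, target):
    """Binary cross-entropy for a single probability in (0, 1)."""
    return -(target * math.log(prediction) + (1 - target) * math.log(1 - prediction))


def lsgan_loss(prediction, target):
    """Least-squares GAN loss for a single prediction."""
    return (prediction - target) ** 2


# An undecided discriminator outputs 0.5 regardless of the input.
bce_reference = bce_loss(0.5, 1.0)      # ln 2
lsgan_reference = lsgan_loss(0.5, 1.0)  # 0.25


def discriminator_dominates(d_real, d_fake, threshold=0.05):
    """Hypothetical heuristic: both discriminator losses are close to zero."""
    return d_real < threshold and d_fake < threshold
```

A check like this could run alongside the print_freq logging step; the threshold would need tuning per objective, since the reference levels differ.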

2. Visual Inspection -- Generated Samples at Intervals

Loss curves alone are insufficient for evaluating GANs because low loss does not guarantee perceptual quality. Periodic visual inspection of generated samples (e.g., fake_B, rec_A for CycleGAN) catches:

  • Mode collapse (all outputs look identical).
  • Artifact patterns (checkerboard artifacts from transposed convolutions).
  • Color or structural drift over epochs.

Displaying images every display_freq iterations creates a temporal record of the generator's learning trajectory.

3. Experiment Tracking -- Comparing Runs

When sweeping hyperparameters (learning rate, lambda weights, network depth), it is critical to compare runs side by side. WandB provides:

  • Overlaid loss curves across runs.
  • Image panels showing outputs from different configurations at the same epoch.
  • Automatic logging of the full option namespace as run configuration, enabling filtering and grouping.
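A rough sketch of how loss scalars and the option namespace might be sent to WandB follows. It uses the public wandb.init and wandb.log calls, imported lazily so the example runs without the package installed. The helper names, the "loss/" key prefix, and the option attributes (name, wandb_project_name, use_wandb) are assumptions made for this illustration rather than a description of the repository's Visualizer.

```python
from argparse import Namespace


def build_wandb_payload(losses, prefix="loss"):
    """Flatten a loss dict into namespaced scalar keys such as 'loss/G_GAN'."""
    return {f"{prefix}/{name}": float(value) for name, value in losses.items()}


def log_to_wandb(opt, total_iters, losses):
    """Send one step of scalars to W&B; wandb is imported only when needed."""
    import wandb  # third-party package, required only when logging is enabled

    if wandb.run is None:
        # Passing vars(opt) records the full option namespace as the run config.
        wandb.init(project=opt.wandb_project_name, name=opt.name, config=vars(opt))
    wandb.log(build_wandb_payload(losses), step=total_iters)


# Placeholder options; logging stays disabled so nothing is sent in this sketch.
opt = Namespace(name="demo_run", use_wandb=False,
                wandb_project_name="demo_project", lr=0.0002)
current_losses = {"G_GAN": 0.9, "D_real": 0.5}
payload = build_wandb_payload(current_losses)

if opt.use_wandb:
    log_to_wandb(opt, total_iters=400, losses=current_losses)
```

Using the same key names across runs is what allows the dashboard to overlay their loss curves.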

4. HTML Galleries for Offline Browsing

The HTML gallery provides a self-contained, dependency-free visualization that can be opened in any browser. It is especially useful when:

  • Training on a remote server without WandB access.
  • Sharing results by copying the web/ directory.
  • Reviewing results after training completes without needing to rerun inference.
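The gallery idea can be sketched with the standard library alone: write an index.html containing a refresh meta tag and one table per epoch, newest first. This is an illustrative sketch, not the repository's HTML helper. The function build_gallery, the image file naming pattern, and the page layout are assumptions.

```python
import html
import os
import tempfile


def build_gallery(web_dir, epochs, labels, refresh_seconds=1):
    """Write index.html with one table row per epoch, newest epoch first."""
    rows = []
    for epoch in sorted(epochs, reverse=True):
        cells = "".join(
            f'<td><img src="images/epoch{epoch:03d}_{label}.png" width="256">'
            f"<br>{html.escape(label)}</td>"
            for label in labels
        )
        rows.append(f"<h3>epoch [{epoch}]</h3><table><tr>{cells}</tr></table>")
    page = (
        "<!DOCTYPE html><html><head>"
        f'<meta http-equiv="refresh" content="{refresh_seconds}">'
        "</head><body>" + "".join(rows) + "</body></html>"
    )
    index_path = os.path.join(web_dir, "index.html")
    with open(index_path, "w") as page_file:
        page_file.write(page)
    return index_path


# A temporary directory stands in for the experiment's web/ directory.
web_dir = tempfile.mkdtemp()
index_path = build_gallery(web_dir, epochs=[1, 2, 3],
                           labels=["real_A", "fake_B", "rec_A"])
page_text = open(index_path).read()
```

Because the page references images by relative path, copying the whole directory keeps the gallery intact.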

Pseudocode of the Visualization Loop

import time

visualizer = Visualizer(opt)          # Initialize once before training
total_iters = 0                       # Samples processed across all epochs

for epoch in range(start_epoch, end_epoch):
    visualizer.reset()                # Reset per-epoch saved status
    epoch_iter = 0                    # Samples processed in the current epoch
    iter_data_time = time.time()      # Timestamp at the end of the previous iteration

    for i, data in enumerate(dataset):
        iter_start_time = time.time()
        t_data = iter_start_time - iter_data_time    # Time spent loading data
        total_iters += batch_size
        epoch_iter += batch_size

        model.set_input(data)
        model.optimize_parameters()

        # --- Display images at display_freq intervals ---
        if total_iters % display_freq == 0:
            save_result = (total_iters % update_html_freq == 0)
            model.compute_visuals()
            visualizer.display_current_results(
                visuals=model.get_current_visuals(),   # OrderedDict of tensors
                epoch=epoch,
                total_iters=total_iters,
                save_result=save_result
            )

        # --- Print and plot losses at print_freq intervals ---
        if total_iters % print_freq == 0:
            losses = model.get_current_losses()        # OrderedDict of floats
            t_comp = (time.time() - iter_start_time) / batch_size   # Compute time per sample
            visualizer.print_current_losses(epoch, epoch_iter, losses, t_comp, t_data)
            visualizer.plot_current_losses(total_iters, losses)

        iter_data_time = time.time()
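Because total_iters advances by batch_size rather than by 1, the modulo checks in the loop above fire only when total_iters lands exactly on a multiple of the frequency. The short sketch below simulates the counter to show the effect. The helper trigger_iterations is hypothetical and exists only for this illustration.

```python
def trigger_iterations(num_batches, batch_size, freq):
    """Return the total_iters values at which `total_iters % freq == 0` holds."""
    hits = []
    total_iters = 0
    for _ in range(num_batches):
        total_iters += batch_size
        if total_iters % freq == 0:
            hits.append(total_iters)
    return hits


# batch_size divides the frequency: the check fires every 100 samples.
aligned = trigger_iterations(num_batches=100, batch_size=4, freq=100)

# batch_size = 3 does not divide 100: the check fires only at multiples of 300.
misaligned = trigger_iterations(num_batches=100, batch_size=3, freq=100)

print(aligned)     # [100, 200, 300, 400]
print(misaligned)  # [300]
```

Choosing display_freq and print_freq as multiples of batch_size keeps the logging cadence at the intended interval.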
