Heuristic:Junyanz Pytorch CycleGAN and pix2pix CuDNN Benchmark Scale Width

From Leeroopedia



Knowledge Sources
Domains: Optimization, Debugging
Last Updated: 2026-02-09 16:00 GMT

Overview

Leave the cuDNN auto-tuner (`torch.backends.cudnn.benchmark`) off when using `scale_width` preprocessing, because variable input sizes force repeated re-tuning and degrade performance.

Description

Setting `torch.backends.cudnn.benchmark = True` in PyTorch enables the cuDNN auto-tuner, which benchmarks the available convolution algorithms and selects the fastest one for a given input size. This optimization pays off only when input dimensions are fixed. With `--preprocess scale_width`, images are resized to a common width but keep their original aspect ratios, so heights vary across the dataset. The auto-tuner must then re-benchmark for each new size, which adds overhead instead of a speedup. The codebase handles this by enabling benchmarking only when the preprocessing mode is not `scale_width`; otherwise the flag stays at PyTorch's default of `False`.
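To make the size variation concrete, the following is a minimal sketch of aspect-preserving width scaling. The helper name `scale_width_size` and the sample image sizes are hypothetical, and the repository's own transform may round or clamp the height differently; the point is only that a fixed target width yields a different height for each aspect ratio.

```python
from typing import Tuple


def scale_width_size(orig_w: int, orig_h: int, target_w: int) -> Tuple[int, int]:
    """Return (width, height) after scaling to target_w while keeping the aspect ratio.

    Hypothetical helper for illustration; not the repository's implementation.
    """
    new_h = round(orig_h * target_w / orig_w)
    return target_w, new_h


# Three images with different aspect ratios, all scaled to width 256.
for w, h in [(640, 480), (1024, 512), (500, 750)]:
    print((w, h), "->", scale_width_size(w, h, 256))
# (640, 480) -> (256, 192)
# (1024, 512) -> (256, 128)
# (500, 750) -> (256, 384)
```

Each distinct height here would present the convolution layers with a new input shape.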

Usage

This heuristic is applied automatically by the codebase. Expect training with `--preprocess scale_width` to be somewhat slower than with fixed-size inputs. The check quoted under Reasoning matches only the exact string `scale_width`, so every other mode keeps cuDNN benchmarking enabled, including `resize_and_crop` (the default), `crop`, and `scale_width_and_crop`, all of which crop inputs to a fixed size.

The Insight (Rule of Thumb)

  • Action: Set `torch.backends.cudnn.benchmark = False` when input image sizes vary across the dataset.
  • Value: Handled automatically in `BaseModel.__init__()`: the flag is set to `True` unless `opt.preprocess == "scale_width"`, in which case it is left at PyTorch's default of `False`.
  • Trade-off: Disabling the benchmark avoids re-tuning overhead for variable sizes but loses the auto-tuning speedup for fixed-size workloads.
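The rule above can be sketched as a small helper for use in your own training scripts. The function names `should_enable_cudnn_benchmark` and `configure_cudnn` are hypothetical and not part of the repository. Unlike the repository, which simply skips the assignment for `scale_width`, this sketch assigns the flag explicitly in both cases, and it falls back to reporting the decision when PyTorch is not installed.

```python
def should_enable_cudnn_benchmark(preprocess: str) -> bool:
    """Mirror the repository's check: enable unless the mode is exactly 'scale_width'."""
    return preprocess != "scale_width"


def configure_cudnn(preprocess: str) -> bool:
    """Apply the decision to PyTorch if it is available and return the chosen value."""
    enabled = should_enable_cudnn_benchmark(preprocess)
    try:
        import torch  # optional dependency for this sketch
    except ImportError:
        return enabled  # PyTorch missing: only report the decision
    torch.backends.cudnn.benchmark = enabled
    return enabled


for mode in ["resize_and_crop", "crop", "scale_width_and_crop", "scale_width"]:
    print(mode, should_enable_cudnn_benchmark(mode))
```

Only the last mode in the loop prints `False`; the comparison is an exact string match, so `scale_width_and_crop` is not affected.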

Reasoning

The cuDNN auto-tuner caches the fastest algorithm for each unique (input_size, kernel_size, padding, stride) combination. When all images have the same dimensions (as with `resize_and_crop`), the algorithm is selected once per layer configuration and reused for the entire training run, so the one-time tuning cost is amortized. With `scale_width`, each image may have a different height, so the cache misses repeatedly and the tuner re-runs, which can be slower than simply using the default algorithm.
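The caching argument can be illustrated with a toy model. This is not how cuDNN is implemented; it only counts how often a per-shape cache would miss, assuming one tuning run per previously unseen input shape. The function name `count_tuning_runs` and the sample shapes are hypothetical.

```python
from typing import Iterable, Tuple

Shape = Tuple[int, int, int, int]  # (batch, channels, height, width)


def count_tuning_runs(shapes: Iterable[Shape]) -> int:
    """Count cache misses in a toy per-shape cache (one tuning run per new shape)."""
    seen = set()
    misses = 0
    for shape in shapes:
        if shape not in seen:
            seen.add(shape)
            misses += 1
    return misses


# Fixed-size inputs, as produced by cropping to 256x256.
fixed = [(1, 3, 256, 256)] * 1000

# Variable-height inputs, as produced by width-only scaling (8 distinct heights here).
heights = [192, 208, 224, 240, 256, 272, 288, 304]
variable = [(1, 3, heights[i % len(heights)], 256) for i in range(1000)]

print(count_tuning_runs(fixed))     # 1
print(count_tuning_runs(variable))  # 8
```

In a real dataset the number of distinct heights can be far larger than eight, and each miss costs a full benchmarking pass rather than a single increment.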

Code evidence from `models/base_model.py:38-40`:

```python
# with [scale_width], input images might have different sizes,
# which hurts the performance of cudnn.benchmark.
if opt.preprocess != "scale_width":
    torch.backends.cudnn.benchmark = True
```
