Principle:LaurentMazare Tch rs Generative Adversarial Network

Knowledge Sources	LaurentMazare_Tch_rs, Generative Adversarial Nets, The Relativistic Discriminator
Domains	Generative Modeling, Deep Learning
Last Updated	2026-02-08 00:00 GMT

Overview

Generative adversarial networks train two competing neural networks in a minimax game where a generator learns to produce realistic samples while a discriminator learns to distinguish real data from generated fakes.

Description

A GAN consists of two neural networks trained simultaneously in opposition:

Generator (G): Takes a random noise vector $z$ sampled from a simple prior distribution (e.g., standard normal) and transforms it into a sample intended to resemble the real data distribution. The generator never sees real data directly; it learns entirely through the discriminator's feedback.

Discriminator (D): Receives both real data samples and generated samples and outputs a probability that the input is real. It acts as a binary classifier trained to maximize classification accuracy.

The training proceeds as a minimax game: the discriminator tries to maximize its ability to distinguish real from fake, while the generator tries to minimize the discriminator's accuracy. At equilibrium, the generator produces samples indistinguishable from real data, and the discriminator outputs 0.5 for all inputs.

Relativistic GAN is a variant in which the discriminator no longer estimates the absolute probability that an input is real. Instead, it estimates the probability that a real sample is more realistic than a randomly sampled fake (or, in the average variant described under Theoretical Basis, than the average fake). Because the generator's loss then depends on real samples as well as fake ones, the generator receives gradient signal from both, which the relativistic discriminator paper reports as giving more stable training and better sample quality.

Key training challenges include mode collapse (generator producing limited variety), training instability (oscillation or divergence), and vanishing gradients (discriminator becoming too strong too early).

Usage

GANs are applied in image synthesis, image-to-image translation, super-resolution, data augmentation, style transfer, and any task requiring generation of realistic samples from a learned data distribution.

Theoretical Basis

Standard GAN Objective:

$\min_{G} \max_{D} V(D, G) = \mathbb{E}_{x \sim p_{data}} [\log D(x)] + \mathbb{E}_{z \sim p_{z}} [\log (1 - D(G(z)))]$
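As a rough illustration of the objective above, the two expectations can be estimated by averaging over a batch of discriminator outputs. The sketch below is plain Python rather than tch-rs code; the function name `gan_value` and the sample values are hypothetical and chosen only for illustration.

```python
import math

def gan_value(d_real, d_fake):
    """Monte Carlo estimate of V(D, G) from discriminator outputs.

    d_real: values D(x) for real samples, each strictly inside (0, 1)
    d_fake: values D(G(z)) for generated samples, each strictly inside (0, 1)
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
v_confused = gan_value([0.5, 0.5], [0.5, 0.5])

# A confident and correct discriminator pushes V towards 0 from below.
v_confident = gan_value([0.9, 0.95], [0.1, 0.05])
```

The discriminator's updates try to raise this value, while the generator's updates try to lower it; the confused case evaluates to $-\log 4$.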

Optimal Discriminator:

For a fixed generator $G$, the optimal discriminator is:

$D^{*}(x) = \frac{p_{data}(x)}{p_{data}(x) + p_{g}(x)}$

where $p_{g}$ is the implicit distribution defined by the generator.
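The optimal discriminator formula can be checked numerically on a small finite support, where both distributions are known exactly. This is a minimal sketch in plain Python, not tch-rs code; the three-point distributions and the helper names are assumptions made for the example.

```python
import math

def optimal_discriminator(p_data, p_g):
    """Pointwise D*(x) = p_data(x) / (p_data(x) + p_g(x)) on a finite support."""
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

def value(d, p_data, p_g):
    """Exact V(D, G) when both distributions are known on a finite support."""
    return sum(pd * math.log(dx) + pg * math.log(1.0 - dx)
               for dx, pd, pg in zip(d, p_data, p_g))

# Hypothetical three-point distributions (illustrative numbers only).
p_data = [0.5, 0.3, 0.2]
p_g = [0.2, 0.3, 0.5]

d_star = optimal_discriminator(p_data, p_g)
v_star = value(d_star, p_data, p_g)

# Nudging D* up or down at every point should only lower V.
v_up = value([min(dx + 0.05, 0.99) for dx in d_star], p_data, p_g)
v_down = value([max(dx - 0.05, 0.01) for dx in d_star], p_data, p_g)
```

Where the two distributions assign equal mass (the middle point here), $D^{*}$ is exactly 0.5, which matches the equilibrium behaviour described earlier.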

Generator Loss (non-saturating variant):

In practice, the generator is trained to maximize $\log D(G(z))$ rather than minimize $\log (1 - D(G(z)))$ to avoid vanishing gradients early in training:

$\mathcal{L}_{G} = - \mathbb{E}_{z \sim p_{z}} [\log D(G(z))]$
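To see why the non-saturating form helps, one can compare the gradients of the two generator losses with respect to the discriminator's pre-sigmoid score $c$, assuming $D = \sigma(c)$. The sketch below is plain Python, not tch-rs code; the helper names and the score value are hypothetical.

```python
import math

def sigmoid(c):
    return 1.0 / (1.0 + math.exp(-c))

def grad_saturating(c):
    """d/dc of log(1 - sigmoid(c)), the original minimax generator loss."""
    return -sigmoid(c)

def grad_non_saturating(c):
    """d/dc of -log(sigmoid(c)), the non-saturating generator loss."""
    return sigmoid(c) - 1.0

# Early in training the discriminator easily rejects fakes: D(G(z)) is near 0.
c = -6.0
g_sat = abs(grad_saturating(c))      # tiny: the minimax loss has saturated
g_ns = abs(grad_non_saturating(c))   # close to 1: still a useful signal
```

Both losses push the score in the same direction; only the magnitude of the signal differs when the discriminator is confident that a sample is fake.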

Relativistic Discriminator:

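The relativistic average discriminator (RaD) modifies the discriminator output:

$D_{Ra}(x, \tilde{x}) = \sigma(C(x) - \mathbb{E}_{\tilde{x}} [C(\tilde{x})])$

where $C(\cdot)$ is the critic (the discriminator's output before the sigmoid $\sigma$), $x$ is a real sample, and $\tilde{x} = G(z)$ is a generated sample. With the arguments swapped, $D_{Ra}(\tilde{x}, x) = \sigma(C(\tilde{x}) - \mathbb{E}_{x} [C(x)])$ scores a generated sample relative to the average real one.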

The relativistic losses become:

$\mathcal{L}_{D} = - \mathbb{E}_{x} [\log D_{Ra}(x, \tilde{x})] - \mathbb{E}_{\tilde{x}} [\log (1 - D_{Ra}(\tilde{x}, x))]$

$\mathcal{L}_{G} = - \mathbb{E}_{\tilde{x}} [\log D_{Ra}(\tilde{x}, x)] - \mathbb{E}_{x} [\log (1 - D_{Ra}(x, \tilde{x}))]$
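The relativistic losses can be sketched for a single batch of critic scores, replacing each expectation with a batch mean. This is plain Python for illustration, not the tch-rs implementation; the function name and the score values are hypothetical, and the code is not hardened against extreme scores (a real implementation would typically use a numerically stable log-sigmoid).

```python
import math

def sigmoid(c):
    return 1.0 / (1.0 + math.exp(-c))

def mean(values):
    return sum(values) / len(values)

def relativistic_average_losses(c_real, c_fake):
    """Return (L_D, L_G) for batches of critic scores C(x) and C(x~)."""
    mean_real, mean_fake = mean(c_real), mean(c_fake)
    # D_Ra(x, x~): each real sample scored relative to the average fake.
    d_real = [sigmoid(c - mean_fake) for c in c_real]
    # D_Ra(x~, x): each fake sample scored relative to the average real.
    d_fake = [sigmoid(c - mean_real) for c in c_fake]
    loss_d = (-mean([math.log(p) for p in d_real])
              - mean([math.log(1.0 - p) for p in d_fake]))
    loss_g = (-mean([math.log(p) for p in d_fake])
              - mean([math.log(1.0 - p) for p in d_real]))
    return loss_d, loss_g

# Critic clearly prefers real samples: small L_D, large L_G.
loss_d, loss_g = relativistic_average_losses([2.0, 1.0], [-1.0, -2.0])

# Critic cannot separate the two: both losses equal log 4.
tie_d, tie_g = relativistic_average_losses([0.0, 0.0], [0.0, 0.0])
```

Note that `loss_g` uses both `d_fake` and `d_real`, which is how the generator obtains gradient signal from real samples as well as generated ones.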

Convergence:

Under ideal conditions (infinite capacity, sufficient training), the global optimum of the minimax game is achieved when $p_{g} = p_{data}$, at which point the Jensen-Shannon divergence between the two distributions is zero.
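The link to the Jensen-Shannon divergence comes from substituting the optimal discriminator into the value function, which gives $V(D^{*}, G) = -\log 4 + 2 \cdot \mathrm{JSD}(p_{data} \,\|\, p_{g})$. The sketch below checks this identity on a small finite support in plain Python; the distributions and helper names are assumptions made for the example, and every point is assumed to have non-zero mass under both distributions.

```python
import math

def kl(p, q):
    """KL(p || q) on a finite support; terms with p(x) = 0 contribute 0."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0.0)

def jsd(p, q):
    """Jensen-Shannon divergence via the mixture m = (p + q) / 2."""
    m = [(px + qx) / 2.0 for px, qx in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def value_at_optimal_d(p_data, p_g):
    """V(D*, G) with D*(x) = p_data(x) / (p_data(x) + p_g(x))."""
    total = 0.0
    for pd, pg in zip(p_data, p_g):
        d = pd / (pd + pg)
        total += pd * math.log(d) + pg * math.log(1.0 - d)
    return total

# Hypothetical three-point distributions (illustrative numbers only).
p_data = [0.5, 0.3, 0.2]
p_g = [0.2, 0.3, 0.5]

v_mismatch = value_at_optimal_d(p_data, p_g)
v_matched = value_at_optimal_d(p_data, p_data)  # generator matches the data
```

When the generator matches the data distribution, the divergence term vanishes and the value reaches its minimum over generators, $-\log 4$.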

Related Pages

Implementation:LaurentMazare_Tch_rs_GAN_Example
