Principle:NVIDIA NeMo Aligner DRaFT Plus Image Alignment

From Leeroopedia


Knowledge Sources
Domains: Multimodal, Image Generation, Diffusion Models, Alignment
Last Updated: 2026-02-08 00:00 GMT

Overview

DRaFT+ (Direct Reward Fine-Tuning Plus) aligns diffusion models with human preferences by backpropagating the gradient of a differentiable reward model through the denoising process. It keeps this affordable by truncating backpropagation to the final diffusion steps, and it adds a KL divergence penalty to prevent mode collapse.

Description

DRaFT+ is a method for fine-tuning text-to-image diffusion models (such as Stable Diffusion and SDXL) to better align with human aesthetic preferences. It works by:

  1. Differentiable reward optimization -- During training, the model generates images through its standard denoising process. The generated images are passed through a reward model (e.g., a CLIP-based model trained on human preference data), and the reward signal is backpropagated through the diffusion process to update the model parameters.
  2. Truncated backpropagation -- Computing gradients through the entire denoising chain (e.g., 50 DDIM steps) would be prohibitively expensive. DRaFT+ backpropagates only through the final truncation_steps steps of the denoising process. Earlier steps run under torch.no_grad(), which sharply reduces memory and compute requirements while still providing a meaningful gradient signal.
  3. KL divergence regularization -- To prevent the fine-tuned model from collapsing to a narrow mode of high-reward images, DRaFT+ computes a Gaussian KL penalty between the noise predictions (epsilon) of the fine-tuned model and those of the original (frozen) base model. This penalty, weighted by kl_coeff, keeps the fine-tuned model close to the original distribution.
  4. Annealed sampling -- At inference time, DRaFT+ supports annealed guidance, which interpolates between the base model's score function and the fine-tuned model's score function at each denoising step using a configurable weighing function (linear, power, step, etc.). This enables smooth control over the alignment-diversity tradeoff.
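The truncated backpropagation step can be sketched in a few lines of PyTorch. This is a toy illustration under stated assumptions, not the NeMo Aligner implementation: `truncated_rollout` is a hypothetical helper, the `nn.Linear` layer stands in for the U-Net, the update rule `x - 0.1 * denoiser(x)` is a placeholder rather than a real DDIM step, and the reward is an arbitrary differentiable function.

```python
import torch
from torch import nn


def truncated_rollout(denoiser, x_T, num_steps, truncation_steps):
    """Run a toy denoising loop, tracking gradients only for the last
    `truncation_steps` steps (hypothetical helper, not a NeMo API)."""
    x = x_T
    first_grad_step = num_steps - truncation_steps
    for i in range(num_steps):
        if i < first_grad_step:
            # Early steps: no autograd graph is built, so no activations are stored.
            with torch.no_grad():
                x = x - 0.1 * denoiser(x)
        else:
            # Final steps: gradients flow back to the denoiser parameters.
            x = x - 0.1 * denoiser(x)
    return x


torch.manual_seed(0)
denoiser = nn.Linear(4, 4)      # assumed stand-in for the diffusion U-Net
x_T = torch.randn(2, 4)         # initial Gaussian noise
x_0 = truncated_rollout(denoiser, x_T, num_steps=10, truncation_steps=2)

reward = -(x_0 ** 2).mean()     # toy differentiable reward (assumption)
loss = -reward                  # maximize reward by minimizing its negative
loss.backward()                 # gradients come only from the last 2 steps
```

Because the first eight steps run without a graph, memory use scales with `truncation_steps` rather than with the full number of denoising steps.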

The framework supports both Stable Diffusion (SD) via MegatronSDDRaFTPModel and Stable Diffusion XL (SDXL) via MegatronSDXLDRaFTPModel, as well as full fine-tuning and PEFT (LoRA) approaches.

Usage

DRaFT+ is used when:

  • You want to improve the aesthetic quality of a text-to-image diffusion model according to human preferences.
  • You have a trained reward model (e.g., PickScore, HPSv2) that scores image-text alignment quality.
  • You want to fine-tune while preventing mode collapse via KL regularization.
  • You want to control the alignment-diversity tradeoff at inference time via annealed sampling.

Theoretical Basis

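Training Objective: $\mathcal{L} = -\mathbb{E}_{x_T \sim \mathcal{N}(0, I),\, c \sim \mathcal{D}}\left[ R(G_\theta(x_T, c), c) \right] + \lambda D_{KL}$

where $R$ is the reward model, $G_\theta$ is the diffusion generator, $c$ is the text conditioning drawn from the prompt distribution $\mathcal{D}$, and $\lambda$ is the KL coefficient.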
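As a small worked example, the objective can be read as a loss to minimize: the negated mean reward over a batch plus the KL penalty scaled by the coefficient. The sketch below assumes that reading; `draft_plus_loss` and its argument names are hypothetical, and the numbers are made up for illustration.

```python
def draft_plus_loss(rewards, kl_penalty, kl_coeff):
    """Negated batch-mean reward plus weighted KL penalty (illustrative only)."""
    mean_reward = sum(rewards) / len(rewards)
    return -mean_reward + kl_coeff * kl_penalty


# Two sampled images with rewards 1.0 and 3.0, a KL penalty of 0.5,
# and kl_coeff = 0.1 give -2.0 + 0.05.
loss = draft_plus_loss([1.0, 3.0], kl_penalty=0.5, kl_coeff=0.1)
```

A larger `kl_coeff` makes deviations from the base model more costly relative to reward gains.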

Gaussian KL Penalty (shared variance): $D_{KL} = \frac{1}{d} \sum_{t \in \text{trunc}} \left\| \epsilon_\theta(x_t, t, c) - \epsilon_{\theta_0}(x_t, t, c) \right\|^2$

where $d$ is the dimensionality of the noise prediction, $\epsilon_\theta$ is the fine-tuned model's noise prediction, and $\epsilon_{\theta_0}$ is the frozen base model's noise prediction. The sum is over the truncated denoising steps.
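The penalty can be illustrated with plain Python, assuming each noise prediction has been flattened to a list of d floats and that one prediction pair is available per truncated step. The function name `gaussian_kl_penalty` is hypothetical and the inputs are toy values.

```python
def gaussian_kl_penalty(eps_finetuned, eps_base):
    """Sum of squared differences over truncated steps, divided by d.

    Each argument is a list (one entry per truncated step) of flat lists
    of floats with the same length d.
    """
    d = len(eps_finetuned[0])
    total = 0.0
    for step_ft, step_base in zip(eps_finetuned, eps_base):
        total += sum((a - b) ** 2 for a, b in zip(step_ft, step_base))
    return total / d


# Two truncated steps, d = 2: squared distances are 5.0 and 4.0.
penalty = gaussian_kl_penalty(
    [[1.0, 2.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
)
```

The penalty is zero exactly when the fine-tuned and base predictions agree at every truncated step.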

Annealed Sampling:

At inference step $i$ out of $T$ total steps, the combined noise prediction is: $\epsilon_{\text{combined}} = (1 - w(i, T))\,\epsilon_{\theta_0} + w(i, T)\,\epsilon_\theta$

where w(i,T) is a weighing function such as:

  • Linear: $w(i, T) = i / T$
  • Power: $w(i, T) = (i / T)^p$
  • Step: $w(i, T) = \mathbb{1}[\, i / T \geq \tau \,]$
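The weighing functions and the interpolation they drive can be sketched as follows. This is a minimal illustration, not the library's API: the function names and default values are hypothetical, noise predictions are flat lists of floats, and the step schedule is assumed to switch on once i/T reaches the threshold tau.

```python
def linear_weight(i, T):
    """Weight grows linearly from 0 toward 1 over the denoising steps."""
    return i / T


def power_weight(i, T, p=2.0):
    """Power schedule; p > 1 keeps the base model dominant for longer."""
    return (i / T) ** p


def step_weight(i, T, tau=0.5):
    """Hard switch to the fine-tuned model once i/T reaches tau (assumed >=)."""
    return 1.0 if i / T >= tau else 0.0


def combine_eps(eps_base, eps_finetuned, w):
    """Blend base and fine-tuned noise predictions with weight w in [0, 1]."""
    return [(1.0 - w) * b + w * f for b, f in zip(eps_base, eps_finetuned)]


# Step 5 of 10 with the power schedule (p = 2) gives w = 0.25,
# so the blend sits three quarters of the way toward the base model.
w = power_weight(5, 10, p=2.0)
eps = combine_eps([0.0, 2.0], [4.0, 2.0], w)
```

With w = 0 the sampler reproduces the base model, and with w = 1 it uses only the fine-tuned model, so the schedule determines how late in sampling the alignment signal takes over.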
