Principle:NVIDIA NeMo Aligner DRaFT Plus Image Alignment
| Knowledge Sources | |
|---|---|
| Domains | Multimodal, Image Generation, Diffusion Models, Alignment |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
DRaFT+ (Differentiable Reward Fine-Tuning Plus) aligns diffusion models with human preferences by backpropagating the gradient of a reward model through the denoising process. It truncates backpropagation to the final denoising steps to bound cost, and adds a KL divergence penalty to prevent mode collapse.
Description
DRaFT+ is a method for fine-tuning text-to-image diffusion models (such as Stable Diffusion and SDXL) so that their outputs better match human aesthetic preferences. It combines four components:
- Differentiable reward optimization -- During training, the model generates images through its standard denoising process. The generated images are passed through a reward model (e.g., a CLIP-based model trained on human preference data), and the reward signal is backpropagated through the diffusion process to update the model parameters.
- Truncated backpropagation -- Computing gradients through the entire denoising chain (e.g., 50 DDIM steps) would be prohibitively expensive. DRaFT+ uses truncated backpropagation through only the final truncation_steps of the denoising process. Earlier steps are run with torch.no_grad(), dramatically reducing memory and compute requirements while still providing meaningful gradient signal.
- KL divergence regularization -- To prevent the fine-tuned model from collapsing to a narrow mode of high-reward images, DRaFT+ computes a Gaussian KL penalty between the noise predictions (epsilon) of the fine-tuned model and those of the original (frozen) base model. This penalty, weighted by kl_coeff, ensures the fine-tuned model stays close to the original distribution.
- Annealed sampling -- At inference time, DRaFT+ supports annealed guidance, which interpolates between the base model's score function and the fine-tuned model's score function at each denoising step using a configurable weighting function (linear, power, step, etc.). This enables smooth control over the alignment-diversity tradeoff.
The framework supports both Stable Diffusion (SD) via MegatronSDDRaFTPModel and Stable Diffusion XL (SDXL) via MegatronSDXLDRaFTPModel, as well as full fine-tuning and PEFT (LoRA) approaches.
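The truncated backpropagation described above can be illustrated with a toy scalar example. This is only a sketch under stated assumptions, not the NeMo Aligner implementation: the "denoiser" is a hypothetical one-parameter update x <- (1 - theta) * x, the reward is a hypothetical -(x - target)^2, and a hand-written forward-mode derivative stands in for autograd. The function name and all parameters are illustrative. Steps before the truncation window carry no derivative, which mirrors running them under torch.no_grad().

```python
def truncated_reward_grad(theta, x0, num_steps, truncation_steps, target=0.0):
    """Return (reward, d reward / d theta) for a toy scalar denoising chain.

    Hypothetical toy model: each step applies x <- (1 - theta) * x.
    The derivative is tracked only in the final `truncation_steps` steps.
    """
    if not 0 <= truncation_steps <= num_steps:
        raise ValueError("truncation_steps must lie in [0, num_steps]")

    x = x0
    dx = 0.0  # d x / d theta; stays zero while gradients are not tracked
    first_tracked = num_steps - truncation_steps

    for step in range(num_steps):
        if step >= first_tracked:
            # Tracked step: product rule on (1 - theta) * x, using the old x.
            dx = (1.0 - theta) * dx - x
        # Untracked steps leave dx at zero (analogue of torch.no_grad()).
        x = (1.0 - theta) * x

    reward = -(x - target) ** 2
    grad = -2.0 * (x - target) * dx  # chain rule through the toy reward
    return reward, grad


# Full backpropagation versus backpropagation through the last step only.
_, full_grad = truncated_reward_grad(0.1, 1.0, num_steps=5, truncation_steps=5)
_, last_step_grad = truncated_reward_grad(0.1, 1.0, num_steps=5, truncation_steps=1)
print(full_grad, last_step_grad)
```

With truncation_steps equal to num_steps the result is the full gradient; smaller values give a cheaper approximation that treats the state entering the truncation window as a constant.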
Usage
DRaFT+ is used when:
- You want to improve the aesthetic quality of a text-to-image diffusion model according to human preferences.
- You have a trained reward model (e.g., PickScore, HPSv2) that scores image-text alignment quality.
- You want to fine-tune while preventing mode collapse via KL regularization.
- You want to control the alignment-diversity tradeoff at inference time via annealed sampling.
Theoretical Basis
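Training Objective:
$$\mathcal{L}(\theta) = -\,\mathbb{E}_{c}\big[\, r(G_\theta(c),\, c) \,\big] + \beta \cdot \mathrm{KL}$$
where $r$ is the reward model, $G_\theta$ is the diffusion generator, $c$ is the text conditioning, and $\beta$ is the KL coefficient.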
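Gaussian KL Penalty (shared variance):
$$\mathrm{KL} = \frac{1}{d} \sum_{t} \left\| \epsilon_\theta^{(t)} - \epsilon_{\mathrm{base}}^{(t)} \right\|^2$$
where $d$ is the dimensionality of the noise prediction, $\epsilon_\theta^{(t)}$ is the fine-tuned model's noise prediction, and $\epsilon_{\mathrm{base}}^{(t)}$ is the frozen base model's noise prediction. The sum is over the truncated denoising steps.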
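As a concrete illustration of the penalty and how it enters the objective, the sketch below assumes the per-step penalty is the squared difference between the two noise predictions divided by their dimensionality, and that the loss is the negative reward plus the KL coefficient times the summed penalty. The function names are hypothetical, and the noise predictions are flattened into plain Python lists for simplicity.

```python
def gaussian_kl_penalty_shared_var(curr_eps, init_eps):
    """Squared difference of two flat noise predictions, divided by dimension.

    curr_eps: noise prediction of the fine-tuned model (list of floats).
    init_eps: noise prediction of the frozen base model (list of floats).
    """
    if len(curr_eps) != len(init_eps):
        raise ValueError("noise predictions must have the same dimensionality")
    d = len(curr_eps)
    return sum((c - b) ** 2 for c, b in zip(curr_eps, init_eps)) / d


def draft_plus_loss(reward, kl_per_step, kl_coeff):
    """Negative reward plus the weighted KL penalty summed over tracked steps."""
    return -reward + kl_coeff * sum(kl_per_step)


# One tracked step with a 4-dimensional noise prediction.
kl = gaussian_kl_penalty_shared_var([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 3.0, 2.0])
loss = draft_plus_loss(reward=0.5, kl_per_step=[kl], kl_coeff=0.1)
print(kl, loss)
```

A larger kl_coeff pulls the fine-tuned predictions toward those of the base model; setting it to zero leaves only the reward term.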
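Annealed Sampling:
At inference step $t$ out of $T$ total steps, the combined noise prediction is:
$$\hat{\epsilon}^{(t)} = \big(1 - w(t)\big)\, \epsilon_{\mathrm{base}}^{(t)} + w(t)\, \epsilon_\theta^{(t)}$$
where $w(t) \in [0, 1]$ is a weighting function such as:
- Linear: $w(t) = t / T$
- Power: $w(t) = (t / T)^p$ for an exponent $p > 0$
- Step: $w(t) = 1$ if $t / T \ge \tau$ and $0$ otherwise, for a threshold $\tau$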
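The weighting schedules can be sketched in a few lines of Python. This is an illustrative sketch, not the library's API: the function names, the `power` and `threshold` parameters, and the convention that the weight grows from 0 (base model only) to 1 (fine-tuned model only) as sampling progresses are all assumptions made for the example.

```python
def annealing_weight(step, total_steps, kind="linear", power=2.0, threshold=0.5):
    """Weight on the fine-tuned model at a given inference step (hypothetical)."""
    frac = step / total_steps  # fraction of sampling completed, in [0, 1]
    if kind == "linear":
        return frac
    if kind == "power":
        return frac ** power
    if kind == "step":
        return 1.0 if frac >= threshold else 0.0
    raise ValueError(f"unknown weighting function: {kind}")


def combine_eps(eps_base, eps_finetuned, w):
    """Interpolate two flat noise predictions with weight w on the fine-tuned one."""
    return [(1.0 - w) * b + w * f for b, f in zip(eps_base, eps_finetuned)]


# Halfway through sampling with a linear schedule, both models count equally.
w = annealing_weight(step=5, total_steps=10, kind="linear")
print(w, combine_eps([0.0, 2.0], [1.0, 4.0], w))
```

Under this convention, a schedule that keeps the weight low for longer (for example a power schedule with a large exponent) stays closer to the base model and favors diversity, while a schedule that rises early favors alignment with the reward.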