Principle:Deepseek ai Janus CFG Input Preparation for Flow
| Knowledge Sources | |
|---|---|
| Domains | Image_Generation, Guided_Generation |
| Last Updated | 2026-02-10 09:30 GMT |
Overview
A technique for constructing paired conditional and unconditional inputs for classifier-free guidance in rectified flow image generation, including attention mask construction for the ODE denoising loop.
Description
CFG input preparation for rectified flow differs from autoregressive CFG in several ways:
- Batch structure: First half of the batch is conditional, second half is unconditional (instead of interleaved even/odd rows)
- Last token removal: The final <begin_of_image> token is removed from embeddings because it will be replaced by the timestep embedding at each denoising step
- Attention mask: An explicit attention mask is constructed with 1s for every position of the conditional rows and 0s for the pad-masked prompt tokens of the unconditional rows, so the unconditional branch does not attend to its padded prompt
- Extended mask: The attention mask accounts for the full sequence length including prompt + timestep + 576 image latent tokens
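The four points above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: NumPy stands in for the tensor library, `prepare_cfg_inputs`, `embed_table`, `pad_id`, and the toy prompt are hypothetical names and values, and the sketch assumes the first (BOS) position stays visible in the unconditional rows.

```python
import numpy as np

NUM_IMAGE_TOKENS = 576  # image latent tokens, as stated in the description


def prepare_cfg_inputs(input_ids, embed_table, pad_id, parallel_size):
    """Build paired conditional/unconditional embeddings and the attention mask."""
    input_ids = np.asarray(input_ids)
    # Rows [0, parallel_size) are conditional; the remaining rows are unconditional.
    tokens = np.tile(input_ids, (2 * parallel_size, 1))
    # Unconditional half: overwrite everything after the first token with padding.
    tokens[parallel_size:, 1:] = pad_id
    # Embedding lookup, then drop the final <begin_of_image> position;
    # the timestep embedding takes its place at every denoising step.
    inputs_embeds = embed_table[tokens][:, :-1, :]
    prompt_len = inputs_embeds.shape[1]
    # The mask spans prompt + 1 timestep token + image latent tokens.
    attention_mask = np.ones(
        (2 * parallel_size, prompt_len + 1 + NUM_IMAGE_TOKENS), dtype=np.int64
    )
    # Hide the padded prompt positions of the unconditional half
    # (assumption: position 0 remains attended).
    attention_mask[parallel_size:, 1:prompt_len] = 0
    return inputs_embeds, attention_mask


rng = np.random.default_rng(0)
embed_table = rng.standard_normal((100, 8))  # hypothetical: vocab 100, hidden size 8
input_ids = [1, 17, 42, 23, 99]              # hypothetical prompt; last id plays <begin_of_image>
inputs_embeds, attention_mask = prepare_cfg_inputs(
    input_ids, embed_table, pad_id=0, parallel_size=5
)
print(inputs_embeds.shape)   # (10, 4, 8)
print(attention_mask.shape)  # (10, 581)
```

Keeping both branches in one batch lets a single forward pass per denoising step produce the conditional and unconditional predictions together.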
Usage
Use this principle after prompt formatting and before the noise initialization step in the JanusFlow pipeline.
Theoretical Basis
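The CFG formula for velocity predictions in rectified flow:
$$v = w \cdot v_{\text{cond}} - (w - 1) \cdot v_{\text{uncond}}$$
Where $v_{\text{cond}}$ and $v_{\text{uncond}}$ are the velocities predicted for the conditional and unconditional halves of the batch, and $w$ is the CFG weight (typically 2.0 for JanusFlow). Equivalently, $v = v_{\text{uncond}} + w\,(v_{\text{cond}} - v_{\text{uncond}})$.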
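As a rough sketch of how the guided velocity could be formed from the stacked batch at each denoising step: the predicted velocities are split into their conditional and unconditional halves and combined with the weight w. NumPy is used here as a stand-in for the tensor library, and `cfg_velocity` plus the toy shapes and values are hypothetical.

```python
import numpy as np


def cfg_velocity(v, cfg_weight):
    """Combine a stacked [conditional; unconditional] velocity batch into one guided batch."""
    # First half of the batch is conditional, second half unconditional.
    v_cond, v_uncond = np.split(v, 2, axis=0)
    # Same as v_uncond + cfg_weight * (v_cond - v_uncond).
    return cfg_weight * v_cond - (cfg_weight - 1.0) * v_uncond


parallel_size = 5
v = np.concatenate([
    np.full((parallel_size, 4), 1.0),  # toy conditional velocities
    np.full((parallel_size, 4), 0.5),  # toy unconditional velocities
])
guided = cfg_velocity(v, cfg_weight=2.0)
print(guided.shape)  # (5, 4)
print(guided[0, 0])  # 1.5
```

With w = 1 the expression reduces to the conditional velocity alone; larger values push the result further away from the unconditional prediction.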
Batch structure (for parallel_size=5):
- Rows 0-4: Conditional (full prompt embeddings)
- Rows 5-9: Unconditional (pad-masked prompt embeddings)
- Attention mask: 1s for rows 0-4, 0s for prompt region of rows 5-9
Related Pages
Implemented By
Uses Heuristic