
Principle: DeepSeek-AI Janus CFG Input Preparation

From Leeroopedia


Knowledge Sources
Domains Image_Generation, Guided_Generation
Last Updated 2026-02-10 09:30 GMT

Overview

A technique for constructing paired conditional and unconditional input embeddings to enable classifier-free guidance during autoregressive image token generation.

Description

Classifier-Free Guidance (CFG) improves the quality and text-alignment of generated images by computing two forward passes per generation step: one conditional (with the full text prompt) and one unconditional (with the prompt replaced by padding tokens). The final logits are a weighted combination of both.

For CFG input preparation, the tokenized prompt is duplicated into paired rows: even-indexed rows contain the full conditional prompt, while odd-indexed rows have the content tokens replaced with the pad token ID (keeping only structural tokens). Both are embedded into the language model's embedding space.
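The pairing described above can be sketched as follows. This is a minimal illustration, not the Janus source: the function name `prepare_cfg_inputs` and its arguments are hypothetical, and it assumes the prompt's first and last tokens are the structural ones to preserve in the unconditional rows.

```python
import torch

def prepare_cfg_inputs(input_ids, embed_tokens, pad_token_id, parallel_size):
    """Duplicate a tokenized prompt into interleaved conditional /
    unconditional rows for classifier-free guidance.

    input_ids: (1, seq_len) tokenized prompt
    embed_tokens: the language model's token-embedding layer
    Returns embeddings of shape (parallel_size * 2, seq_len, hidden).
    """
    seq_len = input_ids.shape[-1]
    # One (cond, uncond) pair per image: total batch = parallel_size * 2.
    tokens = input_ids.expand(parallel_size * 2, seq_len).clone()
    # Odd rows become the unconditional branch: replace the content
    # tokens with the pad token, keeping the structural first and
    # last tokens of the formatted prompt.
    tokens[1::2, 1:-1] = pad_token_id
    # Embed both branches into the language model's embedding space.
    return embed_tokens(tokens)
```

Even rows then carry the full prompt and odd rows the pad-masked prompt, so a single forward pass produces both sets of logits needed for guidance.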

Usage

Use this principle after prompt formatting and before the autoregressive token generation loop. It sets up the parallel conditional/unconditional structure needed for CFG-guided generation.

Theoretical Basis

The CFG formula for the guided logits is:

logits_guided = logits_uncond + w · (logits_cond − logits_uncond)

where w is the CFG weight (typically 5.0 for autoregressive image generation in Janus).
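Given logits for an interleaved batch (even rows conditional, odd rows unconditional), the formula reduces to a one-line combination. A minimal sketch, with the helper name `apply_cfg` chosen for illustration:

```python
import torch

def apply_cfg(logits, cfg_weight=5.0):
    """Combine interleaved conditional (even rows) and unconditional
    (odd rows) logits with the CFG formula:
        guided = uncond + w * (cond - uncond)
    logits: (parallel_size * 2, vocab) -> returns (parallel_size, vocab).
    """
    cond = logits[0::2]    # conditional branch
    uncond = logits[1::2]  # unconditional branch
    return uncond + cfg_weight * (cond - uncond)
```

With w = 1 this recovers the conditional logits unchanged; w > 1 pushes the distribution further toward the text-conditioned prediction.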

Input structure for parallel_size images:

  • Total batch size: parallel_size × 2
  • Even rows (0, 2, 4, ...): Full conditional prompt embeddings
  • Odd rows (1, 3, 5, ...): Unconditional (pad-masked) prompt embeddings
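Putting the pieces together, one step of the generation loop might look like the sketch below. This is an assumed interface, not the Janus API: `model` is any callable returning next-token logits of shape (parallel_size × 2, vocab) for the interleaved batch, and `embed_tokens` is the token-embedding layer. The key detail is that each sampled image token is fed back to both rows of its pair.

```python
import torch

def cfg_step(model, inputs_embeds, embed_tokens, cfg_weight=5.0):
    """One hypothetical CFG-guided generation step.

    Returns the sampled tokens (parallel_size, 1) and the embeddings
    to append to BOTH rows of each cond/uncond pair at the next step.
    """
    logits = model(inputs_embeds)            # both branches in one pass
    cond, uncond = logits[0::2], logits[1::2]
    guided = uncond + cfg_weight * (cond - uncond)
    probs = torch.softmax(guided, dim=-1)
    next_token = torch.multinomial(probs, num_samples=1)  # (parallel_size, 1)
    # The same sampled token continues the conditional AND the
    # unconditional sequence, so the pair stays aligned.
    paired = next_token.repeat_interleave(2, dim=0)       # (parallel_size * 2, 1)
    return next_token, embed_tokens(paired)
```

Only the prompt differs between the two rows of a pair; from the first generated token onward, both rows see identical image tokens.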

Related Pages

Implemented By

Uses Heuristic
