Heuristic:Deepseek ai Janus CFG Weight Tuning
| Knowledge Sources | |
|---|---|
| Domains | Generative_Models, Optimization, Computer_Vision |
| Last Updated | 2026-02-10 09:30 GMT |
Overview
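Set the Classifier-Free Guidance (CFG) weight to 5 for Janus/Janus-Pro (autoregressive) and 2 for JanusFlow (rectified flow). Both pipelines use the same linear extrapolation between conditional and unconditional predictions, written in two algebraically equivalent forms, but apply it in different spaces: token logits for Janus and velocity fields for JanusFlow.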
Description
Classifier-Free Guidance (CFG) is used in both Janus image generation pipelines to improve adherence to text prompts. The two architectures apply guidance in different spaces and ship with different default weights: Janus (autoregressive) guides discrete token logits, while JanusFlow (rectified flow) guides continuous velocities. The two formulas look different in the code, but expanding `uncond + w * (cond - uncond)` gives `w * cond - (w - 1) * uncond`, so they are the same extrapolation. What differs is the space and the default weight. Carrying one architecture's weight over to the other tends to produce either washed-out or over-saturated images. A bug in the original `tokenizer_config.json` prevented CFG from functioning properly; it was fixed on 2024-10-20.
Usage
Apply this heuristic when configuring image generation parameters for any Janus text-to-image workflow. The CFG weight is one of the most impactful parameters for image quality: when CFG was misconfigured by the tokenizer bug, the result was "relatively poor visual generation quality", as noted in the README changelog.
The Insight (Rule of Thumb)
- Action (Janus/Janus-Pro): Set `cfg_weight=5` (range 1-10). CFG formula: `logits = logit_uncond + cfg_weight * (logit_cond - logit_uncond)`
- Action (JanusFlow): Set `cfg_weight=2` (range 1-10). CFG formula: `v = cfg_weight * v_cond - (cfg_weight - 1) * v_uncond`
- Value: Janus default = 5, JanusFlow default = 2
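- Trade-off: Higher CFG weight increases prompt adherence but reduces diversity and can introduce artifacts. Lower weight produces more diverse but less prompt-faithful images. At `cfg_weight=1` both formulas reduce to the conditional prediction alone, so guidance has no effect.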
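The defaults and range listed above can be captured in a small lookup. This is a minimal sketch, not code from the Janus repository: `DEFAULT_CFG_WEIGHT`, `CFG_RANGE`, and `resolve_cfg_weight` are hypothetical names introduced for illustration, and the 1-10 range is simply the one quoted in the bullets.

```python
# Hypothetical helper: these names are illustrative and do not exist in the Janus repo.
DEFAULT_CFG_WEIGHT = {
    "janus": 5.0,       # autoregressive, logit-space guidance
    "janus-pro": 5.0,   # same pipeline family as Janus
    "janusflow": 2.0,   # rectified flow, velocity-space guidance
}
CFG_RANGE = (1.0, 10.0)  # range quoted on this page


def resolve_cfg_weight(architecture, override=None):
    """Return the CFG weight to use, falling back to the per-architecture default."""
    key = architecture.lower()
    if key not in DEFAULT_CFG_WEIGHT:
        raise ValueError(f"unknown architecture: {architecture!r}")
    weight = DEFAULT_CFG_WEIGHT[key] if override is None else float(override)
    low, high = CFG_RANGE
    if not low <= weight <= high:
        raise ValueError(f"cfg_weight {weight} is outside [{low}, {high}]")
    return weight
```

Calling `resolve_cfg_weight("janusflow")` returns the JanusFlow default, while passing an `override` lets a caller experiment within the quoted range without accidentally reusing the other architecture's default.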
Reasoning
The two architectures require different CFG weights because they apply guidance in fundamentally different spaces:
Janus (autoregressive): CFG operates on discrete token logits. The formula `logit_uncond + cfg_weight * (logit_cond - logit_uncond)` scales the difference between conditional and unconditional predictions. A weight of 5 provides strong guidance without distorting the logit distribution too heavily.
JanusFlow (rectified flow): CFG operates on continuous velocity fields in the ODE solver. The formula `cfg_weight * v_cond - (cfg_weight - 1) * v_uncond` is the same extrapolation in rearranged form, applied to velocity vectors instead of logits. The guided velocity is integrated at every solver step, so over-scaling accumulates along the trajectory; the demo therefore defaults to the lower weight of 2.
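The two expressions quoted above are written differently, but expanding `uncond + w * (cond - uncond)` gives `w * cond - (w - 1) * uncond`. The sketch below checks this numerically on plain Python lists. The function names and sample values are illustrative assumptions; the real pipelines operate on tensors.

```python
def cfg_interpolation_form(cond, uncond, w):
    # Form quoted for Janus: uncond + w * (cond - uncond)
    return [u + w * (c - u) for c, u in zip(cond, uncond)]


def cfg_weighted_form(cond, uncond, w):
    # Form quoted for JanusFlow: w * cond - (w - 1) * uncond
    return [w * c - (w - 1.0) * u for c, u in zip(cond, uncond)]


def max_abs_difference(cond, uncond, w):
    # Largest element-wise gap between the two forms for a given weight.
    a = cfg_interpolation_form(cond, uncond, w)
    b = cfg_weighted_form(cond, uncond, w)
    return max(abs(x - y) for x, y in zip(a, b))


# Arbitrary sample predictions (illustrative values only).
cond = [2.0, 0.5, -1.0]
uncond = [1.0, 1.0, 0.0]
gaps = {w: max_abs_difference(cond, uncond, w) for w in (1.0, 2.0, 5.0)}
```

Every entry in `gaps` is zero up to floating-point rounding, which suggests the difference between the two pipelines lies in where guidance is applied and in the default weight, not in the formula itself.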
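Critical historical note: A bug in `tokenizer_config.json` (fixed 2024-10-20) caused CFG to not function properly. This page attributes it to an incorrect `pad_id`, which meant the unconditional rows were not properly masked, so the conditional and unconditional predictions no longer differed in the intended way and generation quality was poor. If the model files were downloaded before that date, make sure the tokenizer config is up to date.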
Code Evidence
Janus CFG formula from `generation_inference.py:87-88`:
logits = logit_uncond + cfg_weight * (logit_cond-logit_uncond)
probs = torch.softmax(logits / temperature, dim=-1)
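To illustrate what the two lines above do to the sampling distribution, here is a dependency-free sketch of logit-space guidance followed by a temperature softmax. It uses plain lists instead of tensors; `softmax` and `guided_probs` are hypothetical helper names and the logit values are made up for the example.

```python
import math


def softmax(values, temperature=1.0):
    # Numerically stable softmax over a list of logits.
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def guided_probs(logit_cond, logit_uncond, cfg_weight=5.0, temperature=1.0):
    # Same extrapolation as the Janus snippet, element by element.
    logits = [u + cfg_weight * (c - u) for c, u in zip(logit_cond, logit_uncond)]
    return softmax(logits, temperature)


# Illustrative logits: the prompt favours token 0 slightly more than
# the unconditional (padded-prompt) pass does.
logit_cond = [1.0, 0.0, 0.0]
logit_uncond = [0.5, 0.0, 0.0]

p_plain = softmax(logit_cond)
p_guided = guided_probs(logit_cond, logit_uncond, cfg_weight=5.0)
```

In this toy case the guided distribution puts more mass on token 0 than the plain conditional distribution does, which is the intended sharpening effect; a larger `cfg_weight` widens the gap further.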
JanusFlow CFG formula from `demo/app_janusflow.py:130`:
v = cfg_weight * v_cond - (cfg_weight-1.) * v_uncond
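The velocity formula above is evaluated inside an ODE solver loop. The sketch below shows where it would sit in a simple Euler integrator from t = 0 to t = 1. This is an assumption made for illustration: the step count, the scalar state, and the two constant velocity functions stand in for the model's conditional and unconditional forward passes and are not taken from the repository.

```python
def guided_velocity(v_cond, v_uncond, cfg_weight=2.0):
    # Same expression as the JanusFlow snippet.
    return cfg_weight * v_cond - (cfg_weight - 1.0) * v_uncond


def euler_sample(z0, velocity_cond, velocity_uncond, num_steps=30, cfg_weight=2.0):
    # Integrate dz/dt = v(z, t) from t = 0 to t = 1 with fixed-size Euler steps.
    z = z0
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = guided_velocity(velocity_cond(z, t), velocity_uncond(z, t), cfg_weight)
        z = z + dt * v
    return z


# Toy velocity fields (hypothetical stand-ins for the two model passes).
def toy_v_cond(z, t):
    return 1.0


def toy_v_uncond(z, t):
    return 0.5


z_guided = euler_sample(0.0, toy_v_cond, toy_v_uncond, cfg_weight=2.0)
z_unguided = euler_sample(0.0, toy_v_cond, toy_v_uncond, cfg_weight=1.0)
```

Because the guided velocity is added at every step, any over-scaling accumulates along the whole trajectory. In the toy case the endpoint moves further from the unguided one as `cfg_weight` grows.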
Default CFG weights from `generation_inference.py:61` and `demo/app_janusflow.py:72`:
# Janus default
cfg_weight: float = 5,
# JanusFlow default
cfg_weight: float = 2.0,
CFG token masking pattern (even=conditional, odd=unconditional) from `generation_inference.py:69-73`:
tokens = torch.zeros((parallel_size*2, len(input_ids)), dtype=torch.int).cuda()
for i in range(parallel_size*2):
    tokens[i, :] = input_ids
    if i % 2 != 0:
        tokens[i, 1:-1] = vl_chat_processor.pad_id
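The even/odd layout above can be reproduced without a GPU to see what the batch looks like. The following is a list-based sketch of the same pattern; `build_cfg_rows` is a hypothetical name, and the token IDs and `pad_id=0` are made-up values, not the real tokenizer's.

```python
def build_cfg_rows(input_ids, parallel_size, pad_id):
    # Two rows per image: even index keeps the prompt (conditional),
    # odd index pads everything except the first and last token (unconditional).
    rows = []
    for i in range(parallel_size * 2):
        row = list(input_ids)
        if i % 2 != 0:
            row[1:-1] = [pad_id] * len(row[1:-1])
        rows.append(row)
    return rows


# Illustrative IDs: 100 and 200 play the role of the first and last tokens.
rows = build_cfg_rows([100, 11, 12, 13, 200], parallel_size=2, pad_id=0)

cond_rows = rows[0::2]    # even rows: full prompt
uncond_rows = rows[1::2]  # odd rows: prompt body replaced by pad_id
```

If `pad_id` is wrong, the odd rows no longer represent an empty prompt and the difference between conditional and unconditional predictions stops carrying the intended signal. That is consistent with the tokenizer config bug described on this page.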
README bug fix note from `README.md:71`:
2024.10.20: (1) Fix a bug in tokenizer_config.json. The previous version
caused classifier-free guidance to not function properly, resulting in
relatively poor visual generation quality.
Related Pages
- Implementation:Deepseek_ai_Janus_CFG_Input_Preparation_AR
- Implementation:Deepseek_ai_Janus_AR_Token_Generation_Loop
- Implementation:Deepseek_ai_Janus_CFG_Input_Preparation_Flow
- Implementation:Deepseek_ai_Janus_ODE_Denoising_Loop
- Principle:Deepseek_ai_Janus_CFG_Input_Preparation
- Principle:Deepseek_ai_Janus_CFG_Input_Preparation_for_Flow