Heuristic:Deepseek ai Janus CFG Weight Tuning

Knowledge Sources	Janus Janus demo default values
Domains	Generative_Models, Optimization, Computer_Vision
Last Updated	2026-02-10 09:30 GMT

Overview

Set the Classifier-Free Guidance weight to 5 for Janus/Janus-Pro (autoregressive, guidance applied to token logits) and to 2 for JanusFlow (rectified flow, guidance applied to velocities).

Description

Classifier-Free Guidance (CFG) is used in both Janus image generation pipelines to improve adherence to text prompts. The two architectures apply it to different quantities and ship with different default weights: Janus (autoregressive) guides next-token logits, while JanusFlow (rectified flow) guides the predicted velocity. The two code paths also write the formula differently, although the expressions are algebraically the same extrapolation from the unconditional prediction toward the conditional one. A weight that is too low tends to give washed-out images, and one that is too high tends to give over-saturated images. A bug in the original `tokenizer_config.json` prevented CFG from functioning properly; it was fixed on 2024-10-20.

Usage

Apply this heuristic when configuring image generation parameters for any Janus text-to-image workflow. The CFG weight is among the most impactful parameters for image quality: when CFG was broken by the tokenizer bug, the README changelog reported "relatively poor visual generation quality".

The Insight (Rule of Thumb)

Action (Janus/Janus-Pro): Set `cfg_weight=5` (range 1-10). CFG formula: `logits = logit_uncond + cfg_weight * (logit_cond - logit_uncond)`
Action (JanusFlow): Set `cfg_weight=2` (range 1-10). CFG formula: `v = cfg_weight * v_cond - (cfg_weight - 1) * v_uncond`
Value: Janus default = 5, JanusFlow default = 2
Trade-off: Higher CFG weight increases prompt adherence but reduces diversity and can introduce artifacts. Lower weight produces more diverse but less prompt-faithful images.
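
The two actions above can be sketched in plain Python, with lists standing in for tensors. This is an illustrative sketch, not the repository's code: the function names `cfg_logits`, `cfg_velocity`, and `default_cfg_weight` and the `DEFAULT_CFG_WEIGHT` table are hypothetical helpers introduced here, and only the formulas and default values come from this page.

```python
# Hypothetical helpers illustrating the two CFG formulas quoted above.
# Plain lists stand in for tensors so the sketch runs without torch.

# Defaults taken from this page: 5 for Janus/Janus-Pro, 2 for JanusFlow.
DEFAULT_CFG_WEIGHT = {"janus": 5.0, "janus-pro": 5.0, "janusflow": 2.0}


def default_cfg_weight(family):
    """Return the default CFG weight for a model family (case-insensitive)."""
    return DEFAULT_CFG_WEIGHT[family.lower()]


def cfg_logits(logit_cond, logit_uncond, cfg_weight=5.0):
    """Janus-style guidance: uncond + w * (cond - uncond), per logit."""
    return [u + cfg_weight * (c - u) for c, u in zip(logit_cond, logit_uncond)]


def cfg_velocity(v_cond, v_uncond, cfg_weight=2.0):
    """JanusFlow-style guidance: w * cond - (w - 1) * uncond, per component."""
    return [cfg_weight * c - (cfg_weight - 1.0) * u for c, u in zip(v_cond, v_uncond)]


if __name__ == "__main__":
    cond, uncond = [2.0, 0.0], [1.0, 1.0]
    print(cfg_logits(cond, uncond, default_cfg_weight("janus")))
    print(cfg_velocity(cond, uncond, default_cfg_weight("janusflow")))
```

With `cfg_weight=1` both helpers return the conditional prediction unchanged, which is why the range starts at 1: anything above 1 pushes the output further from the unconditional prediction.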

Reasoning

The two architectures use different default CFG weights because they apply guidance to different quantities:

Janus (autoregressive): CFG operates on discrete token logits. The formula `logit_uncond + cfg_weight * (logit_cond - logit_uncond)` starts from the unconditional prediction and adds the conditional-minus-unconditional difference scaled by `cfg_weight`. The guided logits then pass through a softmax, so a weight of 5 gives strong guidance while still yielding a valid sampling distribution.

JanusFlow (rectified flow): CFG operates on continuous velocity fields in the ODE solver. The formula `cfg_weight * v_cond - (cfg_weight - 1) * v_uncond` is the same extrapolation with the terms regrouped, but the guided velocity is added to the latent at every solver step, so over-scaling accumulates along the trajectory. This is consistent with the lower default of 2, which should be read as a rule of thumb rather than a measured optimum.
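
It may help to see how the two expressions relate. Expanding the Janus form gives the JanusFlow form; the symbols $w$, $c$, and $u$ are shorthand introduced here for `cfg_weight`, the conditional prediction, and the unconditional prediction.

```latex
\begin{aligned}
u + w\,(c - u) &= u + w\,c - w\,u \\
               &= w\,c - (w - 1)\,u
\end{aligned}
```

A given `cfg_weight` therefore describes the same amount of extrapolation in both pipelines. What differs is the quantity being extrapolated (token logits versus velocity) and the default weight each pipeline ships with.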

Critical historical note: A bug in `tokenizer_config.json` (fixed 2024-10-20) made CFG non-functional, resulting in poor generation quality. The unconditional rows are built by overwriting prompt tokens with `vl_chat_processor.pad_id`, so an incorrect pad_id would leave them improperly masked. Ensure the tokenizer config is up to date.

Code Evidence

Janus CFG formula from `generation_inference.py:87-88`:

logits = logit_uncond + cfg_weight * (logit_cond-logit_uncond)
probs = torch.softmax(logits / temperature, dim=-1)
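
To illustrate how the guided logits turn into a sampling distribution, here is a minimal pure-Python sketch of the same two steps (guidance, then temperature-scaled softmax). The helper names `softmax` and `guided_probs` and the example logits are assumptions for illustration; the real code operates on torch tensors.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max before exponentiating to avoid overflow
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def guided_probs(logit_cond, logit_uncond, cfg_weight=5.0, temperature=1.0):
    """Apply logit-space CFG, then convert the guided logits to probabilities."""
    logits = [u + cfg_weight * (c - u) for c, u in zip(logit_cond, logit_uncond)]
    return softmax(logits, temperature)


if __name__ == "__main__":
    # Made-up logits for a three-token vocabulary.
    cond, uncond = [2.0, 1.0, 0.0], [1.0, 1.0, 1.0]
    for weight in (1.0, 5.0):
        print(weight, guided_probs(cond, uncond, cfg_weight=weight))
```

In this toy case, raising `cfg_weight` from 1 to 5 concentrates more probability on the token the prompt favours, which matches the trade-off described above: stronger adherence, less diversity. Raising `temperature` has the opposite effect and flattens the distribution.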

JanusFlow CFG formula from `demo/app_janusflow.py:130`:

v = cfg_weight * v_cond - (cfg_weight-1.) * v_uncond
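
To show where this line sits in an ODE solve, the following sketch runs a fixed-step Euler integration with the guided velocity. Everything apart from the CFG formula is assumed for illustration: the function name `euler_sample_with_cfg`, the callables `velocity_cond` and `velocity_uncond` (stand-ins for two forward passes of a model), the time range 0 to 1, and the use of plain lists instead of tensors.

```python
def euler_sample_with_cfg(z0, velocity_cond, velocity_uncond, num_steps, cfg_weight=2.0):
    """Integrate z from t=0 to t=1 with fixed-step Euler, using a CFG-guided velocity.

    velocity_cond and velocity_uncond are hypothetical callables (z, t) -> list
    that stand in for conditional and unconditional model predictions.
    """
    z = list(z0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v_cond = velocity_cond(z, t)
        v_uncond = velocity_uncond(z, t)
        # Velocity-space CFG, as in the formula quoted above.
        v = [cfg_weight * c - (cfg_weight - 1.0) * u for c, u in zip(v_cond, v_uncond)]
        # Euler update: the guided velocity is applied at every step.
        z = [zi + dt * vi for zi, vi in zip(z, v)]
    return z


if __name__ == "__main__":
    # Toy constant velocity fields, chosen only to make the effect easy to read.
    cond = lambda z, t: [1.0, 0.0]
    uncond = lambda z, t: [0.0, 0.0]
    print(euler_sample_with_cfg([0.0, 0.0], cond, uncond, num_steps=4, cfg_weight=2.0))
```

Because the guided velocity enters every Euler update, any over-scaling from `cfg_weight` is accumulated across all steps of the trajectory.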

Default CFG weights from `generation_inference.py:61` and `demo/app_janusflow.py:72`:

# Janus default
cfg_weight: float = 5,
# JanusFlow default
cfg_weight: float = 2.0,

CFG token masking pattern (even=conditional, odd=unconditional) from `generation_inference.py:69-73`:

tokens = torch.zeros((parallel_size*2, len(input_ids)), dtype=torch.int).cuda()
for i in range(parallel_size*2):
    tokens[i, :] = input_ids
    if i % 2 != 0:
        tokens[i, 1:-1] = vl_chat_processor.pad_id
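
The interleaving above can be reproduced without a GPU to see exactly which positions get padded. This is a sketch with hypothetical names (`build_cfg_batch`, `split_cfg_rows`) and made-up token IDs; it mirrors the even/odd pattern in the excerpt but uses plain lists, and the even/odd split in `split_cfg_rows` is an assumption about how the rows are later separated.

```python
def build_cfg_batch(input_ids, pad_id, parallel_size):
    """Build 2 * parallel_size rows: even rows keep the prompt, odd rows are padded.

    In odd rows every position except the first and last is replaced by pad_id,
    matching the tokens[i, 1:-1] assignment in the excerpt above.
    """
    rows = []
    for i in range(parallel_size * 2):
        row = list(input_ids)
        if i % 2 != 0:
            row[1:-1] = [pad_id] * len(row[1:-1])
        rows.append(row)
    return rows


def split_cfg_rows(rows):
    """Split an interleaved batch into (conditional, unconditional) halves."""
    return rows[0::2], rows[1::2]


if __name__ == "__main__":
    # Made-up IDs: 100 and 200 stand in for the first and last prompt tokens.
    batch = build_cfg_batch([100, 7, 8, 9, 200], pad_id=0, parallel_size=2)
    cond_rows, uncond_rows = split_cfg_rows(batch)
    print(cond_rows)
    print(uncond_rows)
```

The sketch makes the dependency on `pad_id` visible: the unconditional rows differ from the conditional ones only through that value, so if it is wrong the "unconditional" branch no longer represents an empty prompt.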

README bug fix note from `README.md:71`:

2024.10.20: (1) Fix a bug in tokenizer_config.json. The previous version
caused classifier-free guidance to not function properly, resulting in
relatively poor visual generation quality.
