Heuristic: Langgenius Dify Celery Queue Separation
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Infrastructure |
| Last Updated | 2026-02-12 08:00 GMT |
Overview
Task queue architecture tip: separate Celery queues by task priority and type (dataset, workflow, mail, etc.) to prevent low-priority tasks from starving critical operations.
Description
Dify's Celery worker processes 20+ distinct task queues covering API token management, dataset indexing, pipeline processing, mail delivery, workflow execution, plugin operations, and more. The Community edition (self-hosted) uses a single set of queues, while the Cloud edition splits workflow queues further (professional, team, sandbox tiers) for multi-tenant isolation.
For Kubernetes deployments, dedicated workers can be assigned to specific queues using the `CELERY_WORKER_QUEUES` environment variable, enabling horizontal scaling of bottleneck queues independently.
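As a sketch of that pattern, assuming a Kubernetes-style setup where each worker Deployment exports its own queue list before the entrypoint runs (the deployment roles and queue choices below are illustrative, not Dify defaults):

```shell
# Illustrative: two dedicated worker "deployments", each pinned to its own
# queues via CELERY_WORKER_QUEUES. Names and queue picks are examples only.
DATASET_WORKER_QUEUES="dataset,priority_dataset"
MAIL_WORKER_QUEUES="mail"

# In the dataset-worker container:
export CELERY_WORKER_QUEUES="${DATASET_WORKER_QUEUES}"
echo "dataset worker consumes: ${CELERY_WORKER_QUEUES}"

# In the mail-worker container:
export CELERY_WORKER_QUEUES="${MAIL_WORKER_QUEUES}"
echo "mail worker consumes: ${CELERY_WORKER_QUEUES}"
```

Because each container sets the variable independently, a burst on `dataset` can be absorbed by scaling only the dataset-worker Deployment.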
Usage
Apply this heuristic when deploying Dify at scale, diagnosing task processing delays, or setting up Kubernetes-based deployments where different task types need independent scaling.
The Insight (Rule of Thumb)
- Action: Use `CELERY_WORKER_QUEUES` to assign dedicated workers to specific queues in Kubernetes deployments. Use `CELERY_AUTO_SCALE=true` with `CELERY_MAX_WORKERS` and `CELERY_MIN_WORKERS` for dynamic scaling.
- Value: Community edition default queues: `api_token,dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention,workflow_based_app_execution`
- Trade-off: More workers consume more memory. Each dedicated queue worker needs at least 1 process. Over-splitting reduces throughput for bursty workloads.
Key configuration variables:
- `CELERY_WORKER_QUEUES`: Comma-separated list of queues (overrides default)
- `CELERY_WORKER_CONCURRENCY`: Number of worker processes per container
- `CELERY_WORKER_POOL`: Pool implementation (gevent, prefork, solo)
- `CELERY_PREFETCH_MULTIPLIER`: Tasks prefetched per worker (default: 1)
- `MAX_TASKS_PER_CHILD`: Tasks before worker restart (default: 50, prevents memory leaks)
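These variables map onto standard Celery worker CLI flags (`-Q`, `-c`, `-P`, `--max-tasks-per-child`). A minimal sketch of how an entrypoint might compose them into a worker command; the module path `app.celery` and the fallback defaults shown are assumptions for illustration, not taken from Dify's entrypoint:

```shell
# Illustrative only: compose a Celery worker command from the variables above.
# The app module path ("app.celery") is an assumption, not verified against Dify.
CELERY_WORKER_QUEUES="${CELERY_WORKER_QUEUES:-dataset,mail}"
CELERY_WORKER_CONCURRENCY="${CELERY_WORKER_CONCURRENCY:-4}"
CELERY_WORKER_POOL="${CELERY_WORKER_POOL:-gevent}"
MAX_TASKS_PER_CHILD="${MAX_TASKS_PER_CHILD:-50}"

WORKER_CMD="celery -A app.celery worker \
  -Q ${CELERY_WORKER_QUEUES} \
  -c ${CELERY_WORKER_CONCURRENCY} \
  -P ${CELERY_WORKER_POOL} \
  --max-tasks-per-child ${MAX_TASKS_PER_CHILD}"
echo "${WORKER_CMD}"
```

The `${VAR:-default}` expansions mirror how entrypoint scripts typically let the container environment override baked-in defaults.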
Reasoning
Without queue separation, a burst of dataset indexing tasks (which are CPU and I/O intensive) could block mail delivery or workflow execution tasks from being processed. Priority queues (`priority_dataset`, `priority_pipeline`) ensure user-initiated operations take precedence over background batch operations.
The `MAX_TASKS_PER_CHILD=50` setting restarts worker processes after 50 tasks to prevent memory leaks from long-running ML operations (embedding generation, document parsing).
Code evidence from `api/docker/entrypoint.sh:34-45`:

```bash
# Configure queues based on edition if not explicitly set
if [[ -z "${CELERY_QUEUES}" ]]; then
  if [[ "${EDITION}" == "CLOUD" ]]; then
    DEFAULT_QUEUES="api_token,dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow_professional,workflow_team,workflow_sandbox,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention,workflow_based_app_execution"
  else
    DEFAULT_QUEUES="api_token,dataset,priority_dataset,priority_pipeline,pipeline,mail,ops_trace,app_deletion,plugin,workflow_storage,conversation,workflow,schedule_poller,schedule_executor,triggered_workflow_dispatcher,trigger_refresh_executor,retention,workflow_based_app_execution"
  fi
fi
```
Kubernetes-specific queue override from `api/docker/entrypoint.sh:47-56`:

```bash
# Support for Kubernetes deployment with specific queue workers
# Environment variables that can be set:
# - CELERY_WORKER_QUEUES: Comma-separated list of queues (overrides CELERY_QUEUES)
# - CELERY_WORKER_CONCURRENCY: Number of worker processes (overrides CELERY_WORKER_AMOUNT)
# - CELERY_WORKER_POOL: Pool implementation (overrides CELERY_WORKER_CLASS)
if [[ -n "${CELERY_WORKER_QUEUES}" ]]; then
  DEFAULT_QUEUES="${CELERY_WORKER_QUEUES}"
fi
```
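To see the override precedence end to end, here is a simplified, POSIX-portable replay of the entrypoint logic (the edition queue lists are abbreviated placeholders, not the real defaults):

```shell
# Simplified replay: CELERY_WORKER_QUEUES, when set, wins over the
# edition-based default. Queue lists are abbreviated for illustration.
unset CELERY_QUEUES EDITION
if [ -z "${CELERY_QUEUES}" ]; then
  if [ "${EDITION}" = "CLOUD" ]; then
    DEFAULT_QUEUES="workflow_professional,workflow_team,workflow_sandbox"  # abbreviated
  else
    DEFAULT_QUEUES="api_token,dataset,mail,workflow"  # abbreviated Community list
  fi
fi

# Kubernetes-style override takes precedence:
CELERY_WORKER_QUEUES="workflow,workflow_storage"
if [ -n "${CELERY_WORKER_QUEUES}" ]; then
  DEFAULT_QUEUES="${CELERY_WORKER_QUEUES}"
fi
echo "${DEFAULT_QUEUES}"  # workflow,workflow_storage
```

The order matters: the edition branch only fills in a default, and the worker-specific override runs last, so a per-deployment queue list always wins.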