Heuristic:Langgenius Dify Gevent Monkey Patching Order
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Debugging |
| Last Updated | 2026-02-12 08:00 GMT |
Overview
Critical startup ordering rule: gRPC and psycopg2 must be patched for gevent only after gevent has patched the standard library. Under Gunicorn, do this from a `GeventDidPatchBuiltinModulesEvent` subscriber, not from the `post_fork` hook.
Description
When running the Dify backend under Gunicorn with gevent workers, third-party libraries (gRPC and psycopg2) require compatibility patching to work with gevent's cooperative multitasking. However, the order in which this patching occurs is critical. The gRPC gevent initialization (`grpc_gevent.init_gevent()`) must be called after gevent has finished patching the Python standard library. Calling it earlier (e.g., in `post_fork` or at module import time) causes deadlocks and other difficult-to-diagnose issues.
For Celery workers, the patching must happen at the very top of the entrypoint script, before any Flask or Celery imports, because the Celery worker does not use Gunicorn's lifecycle hooks.
Usage
Apply this heuristic whenever modifying the Dify backend startup sequence, adding new gevent-incompatible libraries, or debugging deadlocks in the API server or Celery workers. This is especially relevant when upgrading Gunicorn, gevent, gRPC, or psycopg2 versions.
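When debugging a suspected ordering problem, it can help to first confirm which stdlib modules gevent has actually patched at a given point in startup. The following is a minimal sketch, not part of the Dify codebase: the helper name `gevent_patch_status` and the default module list are assumptions chosen for illustration, and the function returns `None` when gevent is not installed.

```python
import importlib.util


def gevent_patch_status(modules=("socket", "threading", "ssl", "time")):
    """Return {module_name: patched?} for the given stdlib modules.

    Returns None when gevent is not installed, so the helper can be
    dropped into any environment without failing.
    """
    if importlib.util.find_spec("gevent") is None:
        return None
    # Imported lazily so the function still works without gevent.
    from gevent import monkey

    return {name: monkey.is_module_patched(name) for name in modules}
```

Logging the result at the point where `grpc_gevent.init_gevent()` is about to run shows whether the stdlib is already patched; if `threading` or `socket` is reported as unpatched there, the call is happening too early.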
The Insight (Rule of Thumb)
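- Action (Gunicorn): Subscribe to `gevent.events.GeventDidPatchBuiltinModulesEvent` in `gunicorn.conf.py`. In the subscriber callback, call `grpc_gevent.init_gevent()`, then `pscycogreen_gevent.patch_psycopg()`. (`pscycogreen_gevent` is the import alias the Dify source uses for `psycogreen.gevent`; the spelling is kept here so the names match the code.)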
- Action (Celery): Call `grpc_gevent.init_gevent()` and `pscycogreen_gevent.patch_psycopg()` at the top of `celery_entrypoint.py`, before importing `app` or `celery`.
- Value: Binary. The order is either correct (the worker runs normally) or incorrect (the worker deadlocks).
- Trade-off: None. Correct ordering has no performance cost. Incorrect ordering causes silent deadlocks.
DO NOT:
- Use the `post_fork` hook: it runs before gevent monkey-patching.
- Use the `post_init` hook: it also runs before monkey-patching, so it has the same problem.
- Import application modules at the top of `gunicorn.conf.py`: this can interfere with gevent patching.
Reasoning
Gunicorn's worker lifecycle calls `post_fork` and `post_init` before applying gevent monkey patches (as of Gunicorn 23.0.0, see `gunicorn/arbiter.py:605-609`). The `grpc_gevent.init_gevent()` function replaces gRPC's threading primitives with gevent-compatible equivalents, but this replacement must happen after `gevent.monkey.patch_all()` has already patched `threading`, `socket`, etc. If gRPC patches first, it captures references to the unpatched stdlib modules, leading to deadlocks when gevent later replaces them.
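The "captured reference" failure mode can be reproduced with a toy model. This is only an illustration of the mechanism described above, not how gRPC is implemented internally: `fake_stdlib`, `FakeLibrary`, and `fake_patch_all` are hypothetical stand-ins for a stdlib module, a library that binds stdlib primitives during initialization, and `gevent.monkey.patch_all()`.

```python
import types

# Hypothetical stand-in for a stdlib module such as `threading`.
fake_stdlib = types.ModuleType("fake_stdlib")
fake_stdlib.wait = lambda: "blocking"


class FakeLibrary:
    """Stand-in for a library that binds stdlib primitives when initialised."""

    def init(self):
        # The attribute is looked up once, here, and the result is kept.
        self._wait = fake_stdlib.wait

    def call(self):
        return self._wait()


def fake_patch_all():
    """Stand-in for monkey-patching: replace the module attribute."""
    fake_stdlib.wait = lambda: "cooperative"


def run(init_before_patch):
    """Initialise the library before or after patching and report what it uses."""
    fake_stdlib.wait = lambda: "blocking"  # reset to the unpatched state
    lib = FakeLibrary()
    if init_before_patch:
        lib.init()
        fake_patch_all()
    else:
        fake_patch_all()
        lib.init()
    return lib.call()
```

With `init_before_patch=True` the library keeps using the blocking implementation even though the module has since been patched; with `init_before_patch=False` it picks up the cooperative one. The second ordering is the one the rule requires.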
The solution uses gevent's event system: `GeventDidPatchBuiltinModulesEvent` fires at exactly the right moment — after stdlib patching but before application code loads. This was discovered through debugging Dify issue #26689.
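The ordering argument can be sketched as a small simulation. Everything here is a simplified model under stated assumptions, not Gunicorn or gevent code: `FakeWorker`, `DidPatchBuiltinModules`, and `OtherEvent` are hypothetical names, and the real lifecycle has more steps. The model only captures the three facts the reasoning depends on: `post_fork` runs first, subscribers receive more than one kind of event, and the application loads last.

```python
from dataclasses import dataclass, field


class DidPatchBuiltinModules:
    """Stand-in for gevent.events.GeventDidPatchBuiltinModulesEvent."""


class OtherEvent:
    """Stand-in for any other event delivered to the same subscribers."""


@dataclass
class FakeWorker:
    subscribers: list = field(default_factory=list)
    log: list = field(default_factory=list)
    stdlib_patched: bool = False

    def notify(self, event):
        for subscriber in list(self.subscribers):
            subscriber(event)

    def start(self, post_fork=None):
        if post_fork is not None:
            post_fork(self)  # runs while the stdlib is still unpatched
        self.notify(OtherEvent())  # unrelated events also reach subscribers
        self.stdlib_patched = True  # models gevent patching the stdlib
        self.notify(DidPatchBuiltinModules())
        self.log.append("load app")


def simulate():
    worker = FakeWorker()

    def post_fork(w):
        w.log.append(f"post_fork (patched={w.stdlib_patched})")

    def post_patch(event):
        # Same filter as the real subscriber: ignore every other event type.
        if not isinstance(event, DidPatchBuiltinModules):
            return
        worker.log.append(f"post_patch (patched={worker.stdlib_patched})")

    worker.subscribers.append(post_patch)
    worker.start(post_fork=post_fork)
    return worker.log
```

Calling `simulate()` returns `["post_fork (patched=False)", "post_patch (patched=True)", "load app"]`: the hook sees an unpatched stdlib, while the event subscriber runs after patching and before the application loads. The `isinstance` check matters because the subscriber is invoked for every event, not only the one of interest.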
Code evidence from `api/gunicorn.conf.py:5-27`:
# WARNING: This module is loaded very early in the Gunicorn worker lifecycle,
# before gevent's monkey-patching is applied. Importing modules at the top level
# here can interfere with gevent's ability to properly patch the standard library,
# potentially causing subtle and difficult-to-diagnose bugs.
#
# For further context, see: https://github.com/langgenius/dify/issues/26689
#
# Note: The `post_fork` hook is also executed before monkey-patching,
# so moving imports there does not resolve this issue.
# NOTE(QuantumGhost): here we cannot use post_fork to patch gRPC, as
# grpc_gevent.init_gevent must be called after patching stdlib.
# Gunicorn calls `post_init` before applying monkey patch.
# Use `post_init` to setup gRPC gevent support would cause deadlock and
# some other weird issues.
Correct Gunicorn hook from `api/gunicorn.conf.py:30-45`:
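# Imports this excerpt relies on (aliases match the names used below).
import psycogreen.gevent as pscycogreen_gevent
from gevent import events as gevent_events
from grpc.experimental import gevent as grpc_gevent

def post_patch(event):
    # Act only once gevent has patched the builtin (stdlib) modules.
    if not isinstance(event, gevent_events.GeventDidPatchBuiltinModulesEvent):
        return
    grpc_gevent.init_gevent()
    print("gRPC patched with gevent.", flush=True)
    pscycogreen_gevent.patch_psycopg()
    print("psycopg2 patched with gevent.", flush=True)

gevent_events.subscribers.append(post_patch)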
Celery early patching from `api/celery_entrypoint.py:1-11`:
import psycogreen.gevent as pscycogreen_gevent
from grpc.experimental import gevent as grpc_gevent
# grpc gevent
grpc_gevent.init_gevent()
print("gRPC patched with gevent.", flush=True)
pscycogreen_gevent.patch_psycopg()
print("psycopg2 patched with gevent.", flush=True)
from app import app, celery # MUST come after patching
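Because the Celery ordering is enforced only by the position of the import line, a later refactor can break it silently. One possible safeguard is a fail-fast check placed just before the patching calls. This is a sketch, not something present in the Dify source: `already_imported` and `require_not_imported` are hypothetical helpers, and the list of module names to check (for example `["app"]`) is an assumption that must be adapted to the actual entrypoint.

```python
import sys


def already_imported(names, modules=None):
    """Return the subset of `names` already present in the module table."""
    loaded = sys.modules if modules is None else modules
    return [name for name in names if name in loaded]


def require_not_imported(names, modules=None):
    """Raise if any of `names` was imported before the patching step ran."""
    early = already_imported(names, modules)
    if early:
        raise RuntimeError(
            f"imported before gevent patching: {', '.join(early)}"
        )
```

In an entrypoint, `require_not_imported(["app"])` would go immediately before `grpc_gevent.init_gevent()`. The optional `modules` argument exists only so the check can be exercised in isolation without touching the real `sys.modules`.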