Principle: PrefectHQ Prefect Global Concurrency Limits
| Metadata | |
|---|---|
| Source | Repo: Prefect |
| Source | Doc: Prefect GCL |
| Domains | Concurrency, Orchestration |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A server-coordinated mechanism for limiting concurrent access to shared resources across distributed workers and flow runs using named lease-based slots.
Description
Global Concurrency Limits (GCLs) address contention when multiple flow runs compete for limited resources (GPUs, software licenses, database connections) on the same worker machine. GCLs are coordinated by the Prefect server using a lease-based model: each worker creates named limits (e.g., `gpu:worker-1`) with a maximum slot count, tasks acquire slots with the `concurrency()` context manager, and slots are released when the context exits.
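The acquire-use-release lifecycle above can be sketched as follows. This is a minimal, hedged example: it assumes a Prefect 2.14+/3.x install where `concurrency` is importable from `prefect.concurrency.sync`, and the limit name and job function are hypothetical illustrations, not part of Prefect's API.

```python
import socket

# Hypothetical per-worker limit name, e.g. "gpu:worker-1" (the naming scheme is ours).
GPU_LIMIT = f"gpu:{socket.gethostname()}"

def run_inference(job_id: str) -> str:
    """Hold one slot on this worker's GPU limit for the duration of the call."""
    # Deferred import so the sketch reads without a Prefect install present.
    from prefect.concurrency.sync import concurrency  # assumes Prefect 2.14+/3.x

    with concurrency(GPU_LIMIT, occupy=1):  # blocks until a slot is leased
        return f"{job_id} ran under {GPU_LIMIT}"  # slot released on context exit
```

Depending on the Prefect version, the named limit may need to exist on the server before slots can be acquired.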
Because GCLs are server-coordinated, they also work across separate subprocess flow runs on the same worker. Key design points:
- Worker-specific names — each machine gets its own independent limits (e.g., `gpu:worker-1`)
- Selective application — only resource-bound tasks acquire slots; other tasks run freely
- Lease-based — slots are backed by expiring leases, so they are released automatically if a holder times out or crashes
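A per-worker limit could be created ahead of time from the CLI. This assumes the Prefect 3 `prefect gcl` command group; the limit name and slot count are illustrative:

```shell
# Create a one-slot limit scoped to this machine's hostname
prefect gcl create "gpu:$(hostname)" --limit 1

# Check its state later
prefect gcl inspect "gpu:$(hostname)"
```

With `--limit 1` the limit behaves as a mutex for this worker; a larger count allows that many concurrent holders.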
Usage
Use GCLs when multiple concurrent flow runs on a single worker need to share a limited local resource (GPU, software license, database connection pool). Name each limit with the worker's identity (e.g., `gpu:<hostname>`) so the limit is scoped to that machine rather than shared across the fleet.
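Selective application means only the resource-bound step holds a slot. A hedged sketch under the same assumptions as before (`prefect.concurrency.sync.concurrency` available; the step names, limit name, and data are hypothetical):

```python
import socket

# Hypothetical limit for a license-constrained solver on this worker.
LICENSE_LIMIT = f"solver-license:{socket.gethostname()}"

def preprocess(data: list[int]) -> list[int]:
    # CPU-bound step: runs freely, no slot acquired.
    return [x * 2 for x in data]

def solve(data: list[int]) -> int:
    # License-bound step: holds one of the limited slots only while solving.
    from prefect.concurrency.sync import concurrency  # deferred: sketch only
    with concurrency(LICENSE_LIMIT, occupy=1):
        return sum(data)
```

Keeping `preprocess` outside the limit means queued flow runs can still make progress on unconstrained work while waiting for a solver slot.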
Theoretical Basis
GCLs implement the semaphore pattern from concurrency theory, distributed across processes via a central server. A semaphore caps the number of concurrent accessors to a resource. The naming convention (`resource:worker_id`) implements per-worker scoping, and the acquire-use-release lifecycle is enforced via a Python context manager.
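The semaphore analogy can be made concrete in pure Python: here a `threading.BoundedSemaphore` plays the role the server-side limit plays for distributed workers, with local threads standing in for flow runs. This is an analogy only, not how Prefect implements GCLs internally.

```python
import threading
import time

LIMIT = 2                                   # analogous to a GCL's slot count
sem = threading.BoundedSemaphore(LIMIT)     # local stand-in for the server-side limit
lock = threading.Lock()
current = 0
peak = 0

def job() -> None:
    """A stand-in for a resource-bound task: acquire, use, release."""
    global current, peak
    with sem:  # acquire a slot; released automatically on exit
        with lock:
            current += 1
            peak = max(peak, current)
        time.sleep(0.05)  # simulate holding the resource
        with lock:
            current -= 1

threads = [threading.Thread(target=job) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# peak can never exceed LIMIT, no matter how many jobs contend.
```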