Heuristic:Fede1024 Rust rdkafka Partitioner Must Not Block
| Knowledge Sources | |
|---|---|
| Domains | Messaging, Debugging |
| Last Updated | 2026-02-07 19:30 GMT |
Overview
Custom partitioner callbacks can run on any thread, including librdkafka's internal threads, and must not block or execute for prolonged periods. A blocking partitioner can stall the entire producer and, in the worst case, deadlock it.
Description
When a custom `Partitioner` trait implementation is registered via `ProducerContext::get_custom_partitioner()`, librdkafka invokes it to determine the target partition for each message. The callback may run on any thread at any time, including the C library's internal threads, so any blocking operation (network I/O, mutex contention, heavy computation) can stall producer operations such as polling and delivery callbacks. Additionally, `sticky.partitioning.linger.ms` must be set to `0` for the custom partitioner to be called for messages with null keys.
Usage
Use this heuristic when implementing the `Partitioner` trait. The constraint is safety-critical and is not obvious from the Rust API surface alone. If your partitioner needs external data, pre-compute and cache it rather than fetching it on every call.
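The pre-compute-and-cache advice can be sketched as follows. This is a minimal illustration using only the standard library. `TenantRoutes`, `precompute`, and `lookup` are hypothetical names, and the assumption is that routing data is loaded once, before the producer is created, and then only read from inside the callback.

```rust
use std::collections::HashMap;

/// Hypothetical lookup table, built once before the producer is created.
pub struct TenantRoutes {
    routes: HashMap<Vec<u8>, i32>,
}

impl TenantRoutes {
    /// Do the expensive work here (file read, service call, ...),
    /// outside the partitioner callback.
    pub fn precompute(pairs: &[(&str, i32)]) -> Self {
        let routes = pairs
            .iter()
            .map(|(key, partition)| (key.as_bytes().to_vec(), *partition))
            .collect();
        TenantRoutes { routes }
    }

    /// In-memory read only: no I/O and no locks.
    /// Returns None for unknown keys or out-of-range partitions.
    pub fn lookup(&self, key: &[u8], partition_cnt: i32) -> Option<i32> {
        self.routes
            .get(key)
            .copied()
            .filter(|p| (0..partition_cnt).contains(p))
    }
}
```

A partitioner would hold a `TenantRoutes` by value (or behind an `Arc`) and call `lookup` from `partition()`, falling back to a hash of the key when it returns `None`.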
The Insight (Rule of Thumb)
- Action: Implement `Partitioner::partition()` as a pure, non-blocking function. Pre-compute any needed data.
- Value: Set `sticky.partitioning.linger.ms=0` if the custom partitioner must also handle messages with null keys.
- Trade-off: Custom partitioning adds per-message overhead; use only when the default hash-based or sticky partitioner is insufficient.
- Constraint: The method may be called multiple times for the same message/key, and from any thread at any time.
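The rules above can be illustrated with a small, deterministic partitioner. To keep the sketch self-contained it uses a local stand-in trait, `PartitionerSketch`, whose method signature is modeled on rust-rdkafka's `Partitioner::partition`; check the crate documentation for the exact signature in your version. `Fnv1aPartitioner` and `fnv1a` are hypothetical names, and FNV-1a is chosen only because it is short and deterministic, not because librdkafka uses it.

```rust
/// Value of librdkafka's RD_KAFKA_PARTITION_UA ("unassigned").
pub const PARTITION_UA: i32 = -1;

/// Local stand-in (assumption) modeled on rust-rdkafka's `Partitioner` trait.
pub trait PartitionerSketch {
    fn partition(
        &self,
        topic_name: &str,
        key: Option<&[u8]>,
        partition_cnt: i32,
        is_partition_available: impl Fn(i32) -> bool,
    ) -> i32;
}

/// Stateless partitioner: no I/O, no locks, no mutable state.
pub struct Fnv1aPartitioner;

/// 32-bit FNV-1a hash of the key bytes.
fn fnv1a(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for b in bytes {
        hash ^= *b as u32;
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

impl PartitionerSketch for Fnv1aPartitioner {
    fn partition(
        &self,
        _topic_name: &str,
        key: Option<&[u8]>,
        partition_cnt: i32,
        is_partition_available: impl Fn(i32) -> bool,
    ) -> i32 {
        if partition_cnt <= 0 {
            return PARTITION_UA;
        }
        // Design choice for this sketch: a null key hashes like an empty key.
        let key = key.unwrap_or(&[]);
        let candidate = (fnv1a(key) % partition_cnt as u32) as i32;
        if is_partition_available(candidate) {
            candidate
        } else {
            PARTITION_UA
        }
    }
}
```

Because the result depends only on the arguments, repeated calls for the same key return the same partition, regardless of which thread makes the call.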
Reasoning
The trait documentation only guarantees that the partitioner may be called "in any thread at any time", so it can run on the thread that calls `send()` or on one of librdkafka's internal threads. In either case the invoking thread waits until the callback returns. If the callback blocks on I/O or a lock, the effect cascades: delivery callbacks stop being processed, the producer queue fills, and eventually `QueueFull` errors occur. Because the callback may be called multiple times for the same message, it must also be idempotent: repeated calls with the same inputs should return the same partition and have no side effects.
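The queue-filling step of that cascade can be shown with a toy model. This is not librdkafka's actual queueing. It is a standard-library bounded channel that nothing drains, standing in for a thread stuck inside a blocking callback. `accepted_before_full` is a hypothetical helper name.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Toy model: count how many messages a bounded queue accepts and rejects
/// while its consumer never drains it.
pub fn accepted_before_full(capacity: usize, attempts: usize) -> (usize, usize) {
    // `_rx` is kept alive but never read, like a stalled consumer thread.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    let mut rejected = 0;
    for i in 0..attempts {
        match tx.try_send(i) {
            Ok(()) => accepted += 1,
            // Analogous to the producer reporting a full queue.
            Err(TrySendError::Full(_)) => rejected += 1,
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    (accepted, rejected)
}
```

With a capacity of 3 and 5 attempts, the first three sends succeed and the remaining two are rejected, which mirrors how a stalled producer starts refusing new messages once its queue is full.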
Code Evidence
Partitioner trait constraints from `src/producer/mod.rs:220-231`:
/// It may be called in any thread at any time,
/// It may be called multiple times for the same message/key.
/// MUST NOT block or execute for prolonged periods of time.
/// MUST return a value between 0 and partition_cnt-1, or the
/// special RD_KAFKA_PARTITION_UA value if partitioning could not be performed.
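The return-value contract quoted above can be enforced with a small guard. The following sketch assumes the availability check is passed in as a closure, as in the `Partitioner` callback. `first_available_from` is a hypothetical helper. It performs at most `partition_cnt` availability checks, so the work per call stays bounded.

```rust
/// Value of librdkafka's RD_KAFKA_PARTITION_UA ("unassigned").
pub const PARTITION_UA: i32 = -1;

/// Starting at `preferred`, return the first available partition in
/// 0..partition_cnt, checking each partition at most once.
/// Returns PARTITION_UA if no partition is available.
pub fn first_available_from(
    preferred: i32,
    partition_cnt: i32,
    is_partition_available: impl Fn(i32) -> bool,
) -> i32 {
    if partition_cnt <= 0 {
        return PARTITION_UA;
    }
    // Normalize out-of-range or negative preferences into 0..partition_cnt.
    let start = preferred.rem_euclid(partition_cnt) as i64;
    let count = partition_cnt as i64;
    for offset in 0..count {
        let candidate = ((start + offset) % count) as i32;
        if is_partition_available(candidate) {
            return candidate;
        }
    }
    PARTITION_UA
}
```

The result is always either a valid partition index or `PARTITION_UA`, whatever value the caller passes as `preferred`.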
Sticky partitioning requirement from `src/producer/mod.rs:207`:
/// sticky.partitioning.linger.ms must be 0 to run custom partitioner for messages with null key.
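The sticky partitioning requirement amounts to one extra producer setting. The sketch below keeps the settings as plain key/value pairs so that it runs without the rdkafka crate. `producer_settings` is a hypothetical helper, and the broker address passed to it is a placeholder.

```rust
/// Producer settings as key/value pairs (hypothetical helper).
pub fn producer_settings(bootstrap_servers: &str) -> Vec<(String, String)> {
    vec![
        ("bootstrap.servers".to_string(), bootstrap_servers.to_string()),
        // Required so the custom partitioner also runs for null-key messages.
        ("sticky.partitioning.linger.ms".to_string(), "0".to_string()),
    ]
}
```

With rust-rdkafka, pairs like these would typically be applied one by one through `ClientConfig::set` before the producer is created.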