Heuristic:DataTalksClub Data engineering zoomcamp Kafka Consumer Poll Timeout
| Knowledge Sources | |
|---|---|
| Domains | Stream_Processing, Debugging |
| Last Updated | 2026-02-09 07:00 GMT |
07-streaming/python/json_example/consumer.py
Overview
Limit the Kafka consumer poll timeout to 1 second so that SIGINT (Ctrl+C) is handled promptly and the consumer does not become unresponsive to keyboard interrupts.
Description
When a Kafka consumer calls `poll()` with a long or infinite timeout, Python's handler for SIGINT (Ctrl+C) cannot run until the call returns, because the main thread is blocked inside the poll call. Limiting the poll timeout to 1.0 second returns control to Python often enough for `KeyboardInterrupt` to be raised, caught, and handled cleanly, which enables a graceful shutdown.
Usage
Use this heuristic when implementing Kafka consumer loops in Python. Apply a 1-second poll timeout in any `while True` consumer loop to keep it responsive to SIGINT, especially during development and debugging.
The Insight (Rule of Thumb)
- Action: Set the poll timeout to `1.0` second in the consumer loop: `self.consumer.poll(1.0)`.
- Value: 1.0 second (consistent across all consumer examples in the repository).
- Trade-off: `poll()` still returns as soon as records are available, so the short timeout does not delay message delivery; its cost is that an idle consumer wakes up roughly once per second. In exchange, Ctrl+C takes effect within about 1 second. Without this, stopping the process may require `kill -9`.
Reasoning
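Python runs signal handlers only in the main thread, and only once control is back in the interpreter. When the consumer's `poll()` is called with a long timeout (e.g., 60 seconds) and the client blocks in native code, the main thread stays inside that call, so a pending SIGINT is not acted on until the poll returns. With a 1-second timeout, the poll returns at least once per second, giving Python an opportunity to check for pending signals and raise `KeyboardInterrupt`. The unit of the timeout argument depends on the client: confluent-kafka's `Consumer.poll(timeout)` takes seconds, whereas kafka-python's `KafkaConsumer.poll(timeout_ms)` takes milliseconds, so confirm which client is in use before reusing the value `1.0`. This pattern is used consistently across the consumer examples in the repository (JSON, Avro, PySpark, Redpanda).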
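The effect of the timeout on shutdown responsiveness can be sketched with a small calculation. This is a simplified model, not measured behaviour: it assumes an idle consumer whose polls run back to back, and it assumes a pending SIGINT is only acted on when the current poll returns. The function name `shutdown_delay` and its parameters are hypothetical.

```python
def shutdown_delay(poll_timeout: float, interrupt_at: float) -> float:
    """Seconds between pressing Ctrl+C and the loop noticing it.

    Simplified model (an assumption): no messages arrive, polls run back
    to back from t=0, and the interrupt is only observed once the poll
    that is in progress at `interrupt_at` returns.
    """
    if poll_timeout <= 0:
        raise ValueError("poll_timeout must be positive")
    # How far into the current poll the interrupt arrived.
    elapsed_in_current_poll = interrupt_at % poll_timeout
    # The loop regains control when that poll times out.
    return poll_timeout - elapsed_in_current_poll


# Ctrl+C pressed 90.25 seconds after the loop started.
long_poll_delay = shutdown_delay(60.0, 90.25)   # 29.75 seconds
short_poll_delay = shutdown_delay(1.0, 90.25)   # 0.75 seconds
```

Under this model the delay is never more than one poll timeout, which is why a 1-second timeout keeps Ctrl+C feeling immediate while a 60-second timeout can leave the terminal apparently hung.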
Code Evidence
Poll timeout with SIGINT comment from `consumer.py:19-20`:
    # SIGINT can't be handled when polling, limit timeout to 1 second.
    message = self.consumer.poll(1.0)
Graceful shutdown pattern from `consumer.py:17-29`:
    while True:
        try:
            # SIGINT can't be handled when polling, limit timeout to 1 second.
            message = self.consumer.poll(1.0)
            if message is None or message == {}:
                continue
            for message_key, message_value in message.items():
                for msg_val in message_value:
                    print(msg_val.key, msg_val.value)
        except KeyboardInterrupt:
            break
    self.consumer.close()
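To try the shutdown pattern without a running broker, the loop can be exercised against a stand-in consumer. The following is a minimal sketch, not the repository's code: `run_consumer_loop` and `StubConsumer` are hypothetical names, and the stub simulates Ctrl+C by raising `KeyboardInterrupt` from `poll()` once its scripted batches are exhausted. It assumes, as in the excerpt above, that `poll()` returns a mapping from partition to a list of records.

```python
from typing import Any, Callable, Dict, List


def run_consumer_loop(
    consumer: Any,
    handle: Callable[[Any], None],
    poll_timeout: float = 1.0,
) -> int:
    """Poll until Ctrl+C, always close the consumer, return records handled."""
    handled = 0
    try:
        while True:
            # Short timeout so control returns to Python regularly.
            batches = consumer.poll(poll_timeout)
            if not batches:  # None or {} means nothing arrived in time
                continue
            for records in batches.values():
                for record in records:
                    handle(record)
                    handled += 1
    except KeyboardInterrupt:
        pass  # normal shutdown path
    finally:
        # Runs on Ctrl+C and on unexpected errors alike.
        consumer.close()
    return handled


class StubConsumer:
    """Hypothetical stand-in for a Kafka client, for offline experiments."""

    def __init__(self, scripted: List[Dict[str, List[str]]]) -> None:
        self._scripted = list(scripted)
        self.closed = False

    def poll(self, timeout: float) -> Dict[str, List[str]]:
        if not self._scripted:
            raise KeyboardInterrupt  # simulate the user pressing Ctrl+C
        return self._scripted.pop(0)

    def close(self) -> None:
        self.closed = True


stub = StubConsumer([{}, {"tp0": ["a", "b"]}, {"tp1": ["c"]}])
seen: List[str] = []
count = run_consumer_loop(stub, seen.append)
```

Unlike the excerpt, this variant wraps the whole loop in `try`/`finally`, so `close()` also runs if message handling raises an unrelated exception. That is a design choice of the sketch rather than something the repository does.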