Heuristic:Groq Groq python Timeout Configuration
| Knowledge Sources | |
|---|---|
| Domains | API_Client, Reliability |
| Last Updated | 2026-02-15 17:00 GMT |
Overview
The SDK defaults to a 60-second timeout with a 5-second connect timeout, and supports granular overrides at the client level or per request.
Description
The Groq SDK sets a default timeout of 60 seconds, with a separate 5-second limit for establishing the connection. Because httpx applies the 60-second value to each of the read, write, and pool phases rather than as one end-to-end deadline, a single request can in principle run longer than 60 seconds overall. These defaults differ from httpx's built-in 5-second default. Timeouts can be configured at the client level (affecting all requests) or overridden per request using .with_options(timeout=...). For granular control, pass an httpx.Timeout with separate read, write, connect, and pool values. Timed-out requests are automatically retried, up to max_retries times (2 by default).
Usage
Apply this heuristic when tuning client behavior for different workloads. Increase timeout for long-running generation requests (large max_tokens, complex tool use). Decrease timeout for latency-sensitive applications. Use httpx.Timeout(None) to disable timeouts entirely for batch or long-generation workloads.
The Insight (Rule of Thumb)
- Action: Configure timeout at client initialization or per-request.
- Value: Default is 60s, with a 5s connect timeout. Use httpx.Timeout(timeout=300, connect=10.0) for long generations.
- Trade-off: Higher timeouts tolerate slow responses but delay failure detection. Lower timeouts fail fast but may abort valid slow requests.
- Custom httpx client warning: Passing a raw httpx.Client as http_client= uses httpx defaults (5s timeout), not SDK defaults (60s). Use DefaultHttpxClient to preserve SDK defaults.
- Per-request override: Use client.with_options(timeout=5.0) without changing the default for other requests.
- Connection pool: Default limits are 100 max connections, 20 max keep-alive connections.
Reasoning
LLM inference API calls are inherently slower than traditional REST APIs because of token generation time. A 5-second default (httpx's default) would frequently time out on legitimate requests, especially for longer completions. The 60-second default balances reliability with reasonable failure detection. The 5-second connect timeout catches network issues quickly without waiting for the full request timeout. Connection pooling (100 max, 20 keep-alive) supports concurrent request patterns while preventing resource exhaustion.
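Because timed-out requests are retried, the delay before a caller sees a failure can be several times the configured timeout. The helper below is a rough, hypothetical estimate: it assumes every attempt is cut off after exactly one timeout period, and it models retry backoff as a fixed pause, which is not the SDK's actual backoff schedule.

```python
def estimated_wait_seconds(
    timeout_s: float, max_retries: int, backoff_s: float = 0.0
) -> float:
    """Estimate the wait before failure if every attempt times out.

    backoff_s is a hypothetical fixed pause between attempts; the SDK's
    real backoff behaviour is not modelled here.
    """
    attempts = max_retries + 1  # the initial try plus each retry
    return attempts * timeout_s + max_retries * backoff_s


# SDK defaults: 60 s timeout, 2 retries, backoff ignored.
print(estimated_wait_seconds(60, 2))

# Long-generation setting from the rule of thumb: 300 s timeout, 2 retries.
print(estimated_wait_seconds(300, 2))

# Fail-fast setting: 5 s timeout with retries disabled.
print(estimated_wait_seconds(5, 0))
```

Under these assumptions the default configuration can take about three minutes to report a failure. For latency-sensitive paths it may therefore make sense to lower max_retries along with the timeout.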
Code Evidence
Default timeout and connection limits from src/groq/_constants.py:8-11:
# default timeout is 1 minute
DEFAULT_TIMEOUT = httpx.Timeout(timeout=60, connect=5.0)
DEFAULT_MAX_RETRIES = 2
DEFAULT_CONNECTION_LIMITS = httpx.Limits(max_connections=100, max_keepalive_connections=20)
DefaultHttpxClient preserving SDK defaults from src/groq/_base_client.py:787-804:
class _DefaultHttpxClient(httpx.Client):
    def __init__(self, **kwargs: Any) -> None:
        kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
        kwargs.setdefault("limits", DEFAULT_CONNECTION_LIMITS)
        kwargs.setdefault("follow_redirects", True)
        super().__init__(**kwargs)
# At runtime:
# DefaultHttpxClient = _DefaultHttpxClient
# This is useful because overriding the `http_client` with your own instance of
# `httpx.Client` will result in httpx's defaults being used, not ours.