Heuristic:Groq Groq python Timeout Configuration

From Leeroopedia
Knowledge Sources
Domains API_Client, Reliability
Last Updated 2026-02-15 17:00 GMT

Overview

The Groq Python SDK defaults to a 60-second timeout with a 5-second connect timeout, and supports granular per-phase and per-request overrides.

Description

The Groq SDK's default timeout is httpx.Timeout(timeout=60, connect=5.0): 60 seconds for each of the read, write, and pool phases, and 5 seconds for establishing the connection. In httpx these limits apply per phase, so 60 seconds is not a strict cap on the wall-clock duration of a whole request. The defaults differ from httpx's built-in 5-second default. Timeouts can be configured at the client level (affecting all requests) or overridden per request using .with_options(timeout=...). For granular control, pass an httpx.Timeout with separate read, write, connect, and pool values. Timed-out requests are automatically retried, up to max_retries times (2 by default).

Usage

Apply this heuristic when tuning client behavior for different workloads. Increase timeout for long-running generation requests (large max_tokens, complex tool use). Decrease timeout for latency-sensitive applications. Use httpx.Timeout(None) to disable timeouts entirely for batch or long-generation workloads.

The Insight (Rule of Thumb)

  • Action: Configure timeout at client initialization or per-request.
  • Value: Default is 60s (read, write, pool), 5s connect. Use httpx.Timeout(timeout=300, connect=10.0) for long generations.
  • Trade-off: Higher timeouts tolerate slow responses but delay failure detection. Lower timeouts fail fast but may abort valid slow requests.
  • Custom httpx client warning: Passing a raw httpx.Client as http_client= uses httpx defaults (5s timeout), not SDK defaults (60s). Use DefaultHttpxClient to preserve SDK defaults.
  • Per-request override: Use client.with_options(timeout=5.0) without changing the default for other requests.
  • Connection pool: Default limits are 100 max connections, 20 max keep-alive connections.

Reasoning

LLM inference API calls are inherently slower than traditional REST APIs because of token generation time. A 5-second default (httpx's default) would frequently time out on legitimate requests, especially for longer completions. The 60-second default balances reliability with reasonable failure detection. The 5-second connect timeout catches network issues quickly without waiting for the full request timeout. Connection pooling (100 max, 20 keep-alive) supports concurrent request patterns while preventing resource exhaustion.
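Because timed-out requests are retried, the failure-detection delay is larger than the timeout alone. The arithmetic below is a rough sketch that treats the timeout as a per-attempt budget, which is a simplification since httpx applies its limits per phase. The `worst_case_wait` helper and its `backoff_s` parameter are hypothetical, and the wait between retries is assumed to be zero here.

```python
def worst_case_wait(timeout_s: float, max_retries: int, backoff_s: float = 0.0) -> float:
    """Rough upper bound on seconds spent before a timeout error surfaces."""
    attempts = 1 + max_retries  # the first try plus each retry
    return attempts * timeout_s + max_retries * backoff_s


sdk_default = worst_case_wait(60.0, 2)  # DEFAULT_TIMEOUT with DEFAULT_MAX_RETRIES
fail_fast = worst_case_wait(5.0, 2)     # 5s override, retries still enabled
no_retry = worst_case_wait(5.0, 0)      # 5s override with retries disabled

print(sdk_default, fail_fast, no_retry)  # 180.0 15.0 5.0
```

Under these assumptions a request that keeps timing out with the defaults can occupy a caller for about three minutes, so latency-sensitive paths may want to lower max_retries together with the timeout.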

Code Evidence

Default timeout and connection limits from src/groq/_constants.py:8-11:

# default timeout is 1 minute
DEFAULT_TIMEOUT = httpx.Timeout(timeout=60, connect=5.0)
DEFAULT_MAX_RETRIES = 2
DEFAULT_CONNECTION_LIMITS = httpx.Limits(max_connections=100, max_keepalive_connections=20)

DefaultHttpxClient preserving SDK defaults from src/groq/_base_client.py:787-804:

class _DefaultHttpxClient(httpx.Client):
    def __init__(self, **kwargs: Any) -> None:
        kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
        kwargs.setdefault("limits", DEFAULT_CONNECTION_LIMITS)
        kwargs.setdefault("follow_redirects", True)
        super().__init__(**kwargs)

# At runtime:
# DefaultHttpxClient = _DefaultHttpxClient
# This is useful because overriding the `http_client` with your own instance of
# `httpx.Client` will result in httpx's defaults being used, not ours.
