Heuristic: LangChain Retry Scope Best Practice
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Error_Handling |
| Last Updated | 2026-02-11 14:00 GMT |
Overview
Apply retry logic to individual Runnables (leaf operations), not entire chains, to avoid wasting computation on non-failing steps.
Description
LangChain provides `with_retry()` on any Runnable to add automatic retry with exponential backoff and jitter. The critical insight is that retries should be scoped to the smallest possible unit — typically the network call to an LLM provider. Wrapping an entire chain in retry logic causes upstream steps (prompt formatting, input validation) to re-execute needlessly on each retry attempt.
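The cost of mis-scoped retries can be illustrated without LangChain at all. Below is a minimal pure-Python sketch, assuming a cheap upstream step and a flaky leaf step; `format_prompt`, `flaky_model`, and `retry` are illustrative names, not LangChain APIs:

```python
calls = {"format": 0, "model": 0}

def format_prompt(topic: str) -> str:
    # Upstream step: cheap, deterministic, never fails.
    calls["format"] += 1
    return f"Tell me about {topic}"

def flaky_model(prompt: str) -> str:
    # Leaf step: simulates a transient failure on the first two attempts.
    calls["model"] += 1
    if calls["model"] < 3:
        raise ConnectionError("simulated 429")
    return f"answer to: {prompt}"

def retry(fn, attempts: int = 3):
    # Simplified stand-in for with_retry(): re-invokes only this callable.
    def wrapper(x):
        for attempt in range(1, attempts + 1):
            try:
                return fn(x)
            except ConnectionError:
                if attempt == attempts:
                    raise
    return wrapper

# Good scoping: retry wraps only the flaky leaf, so formatting runs once
# even though the model step is attempted three times.
result = retry(flaky_model)(format_prompt("retries"))
print(calls)  # {'format': 1, 'model': 3}
```

Had `retry` wrapped the whole pipeline instead, `calls["format"]` would also climb to 3 — the wasted re-execution this heuristic avoids.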
Usage
Apply this heuristic whenever you are building a chain of Runnables that includes at least one network call (LLM invocation, embedding, API tool). This is especially important when the chain includes expensive preprocessing steps or when retry count is high.
The Insight (Rule of Thumb)
- Action: Call `.with_retry()` on the specific Runnable that makes network calls, not on the composed chain.
- Value: Default `max_attempt_number=3` with exponential jitter backoff (`initial=1s`).
- Trade-off: Minimal — correct scoping reduces wasted computation. Only the failing step retries.
- Exception Types: Default retries all exceptions. Best practice is to narrow to transient errors (5xx, 429 Too Many Requests).
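The exception-narrowing rule can be sketched in plain Python. `retry_if` below is a hypothetical helper that mimics the idea behind narrowing `retry_exception_types`; it is not LangChain's API:

```python
class RateLimitError(Exception):
    """Simulated 429 Too Many Requests (transient)."""

def retry_if(fn, exception_types, attempts: int = 3):
    # Retry only when the raised exception is in exception_types;
    # anything else (e.g. a ValueError from a bug) propagates immediately.
    def wrapper(*args):
        for attempt in range(1, attempts + 1):
            try:
                return fn(*args)
            except exception_types:
                if attempt == attempts:
                    raise
    return wrapper

attempts_seen = []

def sometimes_limited(x):
    attempts_seen.append(x)
    if len(attempts_seen) < 2:
        raise RateLimitError()
    return x * 2

# Transient error: retried, succeeds on the second attempt.
assert retry_if(sometimes_limited, (RateLimitError,))(21) == 42

def buggy(x):
    raise ValueError("not transient")

# Programming error: surfaces on the first attempt, no wasted retries.
try:
    retry_if(buggy, (RateLimitError,))(1)
except ValueError:
    print("raised immediately")
```

Retrying every exception type (the default) would have masked the `ValueError` behind three identical failures before surfacing it.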
Reasoning
In a chain like `template | model | parser`, only the `model` step makes a network call that can transiently fail. If you wrap the entire chain in `with_retry()`, a 429 error from the model causes the template formatting to re-execute on every retry — wasted CPU cycles. By applying retry only to `model`, the template result is preserved and only the API call is retried.
The exponential jitter backoff prevents thundering herd problems when multiple clients hit rate limits simultaneously. Jitter adds randomness to the backoff interval, spreading retry attempts across time.
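To make the jitter point concrete, here is a sketch of exponential backoff with full jitter. The constants (`initial=1.0`, `cap=30.0`) are illustrative; LangChain delegates the actual backoff schedule to the tenacity library:

```python
import random

def backoff_delays(attempts: int, initial: float = 1.0, cap: float = 30.0):
    # Base delay doubles each attempt (initial * 2**attempt), capped at `cap`.
    # A random jitter in [0, base] is added so that simultaneous clients
    # do not retry in lockstep after a shared rate-limit event.
    delays = []
    for attempt in range(attempts):
        base = min(initial * (2 ** attempt), cap)
        delays.append(base + random.uniform(0, base))
    return delays

for d in backoff_delays(4):
    print(f"sleep {d:.2f}s")
```

Without the jitter term, every client computes the identical schedule (1s, 2s, 4s, 8s) and the retry wave re-creates the original traffic spike.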
Code evidence from `libs/core/langchain_core/runnables/retry.py:92-111`:
```python
# This logic can be used to retry any Runnable, including a chain of Runnables,
# but in general it's best practice to keep the scope of the retry as small as
# possible. For example, if you have a chain of Runnables, you should only retry
# the Runnable that is likely to fail, not the entire chain.
#
# Example:
# # Good
# chain = template | model.with_retry()
#
# # Bad
# chain = template | model
# retryable_chain = chain.with_retry()
```
Exception type guidance from `libs/core/langchain_core/runnables/retry.py:114-122`:
```python
retry_exception_types: tuple[type[BaseException], ...] = (Exception,)
"""The exception types to retry on. By default all exceptions are retried.

In general you should only retry on exceptions that are likely to be
transient, such as network errors.

Good exceptions to retry are all server errors (5xx) and selected client
errors (4xx) such as 429 Too Many Requests.
"""