
Heuristic: LangChain Retry Scope Best Practice

From Leeroopedia
Knowledge Sources
Domains Optimization, Error_Handling
Last Updated 2026-02-11 14:00 GMT

Overview

Apply retry logic to individual Runnables (leaf operations), not entire chains, to avoid wasting computation on non-failing steps.

Description

LangChain provides `with_retry()` on any Runnable to add automatic retry with exponential backoff and jitter. The critical insight is that retries should be scoped to the smallest possible unit — typically the network call to an LLM provider. Wrapping an entire chain in retry logic causes upstream steps (prompt formatting, input validation) to re-execute needlessly on each retry attempt.

Usage

Apply this heuristic whenever you are building a chain of Runnables that includes at least one network call (LLM invocation, embedding, API tool). This is especially important when the chain includes expensive preprocessing steps or when retry count is high.

The Insight (Rule of Thumb)

  • Action: Call `.with_retry()` on the specific Runnable that makes network calls, not on the composed chain.
  • Value: Default `max_attempt_number=3` with exponential jitter backoff (`initial=1s`).
  • Trade-off: Minimal — correct scoping reduces wasted computation. Only the failing step retries.
  • Exception Types: Default retries all exceptions. Best practice is to narrow to transient errors (5xx, 429 Too Many Requests).

Reasoning

In a chain like `template | model | parser`, only the `model` step makes a network call that can transiently fail. If you wrap the entire chain in `with_retry()`, a 429 error from the model causes the template formatting to re-execute on every retry — wasted CPU cycles. By applying retry only to `model`, the template result is preserved and only the API call is retried.

The exponential jitter backoff prevents thundering herd problems when multiple clients hit rate limits simultaneously. Jitter adds randomness to the backoff interval, spreading retry attempts across time.
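The backoff schedule can be sketched in plain Python. This mirrors the tenacity-style `wait_exponential_jitter` behavior (exponential base doubling plus a uniform random offset); the exact constants in LangChain's implementation may differ:

```python
import random

def exponential_jitter_delays(
    initial: float = 1.0,
    max_delay: float = 60.0,
    jitter: float = 1.0,
    attempts: int = 5,
) -> list[float]:
    """Illustrative backoff schedule: initial * 2**n, capped, plus jitter."""
    delays = []
    for attempt in range(attempts):
        base = min(initial * (2 ** attempt), max_delay)
        # Random offset spreads simultaneous clients apart in time,
        # avoiding a synchronized "thundering herd" of retries.
        delays.append(base + random.uniform(0, jitter))
    return delays
```

With `initial=1`, successive waits grow roughly 1s, 2s, 4s, 8s, ..., each shifted by up to one second of jitter.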

Code evidence from `libs/core/langchain_core/runnables/retry.py:92-111`:

    # This logic can be used to retry any Runnable, including a chain of Runnables,
    # but in general it's best practice to keep the scope of the retry as small as
    # possible. For example, if you have a chain of Runnables, you should only retry
    # the Runnable that is likely to fail, not the entire chain.
    #
    # Example:
    #     # Good
    #     chain = template | model.with_retry()
    #
    #     # Bad
    #     chain = template | model
    #     retryable_chain = chain.with_retry()

Exception type guidance from `libs/core/langchain_core/runnables/retry.py:114-122`:

    retry_exception_types: tuple[type[BaseException], ...] = (Exception,)
    """The exception types to retry on. By default all exceptions are retried.

    In general you should only retry on exceptions that are likely to be
    transient, such as network errors.

    Good exceptions to retry are all server errors (5xx) and selected client
    errors (4xx) such as 429 Too Many Requests.
    """
