
Implementation:BerriAI Litellm Retry Policy Handler

From Leeroopedia
Knowledge Sources: litellm repository
Domains: LLM Resilience, Fault Tolerance
Last Updated: 2026-02-15

Overview

A concrete tool, provided by LiteLLM, for configuring exception-specific retry logic and executing fallback chains, implemented across the router utilities modules.

Description

LiteLLM provides three complementary utilities for resilient LLM API consumption:

  • get_num_retries_from_retry_policy -- A pure function that inspects the exception type and returns the configured retry count from a RetryPolicy. It supports both global retry policies and model-group-specific override policies. The function checks exception types in order: AuthenticationError, Timeout, RateLimitError, ContentPolicyViolationError, and BadRequestError.
  • run_async_fallback -- An async function that iterates through a list of fallback model groups, attempting each one via the router's async_function_with_fallbacks. It tracks fallback depth to enforce maximum fallback hops, logs success and failure events for observability, skips the original failing model group, and adds fallback headers to successful responses.
  • get_fallback_model_group -- Resolves which fallback model groups apply for a given model group by checking: exact match, stripped model group match (for versioned model names), and wildcard (*) generic fallbacks.
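The retry-resolution order described above can be sketched as a chain of isinstance checks where the first matching exception type wins. The classes and RetryPolicy below are simplified stand-ins for LiteLLM's own types (abbreviated to three exception types), used only to illustrate the ordering:

```python
from typing import Optional

# Simplified stand-ins for LiteLLM's exception classes and RetryPolicy,
# illustrating the check order only (the real function also handles
# ContentPolicyViolationError and BadRequestError).
class AuthenticationError(Exception): ...
class Timeout(Exception): ...
class RateLimitError(Exception): ...

class RetryPolicy:
    def __init__(self, AuthenticationErrorRetries=None,
                 TimeoutErrorRetries=None, RateLimitErrorRetries=None):
        self.AuthenticationErrorRetries = AuthenticationErrorRetries
        self.TimeoutErrorRetries = TimeoutErrorRetries
        self.RateLimitErrorRetries = RateLimitErrorRetries

def resolve_retries(exception: Exception, policy: RetryPolicy) -> Optional[int]:
    # Exception types are checked in a fixed order; the first match wins.
    if isinstance(exception, AuthenticationError):
        return policy.AuthenticationErrorRetries
    if isinstance(exception, Timeout):
        return policy.TimeoutErrorRetries
    if isinstance(exception, RateLimitError):
        return policy.RateLimitErrorRetries
    return None  # no matching policy: caller falls back to default num_retries

policy = RetryPolicy(RateLimitErrorRetries=5, TimeoutErrorRetries=3)
print(resolve_retries(RateLimitError(), policy))  # 5
print(resolve_retries(Timeout(), policy))         # 3
```

Returning None (rather than 0) lets the Router distinguish "no policy configured for this exception" from "configured to never retry".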

Usage

These utilities are called internally by the Router during retry and fallback execution. They can also be imported directly for testing or custom routing logic.

Code Reference

Source Locations:

  • litellm/router_utils/get_retry_from_policy.py (lines 19-71)
  • litellm/router_utils/fallback_event_handlers.py (lines 45-161)

get_num_retries_from_retry_policy Signature:

def get_num_retries_from_retry_policy(
    exception: Exception,
    retry_policy: Optional[Union[RetryPolicy, dict]] = None,
    model_group: Optional[str] = None,
    model_group_retry_policy: Optional[Dict[str, RetryPolicy]] = None,
) -> Optional[int]:

run_async_fallback Signature:

async def run_async_fallback(
    *args: Tuple[Any],
    litellm_router: LitellmRouter,
    fallback_model_group: List[str],
    original_model_group: str,
    original_exception: Exception,
    max_fallbacks: int,
    fallback_depth: int,
    **kwargs,
) -> Any:

get_fallback_model_group Signature:

def get_fallback_model_group(
    fallbacks: List[Any], model_group: str
) -> Tuple[Optional[List[str]], Optional[int]]:

Import:

from litellm.router_utils.get_retry_from_policy import get_num_retries_from_retry_policy
from litellm.router_utils.fallback_event_handlers import run_async_fallback, get_fallback_model_group

I/O Contract

get_num_retries_from_retry_policy

Inputs:

  • exception (Exception, required) -- The exception instance to match against the retry policy
  • retry_policy (Optional[Union[RetryPolicy, dict]]) -- Global retry policy mapping exception types to retry counts
  • model_group (Optional[str]) -- The model group name; used to look up a group-specific policy
  • model_group_retry_policy (Optional[Dict[str, RetryPolicy]]) -- Per-model-group retry policy overrides

Output:

  • retry count (Optional[int]) -- Number of retries for this exception type, or None if no matching policy

run_async_fallback

Inputs:

  • litellm_router (Router, required) -- The router instance managing deployments
  • fallback_model_group (List[str], required) -- Ordered list of fallback model group names to try
  • original_model_group (str, required) -- The model group that originally failed
  • original_exception (Exception, required) -- The exception from the original failed call
  • max_fallbacks (int, required) -- Maximum number of fallback hops allowed
  • fallback_depth (int, required) -- Current depth in the fallback chain

Output:

  • response (Any) -- The successful response from a fallback model group
  • raises (Exception) -- The most recent exception if all fallbacks fail, or the original exception if max depth is reached
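The contract above can be condensed into a minimal sketch of the fallback loop. This is not LiteLLM's implementation (which routes through async_function_with_fallbacks and emits logging events); `call_model` is a hypothetical coroutine standing in for a model call:

```python
import asyncio
from typing import Any, List

async def run_fallbacks(
    call_model,                      # hypothetical async callable: group -> response
    fallback_model_group: List[str],
    original_model_group: str,
    original_exception: Exception,
    max_fallbacks: int,
    fallback_depth: int,
) -> Any:
    if fallback_depth >= max_fallbacks:
        raise original_exception     # max hops reached: re-raise the original error
    error = original_exception
    for group in fallback_model_group:
        if group == original_model_group:
            continue                 # skip the group that already failed
        try:
            return await call_model(group)
        except Exception as e:
            error = e                # remember the most recent failure
    raise error                      # all fallbacks exhausted

async def demo():
    async def call_model(group):
        if group == "claude-3-haiku":
            return f"ok from {group}"
        raise RuntimeError(f"{group} unavailable")

    return await run_fallbacks(
        call_model,
        ["gpt-4", "gpt-3.5-turbo", "claude-3-haiku"],
        original_model_group="gpt-4",
        original_exception=RuntimeError("gpt-4 failed"),
        max_fallbacks=5,
        fallback_depth=0,
    )

print(asyncio.run(demo()))  # ok from claude-3-haiku
```

Note the two distinct raise paths, matching the contract: the original exception when the depth limit is hit, the most recent exception when every fallback group fails.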

Usage Examples

Configuring a retry policy on the Router:

from litellm import Router
from litellm.types.router import RetryPolicy

router = Router(
    model_list=model_list,
    retry_policy=RetryPolicy(
        RateLimitErrorRetries=5,
        TimeoutErrorRetries=3,
        AuthenticationErrorRetries=0,
        ContentPolicyViolationErrorRetries=0,
        InternalServerErrorRetries=2,
    ),
    num_retries=2,  # default for exception types not in the policy
)

Per-model-group retry policies:

router = Router(
    model_list=model_list,
    model_group_retry_policy={
        "gpt-4": RetryPolicy(RateLimitErrorRetries=10, TimeoutErrorRetries=5),
        "gpt-3.5-turbo": RetryPolicy(RateLimitErrorRetries=3),
    },
)
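A minimal sketch of how a per-model-group policy would take precedence over the global one, assuming (as the parameter names suggest) that a matching entry in model_group_retry_policy is used in place of retry_policy for that group. RetryPolicy here is a simplified stand-in, and `effective_policy` is an illustrative helper, not a LiteLLM function:

```python
from typing import Dict, Optional

class RetryPolicy:  # simplified stand-in for litellm.types.router.RetryPolicy
    def __init__(self, RateLimitErrorRetries: Optional[int] = None):
        self.RateLimitErrorRetries = RateLimitErrorRetries

def effective_policy(
    model_group: Optional[str],
    retry_policy: Optional[RetryPolicy],
    model_group_retry_policy: Optional[Dict[str, RetryPolicy]],
) -> Optional[RetryPolicy]:
    # A group-specific override, when present, wins over the global policy.
    if model_group_retry_policy and model_group in model_group_retry_policy:
        return model_group_retry_policy[model_group]
    return retry_policy

global_policy = RetryPolicy(RateLimitErrorRetries=2)
overrides = {"gpt-4": RetryPolicy(RateLimitErrorRetries=10)}
print(effective_policy("gpt-4", global_policy, overrides).RateLimitErrorRetries)     # 10
print(effective_policy("claude-3", global_policy, overrides).RateLimitErrorRetries)  # 2
```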

Setting up fallback chains:

router = Router(
    model_list=model_list,
    fallbacks=[
        {"gpt-4": ["gpt-3.5-turbo", "claude-3-haiku"]},
    ],
    default_fallbacks=["gpt-3.5-turbo"],  # wildcard fallback for any model group
    context_window_fallbacks=[
        {"gpt-3.5-turbo": ["gpt-3.5-turbo-16k"]},
    ],
    max_fallbacks=5,  # maximum fallback depth
)
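The three-step resolution that get_fallback_model_group performs (exact match, stripped versioned-name match, wildcard) can be sketched against the fallbacks list format shown above. The prefix-stripping rule here is an assumption about how versioned names are matched; `resolve_fallback_group` is an illustrative helper, not the LiteLLM function:

```python
from typing import List, Optional

def resolve_fallback_group(
    fallbacks: List[dict], model_group: str
) -> Optional[List[str]]:
    # Flatten the list of single-key dicts into one lookup table.
    lookup = {k: v for d in fallbacks for k, v in d.items()}
    # 1. Exact match on the model group name.
    if model_group in lookup:
        return lookup[model_group]
    # 2. Stripped match for versioned names, e.g. "gpt-4-2024-05-13" -> "gpt-4"
    #    (assumed prefix rule, for illustration only).
    for key in lookup:
        if key != "*" and model_group.startswith(key):
            return lookup[key]
    # 3. Wildcard (*) generic fallback applying to any model group.
    return lookup.get("*")

fallbacks = [{"gpt-4": ["claude-3-haiku"]}, {"*": ["gpt-3.5-turbo"]}]
print(resolve_fallback_group(fallbacks, "gpt-4"))       # ['claude-3-haiku']
print(resolve_fallback_group(fallbacks, "mistral-7b"))  # ['gpt-3.5-turbo']
```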

Directly using the retry policy resolver:

from litellm.router_utils.get_retry_from_policy import get_num_retries_from_retry_policy
from litellm.types.router import RetryPolicy
from litellm.exceptions import RateLimitError

policy = RetryPolicy(RateLimitErrorRetries=5, TimeoutErrorRetries=3)
exc = RateLimitError(message="Rate limit exceeded", model="gpt-4", llm_provider="openai")

retries = get_num_retries_from_retry_policy(exception=exc, retry_policy=policy)
print(retries)  # 5
