Implementation:BerriAI Litellm Fallback Utils
| Attribute | Value |
|---|---|
| Sources | litellm/litellm_core_utils/fallback_utils.py |
| Domains | Reliability, Fallback Logic, Completion |
| Last Updated | 2026-02-15 16:00 GMT |
Overview
Provides completion-with-fallbacks logic that automatically retries with alternative models when the primary model fails.
Description
This module implements a fallback chain for LLM completions. It exposes two functions:
- `async_completion_with_fallbacks` -- An async function that attempts `litellm.acompletion` with the primary model, then sequentially tries each fallback model until one succeeds. Fallbacks can be plain model name strings or dictionaries containing model-specific overrides (e.g., custom API keys or parameters). Internal parameters are filtered out before calling the provider API. All attempts share a single `litellm_call_id` for tracing.
- `completion_with_fallbacks` -- A synchronous wrapper that delegates to the async version via `run_async_function`.
If all models fail, the function raises an Exception with the most recent error message and a suggestion to enable verbose logging.
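The behavior described above can be illustrated with a small, self-contained sketch. It is not the module's source: `complete_with_fallbacks_sketch`, `fake_provider`, the `call_id` argument, and the `INTERNAL_KEYS` set are hypothetical names chosen for illustration, and a stub stands in for `litellm.acompletion` so the example runs without network access.

```python
import asyncio
import uuid
from copy import deepcopy
from typing import Any, Awaitable, Callable, Dict, List, Union

Fallback = Union[str, Dict[str, Any]]

# Hypothetical set of caller-side keys that should not reach the provider.
INTERNAL_KEYS = {"fallbacks", "acompletion"}


async def complete_with_fallbacks_sketch(
    call_model: Callable[..., Awaitable[Any]],
    model: str,
    fallbacks: List[Fallback],
    **params: Any,
) -> Any:
    """Try `model`, then each fallback in order; return the first success."""
    call_id = str(uuid.uuid4())  # one id shared by every attempt, for tracing
    base = {k: v for k, v in params.items() if k not in INTERNAL_KEYS}
    last_error = "no models were attempted"

    for candidate in [model, *fallbacks]:
        attempt = deepcopy(base)
        if isinstance(candidate, dict):
            overrides = dict(candidate)  # copy so the caller's dict is untouched
            name = overrides.pop("model", model)
            attempt.update(overrides)  # per-fallback overrides win over base params
        else:
            name = candidate
        try:
            return await call_model(model=name, call_id=call_id, **attempt)
        except Exception as exc:
            last_error = str(exc)  # remember the most recent failure and move on

    raise Exception(f"{last_error}. All fallback attempts failed.")


async def fake_provider(model: str, call_id: str, **params: Any) -> Dict[str, Any]:
    """Stand-in for a real completion call: only 'backup-b' succeeds."""
    if model != "backup-b":
        raise RuntimeError(f"{model} unavailable")
    return {"model": model, "call_id": call_id, "params": params}


result = asyncio.run(
    complete_with_fallbacks_sketch(
        fake_provider,
        model="primary",
        fallbacks=["backup-a", {"model": "backup-b", "temperature": 0.5}],
        temperature=0.0,
        acompletion=True,
    )
)
print(result["model"], result["params"])
```

In this run the first two candidates fail, the dict fallback succeeds with its own `temperature`, and the internal `acompletion` flag never reaches the stub provider.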
Usage
Import these functions when you need automatic model fallback behavior without using the full Router system. Useful for simple scripts and CLI tools that want resilience across multiple LLM providers.
Code Reference
Source Location
litellm/litellm_core_utils/fallback_utils.py (77 lines)
Signature
async def async_completion_with_fallbacks(**kwargs) -> ModelResponse
def completion_with_fallbacks(**kwargs) -> ModelResponse
Import
from litellm.litellm_core_utils.fallback_utils import (
    async_completion_with_fallbacks,
    completion_with_fallbacks,
)
I/O Contract
async_completion_with_fallbacks
| Direction | Name | Type | Description |
|---|---|---|---|
| Input | model | str (via kwargs) | Primary model name |
| Input | fallbacks | List[Union[str, dict]] (via nested kwargs) | List of fallback models or model config dicts |
| Input | **kwargs | keyword | All other completion parameters (messages, temperature, etc.) |
| Output | return | ModelResponse | The first successful completion response |
| Output | raises | Exception | If all models (primary + fallbacks) fail |
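The "via nested kwargs" convention in the table means the fallback list is passed inside a keyword argument that is itself named `kwargs`. The following sketch shows one way such a call could be unpacked. The helper `split_call_arguments` is hypothetical, and merging the remaining nested entries into the completion parameters is an assumption rather than documented behavior.

```python
from typing import Any, Dict, List, Tuple, Union

Fallback = Union[str, Dict[str, Any]]


def split_call_arguments(**kwargs: Any) -> Tuple[str, List[Fallback], Dict[str, Any]]:
    """Separate the primary model, the fallback list, and the remaining parameters."""
    # The nested dict arrives under the literal key "kwargs"; copy it before mutating.
    nested = dict(kwargs.pop("kwargs", None) or {})
    fallbacks = list(nested.pop("fallbacks", []))
    model = kwargs.pop("model")  # required: raises KeyError when missing
    # Assumption: leftover top-level and nested entries are completion parameters.
    params = {**kwargs, **nested}
    return model, fallbacks, params


model, fallbacks, params = split_call_arguments(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    kwargs={"fallbacks": ["gpt-3.5-turbo"], "temperature": 0.2},
)
print(model, fallbacks, sorted(params))
```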
completion_with_fallbacks
| Direction | Name | Type | Description |
|---|---|---|---|
| Input | **kwargs | keyword | Same as async_completion_with_fallbacks |
| Output | return | ModelResponse | The first successful completion response |
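A synchronous wrapper over an async function can be sketched as follows. This example uses `asyncio.run` as a stand-in for `run_async_function`, whose internals are not described here. `run_sync`, `async_echo`, and `echo` are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


def run_sync(async_fn: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any) -> Any:
    """Run an async function to completion from synchronous code."""
    # asyncio.run starts a fresh event loop; it raises RuntimeError if one is
    # already running in this thread.
    return asyncio.run(async_fn(*args, **kwargs))


async def async_echo(**kwargs: Any) -> Dict[str, Any]:
    """Stand-in for the async completion function."""
    await asyncio.sleep(0)  # yield to the event loop once, like real I/O would
    return {"echo": kwargs}


def echo(**kwargs: Any) -> Dict[str, Any]:
    """Synchronous wrapper that forwards every keyword argument unchanged."""
    return run_sync(async_echo, **kwargs)


out = echo(model="demo-model")
print(out)
```

Because `asyncio.run` cannot be called from inside a running event loop, a wrapper built this way suits scripts and CLI tools; code that is already async would call the async function directly.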
Usage Examples
import litellm
from litellm.litellm_core_utils.fallback_utils import completion_with_fallbacks

# Synchronous usage with string fallbacks
response = completion_with_fallbacks(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    kwargs={"fallbacks": ["gpt-3.5-turbo", "claude-3-haiku-20240307"]},
)

# Async usage with dict fallbacks (custom config per fallback)
import asyncio
from litellm.litellm_core_utils.fallback_utils import async_completion_with_fallbacks

async def main():
    response = await async_completion_with_fallbacks(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
        kwargs={
            "fallbacks": [
                {"model": "gpt-3.5-turbo", "temperature": 0.5},
                "claude-3-haiku-20240307",
            ]
        },
    )
    return response

result = asyncio.run(main())
Related Pages
- BerriAI_Litellm_Asyncify -- provides `run_async_function` used by the sync wrapper
- BerriAI_Litellm_Health_Check_Helpers -- health checks that also use fallback model lists
- BerriAI_Litellm_LLM_Request_Utils -- provides `pick_cheapest_chat_models_from_llm_provider` used for fallback model selection