Implementation:BerriAI Litellm Fallback Utils

From Leeroopedia
Attribute | Value
Sources | litellm/litellm_core_utils/fallback_utils.py
Domains | Reliability, Fallback Logic, Completion
Last Updated | 2026-02-15 16:00 GMT

Overview

Provides completion-with-fallbacks logic that automatically retries with alternative models when the primary model fails.

Description

This module implements a fallback chain for LLM completions. It exposes two functions:

  • async_completion_with_fallbacks -- An async function that attempts litellm.acompletion with the primary model, then sequentially tries each fallback model until one succeeds. Fallbacks can be plain model name strings or dictionaries containing model-specific overrides (e.g., custom API keys or parameters). Internal parameters are filtered out before calling the provider API. All attempts share a single litellm_call_id for tracing.
  • completion_with_fallbacks -- A synchronous wrapper that delegates to the async version via run_async_function.

If all models fail, the function raises an Exception with the most recent error message and a suggestion to enable verbose logging.
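The control flow described above can be sketched without calling any real provider. This is a minimal illustration, not the module's actual code: `complete_with_fallbacks`, `call_model`, and `fake_provider` are hypothetical names, and the stub stands in for litellm.acompletion so the example runs offline. The wording of the final error message is also an assumption.

```python
import asyncio
import uuid
from copy import deepcopy
from typing import Any, Awaitable, Callable, Dict, List, Union

Fallback = Union[str, Dict[str, Any]]


async def complete_with_fallbacks(
    call_model: Callable[..., Awaitable[Any]],  # hypothetical stand-in for the provider call
    model: str,
    fallbacks: List[Fallback],
    **params: Any,
) -> Any:
    call_id = str(uuid.uuid4())  # one ID shared by every attempt, for tracing
    last_error = None
    # The primary model is tried first, then each fallback in order.
    for candidate in [model, *fallbacks]:
        attempt = deepcopy(params)
        if isinstance(candidate, dict):
            overrides = dict(candidate)  # copy so the caller's dict is not mutated
            name = overrides.pop("model", model)
            attempt.update(overrides)  # per-fallback overrides win over shared params
        else:
            name = candidate
        try:
            return await call_model(model=name, call_id=call_id, **attempt)
        except Exception as exc:
            last_error = exc  # remember the failure and move on to the next model
    raise Exception(f"All fallback attempts failed. Last error: {last_error}")


async def fake_provider(model: str, call_id: str, **params: Any) -> Dict[str, Any]:
    # Stub provider: the primary model always fails, everything else succeeds.
    if model == "primary-model":
        raise RuntimeError("primary is down")
    return {"model": model, "call_id": call_id, **params}


result = asyncio.run(
    complete_with_fallbacks(
        fake_provider,
        "primary-model",
        [{"model": "backup-model", "temperature": 0.5}, "last-resort-model"],
        messages=[{"role": "user", "content": "Hello!"}],
        temperature=0.0,
    )
)
print(result["model"], result["temperature"])
```

In this sketch the dict fallback overrides the shared temperature for its own attempt only, and the third model is never contacted because the second one succeeds.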

Usage

Import these functions when you need automatic model fallback behavior without using the full Router system. They are useful for simple scripts and CLI tools that need resilience across multiple LLM providers.

Code Reference

Source Location

litellm/litellm_core_utils/fallback_utils.py (77 lines)

Signature

async def async_completion_with_fallbacks(**kwargs) -> ModelResponse

def completion_with_fallbacks(**kwargs) -> ModelResponse

Import

from litellm.litellm_core_utils.fallback_utils import (
    async_completion_with_fallbacks,
    completion_with_fallbacks,
)

I/O Contract

async_completion_with_fallbacks

Direction | Name | Type | Description
Input | model | str (via kwargs) | Primary model name
Input | fallbacks | List[Union[str, dict]] (via nested kwargs) | List of fallback models or model config dicts
Input | **kwargs | keyword | All other completion parameters (messages, temperature, etc.)
Output | return | ModelResponse | The first successful completion response
Output | raises | Exception | If all models (primary + fallbacks) fail

completion_with_fallbacks

Direction | Name | Type | Description
Input | **kwargs | keyword | Same as async_completion_with_fallbacks
Output | return | ModelResponse | The first successful completion response
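
The synchronous wrapper has to run a coroutine to completion from blocking code. One common way to do that is sketched below. It is an illustration of the general pattern only and is not claimed to match how run_async_function is implemented; `run_blocking` and `add` are hypothetical names.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Awaitable, Callable


def run_blocking(async_fn: Callable[..., Awaitable[Any]], *args: Any, **kwargs: Any) -> Any:
    """Run an async function to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No event loop is running in this thread, so asyncio.run is safe.
        return asyncio.run(async_fn(*args, **kwargs))
    # A loop is already running here (for example inside a notebook or a web
    # handler), so run the coroutine on a fresh loop in a worker thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(lambda: asyncio.run(async_fn(*args, **kwargs)))
        return future.result()


async def add(a: int, b: int) -> int:
    await asyncio.sleep(0)  # yield once to behave like real async work
    return a + b


async def outer() -> int:
    # Calls the blocking helper while an event loop is already running.
    return run_blocking(add, 2, 3)


print(run_blocking(add, 2, 3))
print(asyncio.run(outer()))
```

The worker-thread branch blocks the calling loop until the result is ready, which is acceptable for scripts but worth keeping in mind inside a busy async server.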

Usage Examples

import litellm
from litellm.litellm_core_utils.fallback_utils import completion_with_fallbacks

# Synchronous usage with string fallbacks.
# The fallback list is passed inside the nested `kwargs` argument, not as a top-level parameter.
response = completion_with_fallbacks(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
    kwargs={"fallbacks": ["gpt-3.5-turbo", "claude-3-haiku-20240307"]},
)

# Async usage with dict fallbacks (custom config per fallback)
import asyncio
from litellm.litellm_core_utils.fallback_utils import async_completion_with_fallbacks

async def main():
    response = await async_completion_with_fallbacks(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
        kwargs={
            "fallbacks": [
                {"model": "gpt-3.5-turbo", "temperature": 0.5},
                "claude-3-haiku-20240307",
            ]
        },
    )
    return response

result = asyncio.run(main())
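
Because the functions raise a plain Exception once every model has failed, callers that want to degrade gracefully need a broad handler. The sketch below shows one way to wrap the call. `complete_or_none`, `ok_stub`, and `failing_stub` are hypothetical helpers used so the example runs without network access; in real use, completion_with_fallbacks would be passed as `complete_fn`.

```python
from typing import Any, Callable, Dict, Optional


def complete_or_none(complete_fn: Callable[..., Any], **kwargs: Any) -> Optional[Any]:
    """Call complete_fn and return None instead of raising when it fails."""
    try:
        return complete_fn(**kwargs)
    except Exception as exc:
        # The error raised after the last fallback is a generic Exception,
        # so there is no narrower type to catch here.
        print(f"Completion failed for all models: {exc}")
        return None


def ok_stub(**kwargs: Any) -> Dict[str, Any]:
    # Stand-in for a call that succeeds.
    return {"model": kwargs["model"]}


def failing_stub(**kwargs: Any) -> Dict[str, Any]:
    # Stand-in for a call where the primary model and every fallback fail.
    raise Exception("All fallback attempts failed.")


print(complete_or_none(ok_stub, model="gpt-4"))
print(complete_or_none(failing_stub, model="gpt-4"))
```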
