# Principle: BerriAI LiteLLM API Dispatch
| Knowledge Sources | BerriAI/litellm repository |
|---|---|
| Domains | LLM Integration, API Routing, Provider Abstraction |
| Last Updated | 2026-02-15 |
## Overview
API dispatch is the process of routing a unified completion request to the appropriate provider-specific handler, executing the call, and returning the result through a common interface.
## Description
In a multi-provider LLM system, the dispatch layer acts as a front controller that accepts a single, standardized function call and fans it out to the correct provider implementation. API dispatch solves the problem of maintaining a uniform caller experience (one function, one signature, one return type) while supporting dozens of backend providers, each with its own SDK, authentication scheme, and request/response format.
The dispatch function is the central orchestration point: it resolves the provider, validates and transforms parameters, invokes the provider-specific handler (synchronously or asynchronously), and wraps the result in a normalized response. It also handles cross-cutting concerns such as logging, caching, retries, and callback invocation.
## Usage
Apply the API dispatch pattern whenever:
- A single entry point must support multiple backend implementations (provider polymorphism).
- Callers should not need to know which provider they are targeting.
- Cross-cutting concerns (logging, retries, caching, callbacks) must be applied uniformly.
- Both synchronous and asynchronous execution paths are required for the same logical operation.
## Theoretical Basis
API dispatch implements the Strategy Pattern combined with the Front Controller pattern. The core design follows these principles:
### 1. Unified Entry Point
A single function signature accepts all parameters across all providers. The superset of parameters is exposed; provider-specific parameters are either passed through or dropped based on configuration.
```
# Pseudocode: unified dispatch function
function completion(model, messages, temperature, max_tokens, stream, ...kwargs):
    model, provider, api_key, api_base = resolve_provider(model)
    handler = get_provider_handler(provider)
    request = transform_request(provider, messages, kwargs)
    response = handler.invoke(request, api_key, api_base)
    return normalize_response(response)
```
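The pseudocode above can be sketched concretely in Python. This is a minimal, illustrative example, not LiteLLM's actual implementation: `resolve_provider`, `completion`, and the `HANDLERS` registry are hypothetical names, and the handlers are stubs standing in for real provider SDK calls. It assumes the common `"provider/model"` string convention for model identifiers.

```python
def resolve_provider(model: str):
    """Split a 'provider/model' string into its parts; default to openai."""
    if "/" in model:
        provider, _, model_name = model.partition("/")
        return provider, model_name
    return "openai", model

def completion(model: str, messages: list, **kwargs):
    """Resolve the provider, delegate to its handler, normalize the result."""
    provider, model_name = resolve_provider(model)
    handler = HANDLERS.get(provider)
    if handler is None:
        raise ValueError(f"unsupported provider: {provider}")
    raw = handler(model_name, messages, **kwargs)
    # Wrap the provider-specific result in a common response shape.
    return {"provider": provider, "model": model_name, "content": raw}

# Toy handlers standing in for real provider SDK calls.
HANDLERS = {
    "openai": lambda m, msgs, **kw: f"openai:{m}",
    "anthropic": lambda m, msgs, **kw: f"anthropic:{m}",
}
```

The caller sees one signature and one return shape regardless of which backend serves the request.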
### 2. Provider Handler Delegation
Once the provider is resolved, dispatch delegates to a provider-specific handler. Each handler implements the same logical contract (send request, return response) but uses the provider's native SDK or HTTP API.
```
# Pseudocode: provider handler selection
function get_provider_handler(provider):
    if provider == "openai":
        return OpenAIHandler
    elif provider == "anthropic":
        return AnthropicHandler
    elif provider == "azure":
        return AzureHandler
    # ... one branch per provider
    else:
        raise UnsupportedProviderError(provider)
```
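A dictionary-based registry is a common alternative to a long if/elif chain: adding a provider becomes a one-line registration instead of a new branch. The sketch below is illustrative; the handler class names and `PROVIDER_HANDLERS` registry are assumptions, not LiteLLM's real classes.

```python
class UnsupportedProviderError(Exception):
    """Raised when dispatch receives a provider with no registered handler."""

# Stub handler classes standing in for real provider implementations.
class OpenAIHandler: ...
class AnthropicHandler: ...
class AzureHandler: ...

# Registry mapping provider name -> handler class.
PROVIDER_HANDLERS = {
    "openai": OpenAIHandler,
    "anthropic": AnthropicHandler,
    "azure": AzureHandler,
}

def get_provider_handler(provider: str):
    try:
        return PROVIDER_HANDLERS[provider]
    except KeyError:
        raise UnsupportedProviderError(provider) from None
```

Both styles implement the same contract; the chain makes per-provider special cases explicit, while the registry keeps lookup uniform and data-driven.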
### 3. Sync/Async Duality
The dispatch layer provides both synchronous and asynchronous versions of the same operation. The async variant either invokes a native async provider SDK or wraps the synchronous call in an async executor.
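The executor-wrapping path can be sketched with the standard library: when a provider has no native async SDK, the blocking call is pushed onto a worker thread so the event loop stays free. The function names here are illustrative, not LiteLLM's actual API.

```python
import asyncio

def completion_sync(model: str) -> str:
    # Stand-in for a blocking provider SDK call.
    return f"response from {model}"

async def completion_async(model: str) -> str:
    # asyncio.to_thread runs the blocking call in a worker thread,
    # keeping the event loop responsive while the request is in flight.
    return await asyncio.to_thread(completion_sync, model)
```

A provider with a native async client would skip the executor and `await` the SDK call directly.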
### 4. Cross-Cutting Concern Injection
Before and after the provider call, the dispatch function applies interceptors for:
- Logging: Recording request/response metadata for observability.
- Caching: Checking and populating a response cache.
- Callbacks: Invoking registered success or failure hooks.
- Error mapping: Catching provider exceptions and converting them to the unified exception hierarchy.
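One way to sketch this injection is a decorator that wraps the provider call with a cache lookup, request/response logging, and error mapping into a unified exception type. All names here (`with_interceptors`, `UnifiedAPIError`, the module-level `CACHE` and `LOG`) are hypothetical, chosen only to make the pattern concrete.

```python
import functools

class UnifiedAPIError(Exception):
    """Unified exception that provider-specific errors are mapped into."""

CACHE: dict = {}
LOG: list = []

def with_interceptors(fn):
    @functools.wraps(fn)
    def wrapper(model, *args, **kwargs):
        if model in CACHE:                      # caching: check before the call
            return CACHE[model]
        LOG.append(f"request:{model}")          # logging: record the request
        try:
            result = fn(model, *args, **kwargs)
        except Exception as exc:                # error mapping: unify exceptions
            raise UnifiedAPIError(str(exc)) from exc
        CACHE[model] = result                   # caching: populate after success
        LOG.append(f"response:{model}")         # logging: record the response
        return result
    return wrapper

@with_interceptors
def call_provider(model):
    # Stand-in for the provider-specific handler invocation.
    return f"ok:{model}"
```

Because the concerns live in the wrapper, every provider handler gets them uniformly without duplicating the logic per provider.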
### 5. Stream/Non-Stream Bifurcation
If the caller requests streaming, dispatch returns a stream wrapper that yields normalized chunks. Otherwise, it returns a complete response object.
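The bifurcation can be sketched as a dispatcher that returns either a generator of normalized chunks or a complete response object, depending on the `stream` flag. The function names and the response/chunk shapes below are illustrative simplifications, not LiteLLM's exact schema.

```python
def _raw_chunks(text: str):
    # Stand-in for a provider's native streaming iterator.
    for word in text.split():
        yield {"delta": word}

def dispatch(text: str, stream: bool = False):
    if stream:
        # Wrap each provider chunk in a normalized chunk shape.
        return ({"choices": [{"delta": c["delta"]}]} for c in _raw_chunks(text))
    # Non-streaming: return one complete response object.
    return {"choices": [{"message": text}]}
```

The caller's branching logic stays trivial: iterate when `stream=True`, index into the response otherwise.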