Principle:BerriAI Litellm Error Handling

From Leeroopedia
Knowledge Sources: BerriAI/litellm repository
Domains: LLM Integration, Error Management, Exception Design
Last Updated: 2026-02-15

Overview

Error handling in a multi-provider context is the practice of mapping diverse, provider-specific error formats into a unified exception hierarchy so that callers can write consistent recovery logic regardless of the originating provider.

Description

Each LLM provider raises its own exceptions, with different status codes, error message formats, and error taxonomies. A unified error-handling layer addresses this fragmentation by defining a canonical set of exception classes -- mirroring the OpenAI error hierarchy -- and implementing a mapping layer that catches any provider-specific exception and re-raises it as the corresponding canonical exception. Callers can then catch RateLimitError, AuthenticationError, BadRequestError, or Timeout without knowing whether the underlying call went to OpenAI, Anthropic, Azure, Bedrock, or any other provider.

The unified exception hierarchy also enriches errors with diagnostic metadata: the originating model, the provider name, debug information, retry counts, and the original HTTP response when available.

Usage

Apply unified error handling whenever:

  • Application code must handle LLM errors uniformly across providers.
  • Retry logic needs to distinguish between retryable errors (rate limits, timeouts) and non-retryable errors (authentication, bad requests).
  • Error reporting and alerting systems require consistent error taxonomies.
  • Context window exceeded errors need special handling (e.g., message truncation or model fallback).
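
The retry and fallback cases above can be sketched as follows. This is a minimal illustration, not LiteLLM's own retry logic: the exception classes are local stand-ins named after the canonical ones, and `call_with_recovery`, `RETRYABLE`, and the `fallback` callable are hypothetical names introduced here.

```python
import time
from typing import Callable

# Local stand-ins named after the canonical exceptions (assumption: these are
# defined here for illustration and are not imported from any library).
class RateLimitError(Exception): ...
class Timeout(Exception): ...
class AuthenticationError(Exception): ...
class BadRequestError(Exception): ...
class ContextWindowExceededError(BadRequestError): ...

# Errors worth retrying; authentication and bad-request errors are not.
RETRYABLE = (RateLimitError, Timeout)

def call_with_recovery(
    call: Callable[[], str],
    fallback: Callable[[], str],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry transient errors with exponential backoff; fall back on context overflow."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ContextWindowExceededError:
            # Retrying the same request cannot succeed, so switch strategy
            # (for example truncated messages or a larger-context model).
            return fallback()
        except RETRYABLE:
            if attempt == max_retries:
                raise  # retries exhausted: surface the canonical error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Anything outside `RETRYABLE`, such as `AuthenticationError`, propagates immediately, because repeating the call cannot fix a bad credential or a malformed request.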

Theoretical Basis

Unified error handling implements the Exception Translation pattern (also known as Exception Shielding in service-oriented architecture). The core principles are:

1. Canonical Exception Hierarchy

A fixed set of exception classes maps to HTTP status codes and semantic error categories. Each canonical exception extends both the library's own hierarchy and the OpenAI SDK's exception hierarchy, so code written to catch OpenAI SDK exceptions also catches the canonical ones.

# Pseudocode: canonical exception hierarchy
openai.APIError (base exception; not Python's built-in BaseException)
    |-- AuthenticationError (401)
    |-- PermissionDeniedError (403)
    |-- NotFoundError (404)
    |-- BadRequestError (400)
    |       |-- ContextWindowExceededError (400)
    |       |-- ContentPolicyViolationError (400)
    |       |-- RejectedRequestError (400)
    |-- UnprocessableEntityError (422)
    |-- RateLimitError (429)
    |-- InternalServerError (500)
    |-- ServiceUnavailableError (503)
    |-- Timeout (408)
    |-- APIConnectionError
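
As a rough Python rendering of the tree above, the sketch below declares the same classes with their status codes as class attributes. It is self-contained for illustration only: the root `APIError` here is a local stand-in (the tree above roots the real hierarchy at `openai.APIError`), and `describe` is a hypothetical helper.

```python
from typing import Optional

class APIError(Exception):
    """Stand-in root for this sketch; not the OpenAI SDK class."""
    status_code: Optional[int] = None  # None when no HTTP status applies

class AuthenticationError(APIError):
    status_code = 401

class PermissionDeniedError(APIError):
    status_code = 403

class NotFoundError(APIError):
    status_code = 404

class BadRequestError(APIError):
    status_code = 400

# Specialised 400 errors inherit the status code from BadRequestError.
class ContextWindowExceededError(BadRequestError): ...
class ContentPolicyViolationError(BadRequestError): ...
class RejectedRequestError(BadRequestError): ...

class UnprocessableEntityError(APIError):
    status_code = 422

class RateLimitError(APIError):
    status_code = 429

class InternalServerError(APIError):
    status_code = 500

class ServiceUnavailableError(APIError):
    status_code = 503

class Timeout(APIError):
    status_code = 408

class APIConnectionError(APIError): ...  # connection failed; no HTTP status

def describe(exc: APIError) -> str:
    """Hypothetical helper: summarise an error by class name and status."""
    return f"{type(exc).__name__} (status {exc.status_code})"
```

Because the three specialised errors subclass `BadRequestError`, a handler written as `except BadRequestError` also catches context-window and content-policy failures, while a more specific `except ContextWindowExceededError` placed before it can treat that case separately.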

2. Exception Mapping Function

A central mapping function inspects the raw exception from any provider and determines which canonical exception to raise. The mapping uses multiple signals:

# Pseudocode: exception mapping
function map_exception(original_exception, provider, model):
    # Exceptions that are already canonical pass through unchanged
    if already a canonical exception:
        raise original_exception

    status_code = extract_status_code(original_exception)
    error_message = extract_message(original_exception)

    if status_code == 401:
        raise AuthenticationError(error_message, provider, model)
    elif status_code == 429 or is_rate_limit_message(error_message):
        raise RateLimitError(error_message, provider, model)
    elif status_code == 400:
        if is_context_window_error(error_message):
            raise ContextWindowExceededError(error_message, provider, model)
        elif is_content_policy_error(error_message):
            raise ContentPolicyViolationError(error_message, provider, model)
        else:
            raise BadRequestError(error_message, provider, model)
    elif is_timeout(original_exception):
        raise Timeout(error_message, provider, model)
    ...
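
A runnable Python approximation of the pseudocode above is given below, under stated assumptions. `CanonicalError`, `FakeProviderError`, and the fallback branch for unmapped errors are hypothetical and introduced only for this sketch. Unlike the pseudocode, the function returns the canonical exception so the caller can raise it with `raise ... from`, which preserves the original traceback.

```python
from typing import Optional

class CanonicalError(Exception):
    """Hypothetical base for this sketch: message plus provider and model."""
    status_code: Optional[int] = None

    def __init__(self, message: str, llm_provider: str, model: str):
        super().__init__(message)
        self.message = message
        self.llm_provider = llm_provider
        self.model = model

class AuthenticationError(CanonicalError):
    status_code = 401

class RateLimitError(CanonicalError):
    status_code = 429

class BadRequestError(CanonicalError):
    status_code = 400

class ContextWindowExceededError(BadRequestError): ...
class ContentPolicyViolationError(BadRequestError): ...

class Timeout(CanonicalError):
    status_code = 408

class FakeProviderError(Exception):
    """Hypothetical raw provider error carrying an HTTP status code."""
    def __init__(self, message: str, status_code: int):
        super().__init__(message)
        self.status_code = status_code

def map_exception(original: Exception, provider: str, model: str) -> CanonicalError:
    """Translate a raw provider error into its canonical counterpart."""
    if isinstance(original, CanonicalError):
        return original  # already canonical: pass through unchanged
    status = getattr(original, "status_code", None)
    message = str(original)
    lowered = message.lower()
    if status == 401:
        cls = AuthenticationError
    elif status == 429 or "rate limit" in lowered:
        cls = RateLimitError
    elif status == 400:
        if "context length" in lowered:
            cls = ContextWindowExceededError
        elif "content policy" in lowered:
            cls = ContentPolicyViolationError
        else:
            cls = BadRequestError
    elif isinstance(original, TimeoutError):
        cls = Timeout
    else:
        # Assumption: the pseudocode elides the remaining branches, so this
        # sketch falls back to the generic base class.
        cls = CanonicalError
    return cls(message, provider, model)
```

A call site would typically wrap the provider call in `try`/`except Exception as e` and then `raise map_exception(e, provider, model) from e`, so that callers only ever see canonical types.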

3. Enriched Error Metadata

Every canonical exception carries structured metadata beyond the error message:

  • status_code: The HTTP status code.
  • llm_provider: The originating provider name.
  • model: The model that was targeted.
  • litellm_debug_info: Additional context for debugging.
  • max_retries / num_retries: Retry state, enabling callers to implement backoff strategies.
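
To illustrate how these fields might be consumed, the sketch below flattens an exception into a dictionary suitable for structured logging or alerting. The attribute names are the ones listed above; the `RateLimitError` class is a local stand-in, and `error_record` is a hypothetical helper rather than a library function.

```python
from typing import Any, Dict, Optional

class RateLimitError(Exception):
    """Local stand-in carrying the metadata fields listed above."""
    def __init__(
        self,
        message: str,
        llm_provider: str,
        model: str,
        litellm_debug_info: Optional[str] = None,
        max_retries: Optional[int] = None,
        num_retries: Optional[int] = None,
    ):
        super().__init__(message)
        self.status_code = 429
        self.llm_provider = llm_provider
        self.model = model
        self.litellm_debug_info = litellm_debug_info
        self.max_retries = max_retries
        self.num_retries = num_retries

METADATA_FIELDS = (
    "status_code", "llm_provider", "model",
    "litellm_debug_info", "max_retries", "num_retries",
)

def error_record(exc: Exception) -> Dict[str, Any]:
    """Hypothetical helper: collect error metadata into one flat record."""
    # getattr with a default keeps this safe for exceptions lacking a field.
    record = {name: getattr(exc, name, None) for name in METADATA_FIELDS}
    record["error_type"] = type(exc).__name__
    record["message"] = str(exc)
    return record
```

Reading the fields with a default means the same helper can be applied to any exception, canonical or not, without raising `AttributeError` inside an error handler.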

4. String-Based Heuristics

For providers that do not return clean status codes, the mapping layer applies string-based heuristics -- scanning error messages for known patterns such as "rate limit", "context length", or "content policy violation".
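
One way such a heuristic could look is sketched below. The three phrases are the ones named above; the regular expressions, the category labels, and `classify_message` are assumptions made for illustration and do not reflect the library's actual pattern list.

```python
import re
from typing import Optional

# Hypothetical pattern table. The separators allow for variants such as
# "rate limit", "rate_limit", and "ratelimit".
PATTERNS = [
    ("rate_limit", re.compile(r"rate[ _-]?limit", re.IGNORECASE)),
    ("context_window", re.compile(r"context[ _-]?length", re.IGNORECASE)),
    ("content_policy", re.compile(r"content[ _-]?policy[ _-]?violation", re.IGNORECASE)),
]

def classify_message(message: str) -> Optional[str]:
    """Return the first matching category, or None if nothing matches."""
    for category, pattern in PATTERNS:
        if pattern.search(message):
            return category
    return None
```

The first match wins, so the order of the table matters when a message contains several phrases. String matching of this kind is brittle if a provider rewords its messages, which is why it serves as a fallback when no reliable status code is available.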
