
Principle:BerriAI Litellm Proxy Configuration

From Leeroopedia
Knowledge Sources: BerriAI/litellm repository
Domains: LLM Gateway, Configuration Management, Proxy Infrastructure
Last Updated: 2026-02-15

Overview

Declarative configuration of an LLM proxy gateway through structured YAML files defining model deployments, runtime settings, and feature toggles.

Description

Proxy configuration is the practice of defining the complete operational behavior of an LLM gateway server through a single, declarative configuration file rather than through imperative code changes or runtime API calls. This approach addresses the problem of managing complex, multi-provider LLM deployments where dozens of models from different providers must be unified behind a single API surface.

A proxy configuration file serves as the single source of truth for:

  • Model deployments -- which LLM providers and models are available, including their API endpoints, credentials, and deployment-specific parameters.
  • General settings -- server-level configuration such as master key, database URL, allowed origins, and authentication modes.
  • LiteLLM settings -- library-level behavior toggles such as caching, fallback strategies, parameter dropping, and callback integrations.
  • Router settings -- load balancing policies, retry logic, cooldown thresholds, and routing strategies across model groups.
  • Environment variables -- secrets and configuration values injected from environment variables at config load time.

The configuration file is typically written in YAML format, which provides a human-readable, version-controllable, and auditable specification. When the proxy server starts, it parses this file, resolves environment variable references, initializes the model router with the declared deployments, and applies all settings to the runtime.
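For illustration, a minimal configuration in this style might look like the following. The section and field names follow LiteLLM's documented schema; the model identifiers and environment variable names are placeholders, not recommendations:

```yaml
model_list:
  - model_name: gpt-4o                         # public alias exposed by the proxy
    litellm_params:
      model: openai/gpt-4o                     # provider-prefixed model identifier
      api_key: os.environ/OPENAI_API_KEY       # resolved from the environment at load time
  - model_name: claude
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY

general_settings:
  master_key: os.environ/PROXY_MASTER_KEY      # never hard-coded in the file

litellm_settings:
  drop_params: true                            # drop params unsupported by a provider
```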

Usage

Use declarative proxy configuration when:

  • Deploying an LLM gateway that must route requests across multiple providers (OpenAI, Anthropic, Azure, AWS Bedrock, etc.).
  • Maintaining reproducible and auditable infrastructure-as-code for LLM routing.
  • Enabling non-developer operators to manage model availability and settings without modifying application code.
  • Supporting environment-specific deployments (development, staging, production) through config file variations.
  • Centralizing API key management and credential injection via environment variables.
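In LiteLLM specifically, such a file is handed to the proxy at startup via the --config flag (the path shown is illustrative):

```shell
litellm --config /path/to/config.yaml
```

Swapping the file path per environment (development, staging, production) is what makes the environment-specific deployments above possible without code changes.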

Theoretical Basis

Declarative proxy configuration follows the infrastructure-as-code paradigm, where the desired state of a system is described in a static document and a runtime engine converges the actual system state to match that specification.

The general structure of a proxy configuration can be described with the following pseudocode:

CONFIGURATION := {
    environment_variables: MAP[name -> value_or_env_ref],
    model_list: LIST[
        {
            model_name: STRING,          -- public alias for the model group
            litellm_params: {
                model: STRING,           -- provider-prefixed model identifier
                api_key: STRING_OR_REF,  -- credential (often env variable)
                api_base: STRING,        -- optional provider endpoint override
                ...provider_specific_params
            },
            model_info: {
                mode: STRING,            -- "chat", "embedding", "image_generation", etc.
                ...metadata
            }
        }
    ],
    litellm_settings: MAP[key -> value],   -- library-level toggles
    general_settings: MAP[key -> value],   -- server-level settings
    router_settings: MAP[key -> value]     -- load balancer configuration
}
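As a concrete instance of this structure, a fragment exercising each top-level section might read as follows (values are illustrative; routing_strategy names follow LiteLLM's router documentation):

```yaml
environment_variables:
  AZURE_API_BASE: https://example.openai.azure.com/   # exported before models load

model_list:
  - model_name: embeddings
    litellm_params:
      model: openai/text-embedding-3-small
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      mode: embedding                                 # declares the deployment type

litellm_settings:
  set_verbose: false

router_settings:
  routing_strategy: least-busy
  num_retries: 2
```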

The loading algorithm follows these steps:

FUNCTION load_proxy_config(config_path):
    raw_config = PARSE_YAML(config_path)

    -- Phase 1: Resolve environment variables
    FOR EACH (name, value) IN raw_config.environment_variables:
        SET_ENV(name, RESOLVE(value))

    -- Phase 2: Apply litellm settings
    FOR EACH (key, value) IN raw_config.litellm_settings:
        IF key == "cache" THEN initialize_cache(value)
        ELSE IF key == "callbacks" THEN register_callbacks(value)
        ELSE set_litellm_attribute(key, value)

    -- Phase 3: Build the model router
    router = NEW Router(
        model_list = raw_config.model_list,
        routing_strategy = raw_config.router_settings.routing_strategy,
        num_retries = raw_config.router_settings.num_retries,
        ...
    )

    -- Phase 4: Apply general settings
    FOR EACH (key, value) IN raw_config.general_settings:
        apply_server_setting(key, value)

    RETURN (router, raw_config.model_list, raw_config.general_settings)
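Phases 1 and 3 of this algorithm can be sketched in runnable Python. This is a simplification, not LiteLLM's actual implementation: it assumes the YAML has already been parsed into a dict, and the Router class here is a stand-in for the real model router.

```python
import os


def resolve(value):
    """Resolve an 'os.environ/NAME' reference to its environment value."""
    if isinstance(value, str) and value.startswith("os.environ/"):
        return os.environ[value.split("/", 1)[1]]
    return value


class Router:
    """Stand-in for the proxy's model router (illustrative only)."""

    def __init__(self, model_list, routing_strategy="simple-shuffle", num_retries=0):
        self.model_list = model_list
        self.routing_strategy = routing_strategy
        self.num_retries = num_retries


def load_proxy_config(raw_config):
    # Phase 1: export declared environment variables, resolving references
    for name, value in raw_config.get("environment_variables", {}).items():
        os.environ[name] = resolve(value)

    # Phase 3: build the router from the declared deployments and router settings
    rs = raw_config.get("router_settings", {})
    router = Router(
        model_list=raw_config.get("model_list", []),
        routing_strategy=rs.get("routing_strategy", "simple-shuffle"),
        num_retries=rs.get("num_retries", 0),
    )
    return router, raw_config.get("model_list", []), raw_config.get("general_settings", {})
```

Phases 2 and 4 are omitted because they amount to dispatching each key to the appropriate setter, as the pseudocode above shows.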

Key design principles:

  • Separation of concerns -- Model definitions, runtime settings, and infrastructure settings are kept in distinct configuration sections.
  • Environment variable indirection -- Credentials are never hard-coded; they are resolved from environment variables at load time using the pattern os.environ/VARIABLE_NAME.
  • Idempotent loading -- The configuration can be reloaded at runtime without restarting the server, enabling live updates to model deployments.
  • Layered configuration -- Database-stored model definitions can overlay or override file-based definitions, supporting hybrid static/dynamic configuration models.
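The layered-configuration principle can be sketched as a merge keyed on the public model_name alias, with database-stored definitions overriding file-based ones. This is a simplification of any real implementation:

```python
def merge_model_lists(file_models, db_models):
    """Overlay database-stored deployments on file-based ones.

    Deployments are keyed by their public model_name alias: a database
    entry with the same alias replaces the file entry, and previously
    unknown database entries are appended. Illustrative only.
    """
    merged = {m["model_name"]: m for m in file_models}
    for m in db_models:
        merged[m["model_name"]] = m  # database definition wins
    return list(merged.values())
```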
