Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Principle:BerriAI Litellm Deployment Definition

From Leeroopedia
Knowledge Sources Domains Last Updated
litellm/types/router.py LLM Load Balancing, API Gateway Configuration 2026-02-15

Overview

A deployment definition is a configuration unit that maps a logical model name to a specific provider endpoint, encapsulating all connection parameters required to route requests to that endpoint.

Description

In any multi-provider LLM gateway, there is a fundamental need to decouple the logical model name that callers use from the physical endpoint that actually serves the request. A deployment definition solves this by bundling three pieces of information into a single configuration object:

  • Model Name -- The logical alias (e.g., gpt-3.5-turbo) that callers use to reference a capability rather than a specific backend.
  • LLM Parameters -- The concrete connection details: which provider model to call, API keys, base URLs, API versions, timeouts, retry limits, throughput caps (TPM/RPM), and provider-specific credentials (e.g., AWS region, Vertex project).
  • Model Info -- Metadata about the deployment such as a unique identifier, custom pricing overrides, and supported feature flags.

Multiple deployments can share the same logical model name, enabling the router to load-balance across them. Each deployment is self-contained: it knows how to reach exactly one provider endpoint with the correct credentials and configuration.

Usage

Use deployment definitions when:

  • You need to expose a single model name that fans out to multiple provider endpoints (e.g., two Azure OpenAI deployments in different regions both serving gpt-4).
  • You want to attach per-endpoint configuration such as rate limits, budgets, timeouts, or custom pricing.
  • You are building a router or proxy that must translate logical model requests into concrete provider API calls.

Theoretical Basis

The deployment definition pattern follows the Service Abstraction principle from service-oriented architecture. The caller interacts with a stable interface (the model name), while the system resolves that name to one of several concrete backends.

Pseudocode:

STRUCTURE DeploymentParams:
    model: string               // provider-specific model identifier, e.g. "azure/gpt-4-east"
    api_key: string (optional)
    api_base: string (optional)
    timeout: float (optional)
    max_retries: int (optional)
    tpm: int (optional)         // tokens-per-minute capacity
    rpm: int (optional)         // requests-per-minute capacity
    max_budget: float (optional)
    budget_duration: string (optional)
    ...provider-specific fields...

STRUCTURE ModelInfo:
    id: string                  // unique deployment identifier (auto-generated UUID)
    input_cost_per_token: float (optional)
    output_cost_per_token: float (optional)

STRUCTURE Deployment:
    model_name: string          // logical name callers use
    llm_params: DeploymentParams
    model_info: ModelInfo       // defaults created if not provided

FUNCTION create_deployment(name, params, info=None):
    IF info IS None:
        info = ModelInfo()      // generate default metadata
    // Propagate any custom pricing from params into info
    FOR EACH pricing_field IN [input_cost_per_token, output_cost_per_token, ...]:
        IF params HAS pricing_field:
            info[pricing_field] = params[pricing_field]
    RETURN Deployment(model_name=name, llm_params=params, model_info=info)

The key insight is that deployment definitions serve as the unit of routing: the router selects among deployments, not among raw API endpoints. This makes it possible to attach routing metadata (capacity, cost, health status) to each deployment independently.

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment