Implementation:BerriAI Litellm Deployment Types

From Leeroopedia
Revision as of 12:09, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/BerriAI_Litellm_Deployment_Types.md)
Knowledge Sources: litellm repository
Domains: LLM Load Balancing, API Gateway Configuration
Last Updated: 2026-02-15

Overview

A concrete tool, provided by LiteLLM, for defining model deployment configurations. It is implemented as a set of Pydantic models in the router type system.

Description

LiteLLM provides three primary types for deployment configuration:

  • Deployment -- A Pydantic BaseModel that bundles a logical model_name, a LiteLLM_Params object containing provider connection details, and a ModelInfo object with metadata. It supports dictionary-style access via __getitem__/__setitem__, and automatically propagates custom pricing fields from litellm_params into model_info during construction.
  • LiteLLM_Params -- Extends GenericLiteLLMParams with a required model field. Holds all provider-specific connection parameters: API keys, base URLs, API versions, timeouts, retry limits, TPM/RPM capacity, region/project credentials for Vertex AI, AWS Bedrock, and IBM WatsonX, plus deployment budget constraints.
  • DeploymentTypedDict -- A TypedDict alternative used when passing deployment configuration as plain dictionaries (e.g., from YAML config files or the Router constructor's model_list parameter).
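The two Deployment behaviors described above (dictionary-style access and propagation of custom pricing into model_info) can be illustrated with a minimal, standard-library sketch. SketchDeployment and PRICING_FIELDS are hypothetical stand-ins invented for this illustration; they are not the LiteLLM classes, and the real list of pricing fields may differ.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, Optional

# Hypothetical stand-ins for illustration only; NOT the LiteLLM classes.
PRICING_FIELDS = ("input_cost_per_token", "output_cost_per_token")


@dataclass
class SketchDeployment:
    model_name: str
    litellm_params: Dict[str, Any]
    model_info: Optional[Dict[str, Any]] = None

    def __post_init__(self) -> None:
        info = dict(self.model_info or {})
        # Assign an id when the caller did not supply one.
        info.setdefault("id", str(uuid.uuid4()))
        # Copy any custom pricing from the connection params into the metadata.
        for name in PRICING_FIELDS:
            value = self.litellm_params.get(name)
            if value is not None:
                info[name] = value
        self.model_info = info

    # Dictionary-style access, mirroring the __getitem__/__setitem__ idea.
    def __getitem__(self, key: str) -> Any:
        return getattr(self, key)

    def __setitem__(self, key: str, value: Any) -> None:
        setattr(self, key, value)


dep = SketchDeployment(
    model_name="claude-3",
    litellm_params={
        "model": "anthropic/claude-3-opus",
        "input_cost_per_token": 0.000015,
    },
)
print(dep["model_name"])
print(dep["model_info"]["input_cost_per_token"])
dep["model_name"] = "claude-3-prod"  # item assignment updates the attribute
```

Only pricing fields that are actually set get copied, so output_cost_per_token does not appear in the metadata in this run.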

Usage

Import these types when:

  • Constructing a model_list for the Router constructor.
  • Programmatically adding or updating deployments via the Router API.
  • Type-checking deployment configuration in custom routing logic.
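As a rough illustration of the type-checking use case, the sketch below pairs a TypedDict (for static checkers) with a small runtime validator. SketchParams, SketchDeploymentDict, and find_config_errors are hypothetical names made up for this example; the real DeploymentTypedDict carries more fields than shown here.

```python
from typing import List, Mapping, Sequence, TypedDict


class SketchParams(TypedDict, total=False):
    # Simplified subset of connection parameters (assumption for this sketch).
    model: str
    api_key: str
    api_base: str


class SketchDeploymentDict(TypedDict):
    model_name: str
    litellm_params: SketchParams


def find_config_errors(model_list: Sequence[Mapping[str, object]]) -> List[str]:
    """Return human-readable problems found in a model_list-style structure."""
    errors: List[str] = []
    for index, entry in enumerate(model_list):
        if not isinstance(entry.get("model_name"), str):
            errors.append(f"entry {index}: model_name must be a string")
        params = entry.get("litellm_params")
        if not isinstance(params, dict):
            errors.append(f"entry {index}: litellm_params must be a mapping")
        elif not isinstance(params.get("model"), str):
            errors.append(f"entry {index}: litellm_params.model is required")
    return errors


good: SketchDeploymentDict = {
    "model_name": "gpt-4",
    "litellm_params": {"model": "azure/gpt-4-east"},
}
# Missing the required "model" key inside litellm_params.
bad = {"model_name": "gpt-4", "litellm_params": {"api_key": "sk-azure-xxx"}}

problems = find_config_errors([good, bad])
print(problems)
```

The TypedDict annotations help a static checker catch mistakes at edit time, while the validator catches configuration loaded at run time (for example, from a YAML file).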

Code Reference

Source Location: litellm/types/router.py, lines 165-500

Deployment Signature:

class Deployment(BaseModel):
    model_name: str
    litellm_params: LiteLLM_Params
    model_info: ModelInfo

    def __init__(
        self,
        model_name: str,
        litellm_params: LiteLLM_Params,
        model_info: Optional[Union[ModelInfo, dict]] = None,
        **params,
    ):
        ...  # body omitted in this reference excerpt

LiteLLM_Params Signature:

class LiteLLM_Params(GenericLiteLLMParams):
    model: str

    def __init__(
        self,
        model: str,
        custom_llm_provider: Optional[str] = None,
        max_retries: Optional[Union[int, str]] = None,
        tpm: Optional[int] = None,
        rpm: Optional[int] = None,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        api_version: Optional[str] = None,
        timeout: Optional[Union[float, str]] = None,
        stream_timeout: Optional[Union[float, str]] = None,
        organization: Optional[str] = None,
        vertex_project: Optional[str] = None,
        vertex_location: Optional[str] = None,
        aws_access_key_id: Optional[str] = None,
        aws_secret_access_key: Optional[str] = None,
        aws_region_name: Optional[str] = None,
        max_file_size_mb: Optional[float] = None,
        use_in_pass_through: Optional[bool] = False,
        use_litellm_proxy: Optional[bool] = False,
        **params,
    ):
        ...  # body omitted in this reference excerpt

Import:

from litellm.types.router import Deployment, LiteLLM_Params, DeploymentTypedDict

I/O Contract

Deployment

Input Parameter | Type | Required | Description
model_name | str | Yes | Logical model alias used by callers
litellm_params | LiteLLM_Params | Yes | Provider connection parameters
model_info | Optional[Union[ModelInfo, dict]] | No | Metadata; auto-created if omitted

Output | Type | Description
Deployment instance | Deployment | Validated deployment configuration with auto-generated model_info.id (UUID)

LiteLLM_Params

Input Parameter | Type | Required | Description
model | str | Yes | Provider-specific model identifier (e.g., azure/gpt-4-east)
api_key | Optional[str] | No | API key for the provider
api_base | Optional[str] | No | Base URL for the provider endpoint
api_version | Optional[str] | No | API version (e.g., for Azure)
timeout | Optional[Union[float, str]] | No | Request timeout; strings are resolved as environment variable references
tpm | Optional[int] | No | Tokens-per-minute capacity for this deployment
rpm | Optional[int] | No | Requests-per-minute capacity for this deployment
max_budget | Optional[float] | No | Maximum spend budget for this deployment
budget_duration | Optional[str] | No | Budget window (e.g., 1d, 7d)
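Two of the fields above accept compact string forms: timeout may be a string that refers to an environment variable, and budget_duration uses windows such as 1d or 7d. The sketch below shows one way such values could be normalized. It is not LiteLLM's implementation: the "os.environ/" prefix, the set of unit letters, the helper names, and the SKETCH_TIMEOUT variable are all assumptions made for illustration.

```python
import os
import re
from typing import Optional, Union

ENV_PREFIX = "os.environ/"  # assumed reference convention for this sketch
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}  # assumed unit letters


def resolve_timeout(value: Optional[Union[float, str]]) -> Optional[float]:
    """Turn a numeric or string timeout setting into seconds."""
    if value is None:
        return None
    if isinstance(value, str) and value.startswith(ENV_PREFIX):
        # Look the real value up in the environment; raises KeyError if unset.
        value = os.environ[value[len(ENV_PREFIX):]]
    return float(value)


def duration_to_seconds(duration: str) -> int:
    """Convert a window such as '1d' or '7d' into seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


os.environ["SKETCH_TIMEOUT"] = "30"  # hypothetical variable name
print(resolve_timeout("os.environ/SKETCH_TIMEOUT"))  # 30.0
print(duration_to_seconds("7d"))  # 604800
```

Keeping the reference in the configuration and resolving it at load time means secrets and environment-specific values never need to be written into the config file itself.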

Usage Examples

Creating a deployment with the Pydantic model:

from litellm.types.router import Deployment, LiteLLM_Params

deployment = Deployment(
    model_name="gpt-4",
    litellm_params=LiteLLM_Params(
        model="azure/gpt-4-east",
        api_key="sk-azure-xxx",
        api_base="https://east.openai.azure.com",
        api_version="2024-02-15",
        tpm=100000,
        rpm=600,
    ),
)
print(deployment.model_info.id)  # auto-generated UUID

Using dictionary form for the Router constructor:

from litellm import Router

model_list = [
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/gpt-4-east",
            "api_key": "sk-azure-xxx",
            "api_base": "https://east.openai.azure.com",
            "api_version": "2024-02-15",
        },
    },
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/gpt-4-west",
            "api_key": "sk-azure-yyy",
            "api_base": "https://west.openai.azure.com",
            "api_version": "2024-02-15",
        },
    },
]

router = Router(model_list=model_list)
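Because both entries share the model_name "gpt-4", they form a single pool behind one alias. The standard-library sketch below illustrates that grouping idea with a plain round-robin rotation. It is only an illustration: the Router applies its own routing strategies, and nothing here reflects its internal code.

```python
from collections import defaultdict
from itertools import cycle
from typing import Any, Dict, Iterator, List

# Trimmed copy of the model_list shape used above (credentials omitted).
model_list: List[Dict[str, Any]] = [
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-east"}},
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-west"}},
]

# Group deployments by the alias that callers request.
pools: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
for entry in model_list:
    pools[entry["model_name"]].append(entry)

# Plain round-robin over the pool; a stand-in for a real routing strategy.
rotation: Iterator[Dict[str, Any]] = cycle(pools["gpt-4"])
picked = [next(rotation)["litellm_params"]["model"] for _ in range(3)]
print(picked)  # ['azure/gpt-4-east', 'azure/gpt-4-west', 'azure/gpt-4-east']
```

Callers only ever name the alias; which underlying deployment serves a given request is decided by the routing layer.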

Deployment with custom pricing and budget:

from litellm.types.router import Deployment, LiteLLM_Params

deployment = Deployment(
    model_name="claude-3",
    litellm_params=LiteLLM_Params(
        model="anthropic/claude-3-opus",
        api_key="sk-ant-xxx",
        input_cost_per_token=0.000015,
        output_cost_per_token=0.000075,
        max_budget=50.0,
        budget_duration="1d",
    ),
)
# Custom pricing is automatically propagated to model_info
print(deployment.model_info.input_cost_per_token)  # 1.5e-05 (Python's float repr of 0.000015)
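To make the pricing and budget numbers concrete, the sketch below works through the arithmetic for a single request. The helper request_cost and the token counts are hypothetical; this is not how LiteLLM tracks spend internally, only the per-token calculation that the two price fields imply.

```python
def request_cost(
    prompt_tokens: int,
    completion_tokens: int,
    input_cost_per_token: float,
    output_cost_per_token: float,
) -> float:
    """Cost of one request under simple per-token pricing."""
    return (
        prompt_tokens * input_cost_per_token
        + completion_tokens * output_cost_per_token
    )


# Hypothetical request: 1,000 prompt tokens and 500 completion tokens,
# priced with the values from the example above.
cost = request_cost(1000, 500, 0.000015, 0.000075)

max_budget = 50.0  # same budget as in the example, per budget_duration window
requests_within_budget = int(max_budget // cost)

print(f"cost per request: {cost:.4f}")
print(f"requests within budget: {requests_within_budget}")
```

With these figures, one request costs about 0.0525 (0.015 for input plus 0.0375 for output), so a budget of 50.0 per window covers roughly 952 requests of this size.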
