Implementation:BerriAI Litellm Deployment Types

Knowledge Sources	Domains	Last Updated
litellm repository	LLM Load Balancing, API Gateway Configuration	2026-02-15

Overview

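Deployment, LiteLLM_Params, and DeploymentTypedDict are the concrete types LiteLLM provides for defining model deployment configurations. The first two are Pydantic models and the third is a TypedDict; all three belong to the router type system.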

Description

LiteLLM provides three primary types for deployment configuration:

Deployment -- A Pydantic BaseModel that bundles a logical model_name, a LiteLLM_Params object containing provider connection details, and a ModelInfo object with metadata. It supports dictionary-style access via __getitem__/__setitem__, and automatically propagates custom pricing fields from litellm_params into model_info during construction.

LiteLLM_Params -- Extends GenericLiteLLMParams with a required model field. Holds all provider-specific connection parameters: API keys, base URLs, API versions, timeouts, retry limits, TPM/RPM capacity, region/project credentials for Vertex AI, AWS Bedrock, and IBM WatsonX, plus deployment budget constraints.

DeploymentTypedDict -- A TypedDict alternative used when passing deployment configuration as plain dictionaries (e.g., from YAML config files or the Router constructor's model_list parameter).
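
The two Deployment behaviors described above (dictionary-style access and pricing propagation) can be illustrated with a standard-library sketch. The names SketchParams, SketchDeployment, and PRICING_FIELDS are hypothetical stand-ins invented for this illustration; they are not LiteLLM classes, and the real implementation uses Pydantic models with many more fields.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Hypothetical, simplified stand-ins; NOT LiteLLM's actual classes.
PRICING_FIELDS = ("input_cost_per_token", "output_cost_per_token")


@dataclass
class SketchParams:
    model: str
    input_cost_per_token: Optional[float] = None
    output_cost_per_token: Optional[float] = None


@dataclass
class SketchDeployment:
    model_name: str
    litellm_params: SketchParams
    model_info: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assign an id when the caller did not supply one.
        self.model_info.setdefault("id", str(uuid.uuid4()))
        # Copy custom pricing into model_info unless it is already set there.
        for key in PRICING_FIELDS:
            value = getattr(self.litellm_params, key)
            if value is not None and key not in self.model_info:
                self.model_info[key] = value

    def __getitem__(self, key: str) -> Any:
        # Dictionary-style read: deployment["model_name"]
        return getattr(self, key)

    def __setitem__(self, key: str, value: Any) -> None:
        # Dictionary-style write: deployment["model_name"] = "new-alias"
        setattr(self, key, value)


demo = SketchDeployment(
    model_name="claude-3",
    litellm_params=SketchParams(
        model="anthropic/claude-3-opus",
        input_cost_per_token=0.000015,
    ),
)
```

In this sketch, demo["model_name"] and demo.model_name read the same field, and only the pricing fields that were explicitly set on the params object are copied into model_info.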

Usage

Import these types when:

Constructing a model_list for the Router constructor.
Programmatically adding or updating deployments via the Router API.
Type-checking deployment configuration in custom routing logic.
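
As a rough illustration of the third case, the sketch below annotates a custom selection helper with TypedDict types so a static checker can validate the dictionary shape. SketchParamsDict, SketchDeploymentDict, and pick_highest_rpm are hypothetical names made up for this example; in real code the annotation would presumably be the imported DeploymentTypedDict, which carries more fields than shown here.

```python
from typing import List, Optional, TypedDict


# Hypothetical, trimmed-down mirrors of the dictionary form.
class SketchParamsDict(TypedDict, total=False):
    model: str
    api_key: str
    rpm: int


class SketchDeploymentDict(TypedDict):
    model_name: str
    litellm_params: SketchParamsDict


def pick_highest_rpm(
    model_list: List[SketchDeploymentDict], model_name: str
) -> Optional[SketchDeploymentDict]:
    """Return the deployment with the largest rpm for the given alias."""
    candidates = [d for d in model_list if d["model_name"] == model_name]
    if not candidates:
        return None
    # Deployments without an rpm value are treated as capacity 0.
    return max(candidates, key=lambda d: d["litellm_params"].get("rpm", 0))


sample_list: List[SketchDeploymentDict] = [
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-east", "rpm": 600}},
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-west", "rpm": 900}},
]
```

Picking by rpm is only one possible policy, chosen here because it keeps the example short; it is not presented as how the Router selects deployments.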

Code Reference

Source Location: litellm/types/router.py, lines 165-500

Deployment Signature:

class Deployment(BaseModel):
    model_name: str
    litellm_params: LiteLLM_Params
    model_info: ModelInfo

    def __init__(
        self,
        model_name: str,
        litellm_params: LiteLLM_Params,
        model_info: Optional[Union[ModelInfo, dict]] = None,
        **params,
    ): ...

LiteLLM_Params Signature:

class LiteLLM_Params(GenericLiteLLMParams):
    model: str

    def __init__(
        self,
        model: str,
        custom_llm_provider: Optional[str] = None,
        max_retries: Optional[Union[int, str]] = None,
        tpm: Optional[int] = None,
        rpm: Optional[int] = None,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        api_version: Optional[str] = None,
        timeout: Optional[Union[float, str]] = None,
        stream_timeout: Optional[Union[float, str]] = None,
        organization: Optional[str] = None,
        vertex_project: Optional[str] = None,
        vertex_location: Optional[str] = None,
        aws_access_key_id: Optional[str] = None,
        aws_secret_access_key: Optional[str] = None,
        aws_region_name: Optional[str] = None,
        max_file_size_mb: Optional[float] = None,
        use_in_pass_through: Optional[bool] = False,
        use_litellm_proxy: Optional[bool] = False,
        **params,
    ): ...

Import:

from litellm.types.router import Deployment, LiteLLM_Params, DeploymentTypedDict

I/O Contract

Deployment

Input Parameter	Type	Required	Description
model_name	`str`	Yes	Logical model alias used by callers
litellm_params	`LiteLLM_Params`	Yes	Provider connection parameters
model_info	`Optional[Union[ModelInfo, dict]]`	No	Metadata; auto-created if omitted

Output	Type	Description
Deployment instance	`Deployment`	Validated deployment configuration with auto-generated model_info.id (UUID)

LiteLLM_Params

Input Parameter	Type	Required	Description
model	`str`	Yes	Provider-specific model identifier (e.g., `azure/gpt-4-east`)
api_key	`Optional[str]`	No	API key for the provider
api_base	`Optional[str]`	No	Base URL for the provider endpoint
api_version	`Optional[str]`	No	API version (e.g., for Azure)
timeout	`Optional[Union[float, str]]`	No	Request timeout; strings are resolved as environment variable references
tpm	`Optional[int]`	No	Tokens-per-minute capacity for this deployment
rpm	`Optional[int]`	No	Requests-per-minute capacity for this deployment
max_budget	`Optional[float]`	No	Maximum spend budget for this deployment
budget_duration	`Optional[str]`	No	Budget window (e.g., `1d`, `7d`)
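
To make the budget_duration examples concrete, here is a minimal sketch of how a window string such as 1d or 7d could be converted to seconds. The function window_seconds is hypothetical, and the accepted format (a positive integer followed by one of s, m, h, d) is an assumption for illustration; LiteLLM's own parser may accept other forms.

```python
import re

# Assumed unit suffixes for this sketch only.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def window_seconds(duration: str) -> int:
    """Convert a string like '1d' or '12h' into a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", duration)
    if match is None:
        raise ValueError(f"unsupported duration string: {duration!r}")
    count, unit = int(match.group(1)), match.group(2)
    return count * _UNIT_SECONDS[unit]
```

Under these assumptions, a deployment with max_budget=50.0 and budget_duration="1d" would be limited to 50.0 of spend per 86400-second window.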

Usage Examples

Creating a deployment with the Pydantic model:

from litellm.types.router import Deployment, LiteLLM_Params

deployment = Deployment(
    model_name="gpt-4",
    litellm_params=LiteLLM_Params(
        model="azure/gpt-4-east",
        api_key="sk-azure-xxx",
        api_base="https://east.openai.azure.com",
        api_version="2024-02-15",
        tpm=100000,
        rpm=600,
    ),
)
print(deployment.model_info.id)  # auto-generated UUID

Using dictionary form for the Router constructor:

from litellm import Router

model_list = [
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/gpt-4-east",
            "api_key": "sk-azure-xxx",
            "api_base": "https://east.openai.azure.com",
            "api_version": "2024-02-15",
        },
    },
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/gpt-4-west",
            "api_key": "sk-azure-yyy",
            "api_base": "https://west.openai.azure.com",
            "api_version": "2024-02-15",
        },
    },
]

router = Router(model_list=model_list)
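
Both entries in the list above share the alias gpt-4 while pointing at different provider models. The following standard-library sketch groups a model_list by model_name to make that relationship visible; group_by_alias and example_list are hypothetical names for this illustration, and the grouping only approximates how an alias maps to several deployments, not the Router's internal data structures.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_alias(model_list: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each logical model_name to the provider models registered under it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for deployment in model_list:
        groups[deployment["model_name"]].append(deployment["litellm_params"]["model"])
    return dict(groups)


example_list = [
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-east"}},
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-west"}},
]
```

Callers address the alias, and each alias can resolve to more than one underlying deployment.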

Deployment with custom pricing and budget:

from litellm.types.router import Deployment, LiteLLM_Params

deployment = Deployment(
    model_name="claude-3",
    litellm_params=LiteLLM_Params(
        model="anthropic/claude-3-opus",
        api_key="sk-ant-xxx",
        input_cost_per_token=0.000015,
        output_cost_per_token=0.000075,
        max_budget=50.0,
        budget_duration="1d",
    ),
)
# Custom pricing is automatically propagated to model_info
print(deployment.model_info.input_cost_per_token)  # 0.000015

Related Pages

Principle:BerriAI_Litellm_Deployment_Definition
