Implementation:BerriAI Litellm Deployment Types
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| litellm repository | LLM Load Balancing, API Gateway Configuration | 2026-02-15 |
Overview
A concrete tool, provided by LiteLLM, for defining model deployment configurations. It is implemented as Pydantic models in the router type system.
Description
LiteLLM provides three primary types for deployment configuration:
- Deployment -- A Pydantic BaseModel that bundles a logical model_name, a LiteLLM_Params object containing provider connection details, and a ModelInfo object with metadata. It supports dictionary-style access via __getitem__/__setitem__, and automatically propagates custom pricing fields from litellm_params into model_info during construction.
- LiteLLM_Params -- Extends GenericLiteLLMParams with a required model field. Holds all provider-specific connection parameters: API keys, base URLs, API versions, timeouts, retry limits, TPM/RPM capacity, region/project credentials for Vertex AI, AWS Bedrock, and IBM WatsonX, plus deployment budget constraints.
- DeploymentTypedDict -- A TypedDict alternative used when passing deployment configuration as plain dictionaries (e.g., from YAML config files or the Router constructor's model_list parameter).
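
The two Deployment behaviours described above (dictionary-style access and pricing propagation) can be illustrated with a small stand-in. This is a stdlib-only sketch, not LiteLLM's code: SketchParams, SketchDeployment, and PRICING_FIELDS are hypothetical names, and the set of propagated pricing fields is an assumption limited to the two fields used elsewhere on this page.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical stand-ins for illustration only. The real classes are Pydantic
# models in litellm/types/router.py and carry many more fields.
PRICING_FIELDS = ("input_cost_per_token", "output_cost_per_token")  # assumed subset


@dataclass
class SketchParams:
    model: str
    input_cost_per_token: Optional[float] = None
    output_cost_per_token: Optional[float] = None


@dataclass
class SketchDeployment:
    model_name: str
    litellm_params: SketchParams
    model_info: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Give the deployment an id if the caller did not supply one.
        self.model_info.setdefault("id", str(uuid.uuid4()))
        # Copy any custom pricing set on the params into model_info.
        for name in PRICING_FIELDS:
            value = getattr(self.litellm_params, name)
            if value is not None:
                self.model_info[name] = value

    def __getitem__(self, key: str) -> Any:
        # Dictionary-style read: deployment["model_name"]
        return getattr(self, key)

    def __setitem__(self, key: str, value: Any) -> None:
        # Dictionary-style write: deployment["model_name"] = "new-alias"
        setattr(self, key, value)


sketch = SketchDeployment(
    model_name="gpt-4",
    litellm_params=SketchParams(model="azure/gpt-4-east", input_cost_per_token=0.00001),
)
```

Here sketch["model_name"] and sketch.model_name read the same attribute, and only the pricing field that was actually set is copied into model_info.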
Usage
Import these types when:
- Constructing a model_list for the Router constructor.
- Programmatically adding or updating deployments via the Router API.
- Type-checking deployment configuration in custom routing logic.
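
For the type-checking use case, one possible approach is to describe the expected dictionary shape with a TypedDict and run a lightweight structural check before handing entries to routing logic. The sketch below uses only the standard library; SketchLiteLLMParams, SketchDeploymentDict, and find_invalid_entries are hypothetical names, and the field list is a trimmed-down assumption rather than the real DeploymentTypedDict definition.

```python
from typing import Any, Dict, List, TypedDict


class SketchLiteLLMParams(TypedDict, total=False):
    # Hypothetical, trimmed-down shape; the real params accept many more keys.
    model: str
    api_key: str
    api_base: str
    rpm: int


class SketchDeploymentDict(TypedDict):
    # Hypothetical mirror of a single model_list entry.
    model_name: str
    litellm_params: SketchLiteLLMParams


def find_invalid_entries(model_list: List[Dict[str, Any]]) -> List[int]:
    """Return indexes of entries missing model_name or litellm_params['model']."""
    bad = []
    for index, entry in enumerate(model_list):
        params = entry.get("litellm_params")
        has_name = isinstance(entry.get("model_name"), str)
        has_model = isinstance(params, dict) and isinstance(params.get("model"), str)
        if not (has_name and has_model):
            bad.append(index)
    return bad


# A static type checker can verify this literal against the TypedDict.
typed_entry: SketchDeploymentDict = {
    "model_name": "gpt-4",
    "litellm_params": {"model": "azure/gpt-4-east"},
}

entries: List[Dict[str, Any]] = [
    dict(typed_entry),
    {"model_name": "gpt-4", "litellm_params": {"api_key": "sk-azure-xxx"}},  # no model
]
invalid = find_invalid_entries(entries)
```

The TypedDict annotations help static checkers such as mypy, while the runtime helper catches entries loaded from untyped sources like YAML.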
Code Reference
Source Location: litellm/types/router.py, lines 165-500
Deployment Signature:
class Deployment(BaseModel):
    model_name: str
    litellm_params: LiteLLM_Params
    model_info: ModelInfo

    def __init__(
        self,
        model_name: str,
        litellm_params: LiteLLM_Params,
        model_info: Optional[Union[ModelInfo, dict]] = None,
        **params,
    ):
LiteLLM_Params Signature:
class LiteLLM_Params(GenericLiteLLMParams):
    model: str

    def __init__(
        self,
        model: str,
        custom_llm_provider: Optional[str] = None,
        max_retries: Optional[Union[int, str]] = None,
        tpm: Optional[int] = None,
        rpm: Optional[int] = None,
        api_key: Optional[str] = None,
        api_base: Optional[str] = None,
        api_version: Optional[str] = None,
        timeout: Optional[Union[float, str]] = None,
        stream_timeout: Optional[Union[float, str]] = None,
        organization: Optional[str] = None,
        vertex_project: Optional[str] = None,
        vertex_location: Optional[str] = None,
        aws_access_key_id: Optional[str] = None,
        aws_secret_access_key: Optional[str] = None,
        aws_region_name: Optional[str] = None,
        max_file_size_mb: Optional[float] = None,
        use_in_pass_through: Optional[bool] = False,
        use_litellm_proxy: Optional[bool] = False,
        **params,
    ):
Import:
from litellm.types.router import Deployment, LiteLLM_Params, DeploymentTypedDict
I/O Contract
Deployment
| Input Parameter | Type | Required | Description |
|---|---|---|---|
| model_name | str | Yes | Logical model alias used by callers |
| litellm_params | LiteLLM_Params | Yes | Provider connection parameters |
| model_info | Optional[Union[ModelInfo, dict]] | No | Metadata; auto-created if omitted |

| Output | Type | Description |
|---|---|---|
| Deployment instance | Deployment | Validated deployment configuration with auto-generated model_info.id (UUID) |
LiteLLM_Params
| Input Parameter | Type | Required | Description |
|---|---|---|---|
| model | str | Yes | Provider-specific model identifier (e.g., azure/gpt-4-east) |
| api_key | Optional[str] | No | API key for the provider |
| api_base | Optional[str] | No | Base URL for the provider endpoint |
| api_version | Optional[str] | No | API version (e.g., for Azure) |
| timeout | Optional[Union[float, str]] | No | Request timeout; strings are resolved as environment variable references |
| tpm | Optional[int] | No | Tokens-per-minute capacity for this deployment |
| rpm | Optional[int] | No | Requests-per-minute capacity for this deployment |
| max_budget | Optional[float] | No | Maximum spend budget for this deployment |
| budget_duration | Optional[str] | No | Budget window (e.g., 1d, 7d) |
Usage Examples
Creating a deployment with the Pydantic model:
from litellm.types.router import Deployment, LiteLLM_Params

deployment = Deployment(
    model_name="gpt-4",
    litellm_params=LiteLLM_Params(
        model="azure/gpt-4-east",
        api_key="sk-azure-xxx",
        api_base="https://east.openai.azure.com",
        api_version="2024-02-15",
        tpm=100000,
        rpm=600,
    ),
)
print(deployment.model_info.id)  # auto-generated UUID
Using dictionary form for the Router constructor:
from litellm import Router

model_list = [
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/gpt-4-east",
            "api_key": "sk-azure-xxx",
            "api_base": "https://east.openai.azure.com",
            "api_version": "2024-02-15",
        },
    },
    {
        "model_name": "gpt-4",
        "litellm_params": {
            "model": "azure/gpt-4-west",
            "api_key": "sk-azure-yyy",
            "api_base": "https://west.openai.azure.com",
            "api_version": "2024-02-15",
        },
    },
]
router = Router(model_list=model_list)
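
Both entries in the model_list above share the logical model_name "gpt-4" while pointing at different provider models. A quick way to inspect which provider models sit behind each alias is to group the list by model_name. The sketch below is a stdlib-only helper, not part of the Router; group_by_model_name and example_list are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_by_model_name(model_list: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Map each logical model_name to the provider models registered under it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in model_list:
        groups[entry["model_name"]].append(entry["litellm_params"]["model"])
    return dict(groups)


# Trimmed copies of the two deployments shown above (credentials omitted).
example_list = [
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-east"}},
    {"model_name": "gpt-4", "litellm_params": {"model": "azure/gpt-4-west"}},
]
groups = group_by_model_name(example_list)
```

A check like this can be useful before constructing the Router, for example to confirm that every alias callers rely on has at least one deployment behind it.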
Deployment with custom pricing and budget:
from litellm.types.router import Deployment, LiteLLM_Params

deployment = Deployment(
    model_name="claude-3",
    litellm_params=LiteLLM_Params(
        model="anthropic/claude-3-opus",
        api_key="sk-ant-xxx",
        input_cost_per_token=0.000015,
        output_cost_per_token=0.000075,
        max_budget=50.0,
        budget_duration="1d",
    ),
)
# Custom pricing is automatically propagated to model_info
print(deployment.model_info.input_cost_per_token)  # 0.000015