Implementation:BerriAI Litellm Run Server

From Leeroopedia
Knowledge Sources: BerriAI/litellm repository
Domains: HTTP Server, CLI, Process Management
Last Updated: 2026-02-15

Overview

A concrete tool for starting the LiteLLM proxy server, provided by the run_server function in the proxy CLI module.

Description

The run_server function is a Click-decorated CLI command that serves as the main entry point for launching the LiteLLM proxy server. It accepts a comprehensive set of command-line options covering host/port binding, model configuration, SSL settings, worker management, and operational modes. The function orchestrates the full server startup sequence: saving the worker configuration, running database migrations (if a DATABASE_URL is set), and launching the ASGI server using uvicorn, gunicorn, or hypercorn depending on the selected runtime.

Key capabilities:

  • Supports single-worker uvicorn (default), multi-worker gunicorn (--run_gunicorn), and HTTP/2 via hypercorn (--run_hypercorn).
  • Applies Prisma database migrations or db push before server start when a database is configured.
  • Supports SSL/TLS with custom keyfile and certfile paths.
  • Provides --health and --test flags for pre-startup validation.
  • Allows skipping server startup entirely (--skip_server_startup) for migration-only runs.
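The runtime-selection and migration behavior listed above can be sketched as a pair of small decision helpers. This is a simplified illustration of the documented behavior under stated assumptions, not LiteLLM's actual implementation; the function names are invented for this sketch:

```python
def select_runner(run_gunicorn: bool = False, run_hypercorn: bool = False) -> str:
    """Mirror the documented precedence: gunicorn and hypercorn are
    opt-in flags, and single-worker uvicorn is the default runtime."""
    if run_gunicorn:
        return "gunicorn"
    if run_hypercorn:
        return "hypercorn"
    return "uvicorn"


def should_migrate(env: dict) -> bool:
    """Database migrations run only when DATABASE_URL is configured."""
    return bool(env.get("DATABASE_URL"))
```

For example, `select_runner(run_gunicorn=True)` yields `"gunicorn"`, while calling it with no flags falls back to `"uvicorn"`.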

Usage

Use run_server when launching the LiteLLM proxy from the command line or programmatically from a deployment script. It is invoked via litellm --config config.yaml or directly as python -m litellm.proxy.proxy_cli.

Code Reference

Attribute Value
Source Location litellm/proxy/proxy_cli.py, function defined at line 503
Signature def run_server(host, port, api_base, api_version, model, alias, add_key, headers, save, debug, detailed_debug, temperature, max_tokens, request_timeout, drop_params, add_function_to_prompt, config, max_budget, telemetry, test, local, num_workers, test_async, iam_token_db_auth, num_requests, use_queue, health, version, run_gunicorn, run_hypercorn, ssl_keyfile_path, ssl_certfile_path, ciphers, log_config, use_prisma_db_push, skip_server_startup, keepalive_timeout, max_requests_before_restart)
CLI Entry litellm --host 0.0.0.0 --port 4000 --config config.yaml
Import from litellm.proxy.proxy_cli import run_server

I/O Contract

Inputs

Parameter Type Default Description
host str "0.0.0.0" Host address to bind the server to.
port int 4000 Port number to listen on.
num_workers int 1 Number of worker processes (used with gunicorn).
config Optional[str] None Path to the YAML proxy configuration file.
model Optional[str] None A single model name to proxy (alternative to config file).
debug bool False Enable debug-level logging.
run_gunicorn bool False Start server via gunicorn instead of uvicorn.
run_hypercorn bool False Start server via hypercorn for HTTP/2 support.
ssl_keyfile_path Optional[str] None Path to SSL key file for HTTPS.
ssl_certfile_path Optional[str] None Path to SSL certificate file for HTTPS.
use_prisma_db_push bool False Use prisma db push instead of prisma migrate deploy.
skip_server_startup bool False Skip launching the HTTP server (for migration-only runs).
keepalive_timeout Optional[int] None Uvicorn keepalive timeout in seconds.
max_requests_before_restart Optional[int] None Restart worker after this many requests.
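The config parameter expects a LiteLLM proxy YAML file. A minimal sketch of such a file is shown below; the structure (model_list, litellm_params, and the os.environ/ key reference) follows LiteLLM's documented config schema, but the specific model and key names are illustrative:

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
```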

Outputs

Output Type Description
Running HTTP server Process A uvicorn, gunicorn, or hypercorn process serving the FastAPI application.
Console output stdout Server startup logs including bound address, loaded models, and version information.
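Once the server process is up, a deployment script typically polls a health endpoint to confirm readiness. A minimal sketch of building that poll URL, assuming the proxy's /health/liveliness endpoint and translating the 0.0.0.0 bind address into a connectable localhost address:

```python
from urllib.parse import urlunsplit


def health_url(host: str, port: int, use_ssl: bool = False) -> str:
    """Build the URL a client would poll to confirm the proxy came up.

    0.0.0.0 binds all interfaces but is not itself connectable, so a
    local check should target localhost instead.
    """
    scheme = "https" if use_ssl else "http"
    client_host = "localhost" if host == "0.0.0.0" else host
    return urlunsplit((scheme, f"{client_host}:{port}", "/health/liveliness", "", ""))
```

A script could then issue `urllib.request.urlopen(health_url("0.0.0.0", 4000))` in a retry loop until the server responds.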

Usage Examples

Starting the proxy with a configuration file:

litellm --host 0.0.0.0 --port 4000 --config /app/config.yaml

Starting with gunicorn for production with multiple workers:

litellm --config config.yaml --port 4000 --num_workers 4 --run_gunicorn

Starting with SSL enabled:

litellm --config config.yaml --port 443 \
    --ssl_keyfile_path /etc/ssl/private/key.pem \
    --ssl_certfile_path /etc/ssl/certs/cert.pem

Running a quick single-model proxy for testing:

litellm --model gpt-4 --port 4000 --debug

Programmatic invocation (migration only, no server):

from litellm.proxy.proxy_cli import run_server

# Run config initialization and DB setup without starting the server
run_server(["--skip_server_startup"], standalone_mode=False)
