Principle: BerriAI LiteLLM API Key Management
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| BerriAI/litellm repository | Authentication, Access Control, Budget Management | 2026-02-15 |
Overview
Managing API access credentials with scoped permissions, budget limits, and team associations for an LLM proxy gateway.
Description
API key management is the discipline of creating, distributing, validating, and revoking access credentials that control who can use an LLM proxy and under what constraints. In a multi-tenant LLM gateway, raw provider API keys (e.g., OpenAI, Anthropic) are held by the proxy, and virtual API keys are issued to end users and applications. These virtual keys provide a layer of indirection that enables:
- Access control -- Restricting which models, endpoints, and features a key can access.
- Budget enforcement -- Setting maximum spend limits per key, with optional budget reset periods (daily, weekly, monthly).
- Rate limiting -- Constraining requests per minute (RPM) and tokens per minute (TPM) per key.
- Team association -- Binding keys to teams and organizations for hierarchical access and spend tracking.
- Audit and attribution -- Tracking all API usage back to specific keys, users, and teams for billing and compliance.
- Expiration and rotation -- Setting key expiration dates and supporting automatic rotation at configurable intervals.
- Permission scoping -- Defining fine-grained permissions such as allowed routes, guardrail configurations, and object-level access (vector stores, agents).
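To make budget reset periods concrete, here is a minimal sketch of parsing duration strings such as "30d" or "7d" into a reset interval. The parser name and the supported units are illustrative assumptions, not LiteLLM's actual implementation:

```python
from datetime import timedelta

# Hypothetical parser for budget reset durations like "30d", "7d", "24h".
# Illustrative sketch only; LiteLLM's real duration handling may differ.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_budget_duration(duration: str) -> timedelta:
    value, unit = duration[:-1], duration[-1]
    if unit not in _UNITS or not value.isdigit():
        raise ValueError(f"unsupported duration: {duration!r}")
    return timedelta(**{_UNITS[unit]: int(value)})
```

Once a key's accumulated spend window exceeds this interval, the proxy can reset `spend` to zero and start a new budget period.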
The key lifecycle follows a standard create-read-update-delete (CRUD) pattern, with additional operations for rotation, blocking/unblocking, and bulk management.
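A typical create operation is a JSON POST to the proxy's key-generation endpoint. The sketch below builds such a request body using the field names described in this document; the exact endpoint path and schema should be verified against the proxy's own API documentation:

```python
import json
from typing import Optional

# Sketch of a /key/generate-style request body for an LLM proxy.
# Field names mirror the key attributes described above; this is an
# assumption-laden illustration, not LiteLLM's canonical schema.
def build_key_generate_payload(
    key_alias: str,
    models: list,
    max_budget: float,
    budget_duration: str = "30d",
    team_id: Optional[str] = None,
) -> str:
    body = {
        "key_alias": key_alias,
        "models": models,
        "max_budget": max_budget,
        "budget_duration": budget_duration,
    }
    if team_id is not None:
        body["team_id"] = team_id
    return json.dumps(body)
```

In practice this body is POSTed with an admin key as a Bearer token, and the response contains the plaintext virtual key exactly once.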
Usage
Use API key management when:
- Operating an LLM proxy that serves multiple users, teams, or applications.
- Enforcing cost controls by setting per-key budget limits and tracking spend.
- Implementing role-based access control where different keys have access to different model groups.
- Meeting compliance requirements for audit trails and access logging.
- Distributing LLM access to external users or partners with scoped permissions.
- Implementing chargeback or cost attribution across organizational units.
Theoretical Basis
API key management in an LLM gateway follows the token-based access control pattern, where each key is a bearer token that encodes (by reference) the holder's permissions and constraints.
STRUCTURE APIKey:
    token: STRING            -- the secret key value (hashed at rest)
    key_alias: STRING        -- human-readable name
    user_id: STRING          -- owner of the key
    team_id: STRING          -- team association
    organization_id: STRING  -- organization association
    models: LIST[STRING]     -- allowed model names (empty = all)
    max_budget: FLOAT        -- maximum spend allowed
    spend: FLOAT             -- current accumulated spend
    budget_duration: STRING  -- reset period ("30d", "7d", etc.)
    rpm_limit: INTEGER       -- max requests per minute
    tpm_limit: INTEGER       -- max tokens per minute
    expires: DATETIME        -- key expiration time
    metadata: MAP            -- arbitrary key-value metadata
    permissions: MAP         -- fine-grained permission flags
    blocked: BOOLEAN         -- whether key is actively blocked
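The structure above can be rendered as a Python dataclass. This is a sketch for illustration; LiteLLM's actual table and model definitions may use different names and defaults:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# Python mirror of the APIKey structure described above (illustrative).
@dataclass
class APIKey:
    token: str                        # hashed secret, never stored in plaintext
    key_alias: Optional[str] = None
    user_id: Optional[str] = None
    team_id: Optional[str] = None
    organization_id: Optional[str] = None
    models: list = field(default_factory=list)   # empty list = all models
    max_budget: Optional[float] = None
    spend: float = 0.0
    budget_duration: Optional[str] = None         # e.g. "30d", "7d"
    rpm_limit: Optional[int] = None
    tpm_limit: Optional[int] = None
    expires: Optional[datetime] = None
    metadata: dict = field(default_factory=dict)
    permissions: dict = field(default_factory=dict)
    blocked: bool = False
```

Optional fields default to "no constraint", so a freshly created key is unrestricted until limits are explicitly attached.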
The key generation process follows this general algorithm:
FUNCTION generate_key(request, caller_auth):
    -- Phase 1: Authorization check
    VERIFY caller has permission to create keys
    IF request.team_id IS SET THEN
        VERIFY caller is member of team
        VERIFY team allows key creation by caller's role

    -- Phase 2: Validation
    VALIDATE budget values are non-negative
    VALIDATE required parameters are present (if configured)
    IF request.key IS SET THEN
        VALIDATE key format and uniqueness
    ELSE
        key = GENERATE_RANDOM_KEY(length=16, prefix="sk-")

    -- Phase 3: Constraint resolution
    IF request.team_id IS SET THEN
        APPLY team-level model restrictions
        APPLY team-level budget constraints
    IF request.organization_id IS SET THEN
        RESOLVE organization membership

    -- Phase 4: Persistence
    hashed_token = HASH(key)
    STORE key_record IN database WITH hashed_token

    -- Phase 5: Post-creation hooks
    TRIGGER event hooks (e.g., send invite email)
    CACHE key_record for fast lookup
    RETURN {key, expires, user_id, ...metadata}
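The persistence and validation phases can be sketched with an in-memory store. This is a minimal illustration of the hash-at-rest pattern, where only the SHA-256 digest of the key is stored and the plaintext is returned once; it is not LiteLLM's actual code, and the store, function names, and record fields are assumptions:

```python
import hashlib
import secrets
from typing import Optional

# Illustrative in-memory key store: hashed_token -> key record.
_KEY_STORE = {}

def generate_key(max_budget: Optional[float] = None,
                 models: Optional[list] = None) -> str:
    # Phase 2: validate budget values are non-negative.
    if max_budget is not None and max_budget < 0:
        raise ValueError("budget must be non-negative")
    # Generate a random key with the conventional "sk-" prefix.
    key = "sk-" + secrets.token_urlsafe(16)
    # Phase 4: persist only the hash, never the plaintext.
    hashed = hashlib.sha256(key.encode()).hexdigest()
    _KEY_STORE[hashed] = {
        "models": models or [],
        "max_budget": max_budget,
        "spend": 0.0,
        "blocked": False,
    }
    return key  # plaintext returned exactly once

def validate_key(key: str) -> dict:
    # Lookup re-hashes the presented key and compares digests.
    record = _KEY_STORE.get(hashlib.sha256(key.encode()).hexdigest())
    if record is None or record["blocked"]:
        raise PermissionError("invalid or blocked key")
    return record
```

Because only the digest is stored, a database leak does not expose usable credentials, and key validation stays a constant-time dictionary lookup.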
Key design principles:
- Principle of least privilege -- Keys are created with the minimum necessary permissions. Empty model lists default to all models, but explicit restrictions are encouraged.
- Defense in depth -- Multiple layers of validation occur: caller authorization, team membership checks, parameter validation, and custom auth hooks.
- Separation of authentication and authorization -- The key authenticates the caller; subsequent checks authorize specific actions based on key metadata.
- Budget isolation -- Key-level budgets are independent of team and organization budgets, creating a hierarchy of spend controls.
- Hash-at-rest -- Key values are stored as cryptographic hashes; the plaintext key is only returned once at creation time.
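The budget-isolation principle implies that a request must clear every level of the spend hierarchy. A minimal sketch of that check (function and scope names are hypothetical):

```python
# Hierarchical budget enforcement: a request is admitted only if it fits
# within the key, team, and organization budgets independently.
# Illustrative sketch, not LiteLLM's implementation.
def check_budgets(request_cost: float, limits: list) -> None:
    """limits: list of (scope_name, current_spend, max_budget) tuples,
    ordered key -> team -> organization."""
    for scope, spend, max_budget in limits:
        if spend + request_cost > max_budget:
            raise RuntimeError(f"{scope} budget exceeded")
```

A key with ample remaining budget is still rejected if its team or organization is over its limit, which is what makes the hierarchy an effective top-down cost control.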