Principle: BerriAI LiteLLM API Key Management
| Knowledge Sources | Domains | Last Updated |
|---|---|---|
| BerriAI/litellm repository | Authentication, Access Control, Budget Management | 2026-02-15 |
Overview
Managing API access credentials with scoped permissions, budget limits, and team associations for an LLM proxy gateway.
Description
API key management is the discipline of creating, distributing, validating, and revoking access credentials that control who can use an LLM proxy and under what constraints. In a multi-tenant LLM gateway, raw provider API keys (e.g., OpenAI, Anthropic) are held by the proxy, and virtual API keys are issued to end users and applications. These virtual keys provide a layer of indirection that enables:
- Access control -- Restricting which models, endpoints, and features a key can access.
- Budget enforcement -- Setting maximum spend limits per key, with optional budget reset periods (daily, weekly, monthly).
- Rate limiting -- Constraining requests per minute (RPM) and tokens per minute (TPM) per key.
- Team association -- Binding keys to teams and organizations for hierarchical access and spend tracking.
- Audit and attribution -- Tracking all API usage back to specific keys, users, and teams for billing and compliance.
- Expiration and rotation -- Setting key expiration dates and supporting automatic rotation at configurable intervals.
- Permission scoping -- Defining fine-grained permissions such as allowed routes, guardrail configurations, and object-level access (vector stores, agents).
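To make budget reset periods concrete, here is a minimal sketch of parsing duration strings such as "30d" or "7d" into a reset interval. The parser name and the supported units are illustrative assumptions, not LiteLLM's actual implementation:

```python
from datetime import timedelta

# Hypothetical parser for budget reset durations like "30d", "7d", "24h".
# Illustrative sketch only; LiteLLM's real duration handling may differ.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_budget_duration(duration: str) -> timedelta:
    value, unit = duration[:-1], duration[-1]
    if unit not in _UNITS or not value.isdigit():
        raise ValueError(f"unsupported duration: {duration!r}")
    return timedelta(**{_UNITS[unit]: int(value)})
```

Once a key's accumulated spend window exceeds this interval, the proxy can reset `spend` to zero and start a new budget period.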
The key lifecycle follows a standard create-read-update-delete (CRUD) pattern, with additional operations for rotation, blocking/unblocking, and bulk management.
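A typical create operation is a JSON POST to the proxy's key-generation endpoint. The sketch below builds such a request body using the field names described in this document; the exact endpoint path and schema should be verified against the proxy's own API documentation:

```python
import json
from typing import Optional

# Sketch of a /key/generate-style request body for an LLM proxy.
# Field names mirror the key attributes described above; this is an
# assumption-laden illustration, not LiteLLM's canonical schema.
def build_key_generate_payload(
    key_alias: str,
    models: list,
    max_budget: float,
    budget_duration: str = "30d",
    team_id: Optional[str] = None,
) -> str:
    body = {
        "key_alias": key_alias,
        "models": models,
        "max_budget": max_budget,
        "budget_duration": budget_duration,
    }
    if team_id is not None:
        body["team_id"] = team_id
    return json.dumps(body)
```

In practice this body is POSTed with an admin key as a Bearer token, and the response contains the plaintext virtual key exactly once.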
Usage
Use API key management when:
- Operating an LLM proxy that serves multiple users, teams, or applications.
- Enforcing cost controls by setting per-key budget limits and tracking spend.
- Implementing role-based access control where different keys have access to different model groups.
- Meeting compliance requirements for audit trails and access logging.
- Distributing LLM access to external users or partners with scoped permissions.
- Implementing chargeback or cost attribution across organizational units.
Theoretical Basis
API key management in an LLM gateway follows the token-based access control pattern, where each key is a bearer token that encodes (by reference) the holder's permissions and constraints.
STRUCTURE APIKey:
    token: STRING            -- the secret key value (hashed at rest)
    key_alias: STRING        -- human-readable name
    user_id: STRING          -- owner of the key
    team_id: STRING          -- team association
    organization_id: STRING  -- organization association
    models: LIST[STRING]     -- allowed model names (empty = all)
    max_budget: FLOAT        -- maximum spend allowed
    spend: FLOAT             -- current accumulated spend
    budget_duration: STRING  -- reset period ("30d", "7d", etc.)
    rpm_limit: INTEGER       -- max requests per minute
    tpm_limit: INTEGER       -- max tokens per minute
    expires: DATETIME        -- key expiration time
    metadata: MAP            -- arbitrary key-value metadata
    permissions: MAP         -- fine-grained permission flags
    blocked: BOOLEAN         -- whether key is actively blocked
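The structure above can be rendered as a Python dataclass. This is a sketch for illustration; LiteLLM's actual table and model definitions may use different names and defaults:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# Python mirror of the APIKey structure described above (illustrative).
@dataclass
class APIKey:
    token: str                        # hashed secret, never stored in plaintext
    key_alias: Optional[str] = None
    user_id: Optional[str] = None
    team_id: Optional[str] = None
    organization_id: Optional[str] = None
    models: list = field(default_factory=list)   # empty list = all models
    max_budget: Optional[float] = None
    spend: float = 0.0
    budget_duration: Optional[str] = None         # e.g. "30d", "7d"
    rpm_limit: Optional[int] = None
    tpm_limit: Optional[int] = None
    expires: Optional[datetime] = None
    metadata: dict = field(default_factory=dict)
    permissions: dict = field(default_factory=dict)
    blocked: bool = False
```

Optional fields default to "no constraint", so a freshly created key is unrestricted until limits are explicitly attached.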
The key generation process follows this general algorithm:
FUNCTION generate_key(request, caller_auth):
    -- Phase 1: Authorization check
    VERIFY caller has permission to create keys
    IF request.team_id IS SET THEN
        VERIFY caller is member of team
        VERIFY team allows key creation by caller's role

    -- Phase 2: Validation
    VALIDATE budget values are non-negative
    VALIDATE required parameters are present (if configured)
    IF request.key IS SET THEN
        VALIDATE key format and uniqueness
    ELSE
        key = GENERATE_RANDOM_KEY(length=16, prefix="sk-")

    -- Phase 3: Constraint resolution
    IF request.team_id IS SET THEN
        APPLY team-level model restrictions
        APPLY team-level budget constraints
    IF request.organization_id IS SET THEN
        RESOLVE organization membership

    -- Phase 4: Persistence
    hashed_token = HASH(key)
    STORE key_record IN database WITH hashed_token

    -- Phase 5: Post-creation hooks
    TRIGGER event hooks (e.g., send invite email)
    CACHE key_record for fast lookup
    RETURN {key, expires, user_id, ...metadata}
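The persistence and validation phases can be sketched with an in-memory store. This is a minimal illustration of the hash-at-rest pattern, where only the SHA-256 digest of the key is stored and the plaintext is returned once; it is not LiteLLM's actual code, and the store, function names, and record fields are assumptions:

```python
import hashlib
import secrets
from typing import Optional

# Illustrative in-memory key store: hashed_token -> key record.
_KEY_STORE = {}

def generate_key(max_budget: Optional[float] = None,
                 models: Optional[list] = None) -> str:
    # Phase 2: validate budget values are non-negative.
    if max_budget is not None and max_budget < 0:
        raise ValueError("budget must be non-negative")
    # Generate a random key with the conventional "sk-" prefix.
    key = "sk-" + secrets.token_urlsafe(16)
    # Phase 4: persist only the hash, never the plaintext.
    hashed = hashlib.sha256(key.encode()).hexdigest()
    _KEY_STORE[hashed] = {
        "models": models or [],
        "max_budget": max_budget,
        "spend": 0.0,
        "blocked": False,
    }
    return key  # plaintext returned exactly once

def validate_key(key: str) -> dict:
    # Lookup re-hashes the presented key and compares digests.
    record = _KEY_STORE.get(hashlib.sha256(key.encode()).hexdigest())
    if record is None or record["blocked"]:
        raise PermissionError("invalid or blocked key")
    return record
```

Because only the digest is stored, a database leak does not expose usable credentials, and key validation stays a constant-time dictionary lookup.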
Key design principles:
- Principle of least privilege -- Keys are created with the minimum necessary permissions. Empty model lists default to all models, but explicit restrictions are encouraged.
- Defense in depth -- Multiple layers of validation occur: caller authorization, team membership checks, parameter validation, and custom auth hooks.
- Separation of authentication and authorization -- The key authenticates the caller; subsequent checks authorize specific actions based on key metadata.
- Budget isolation -- Key-level budgets are independent of team and organization budgets, creating a hierarchy of spend controls.
- Hash-at-rest -- Key values are stored as cryptographic hashes; the plaintext key is only returned once at creation time.
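The budget-isolation principle implies that a request must clear every level of the spend hierarchy. A minimal sketch of that check (function and scope names are hypothetical):

```python
# Hierarchical budget enforcement: a request is admitted only if it fits
# within the key, team, and organization budgets independently.
# Illustrative sketch, not LiteLLM's implementation.
def check_budgets(request_cost: float, limits: list) -> None:
    """limits: list of (scope_name, current_spend, max_budget) tuples,
    ordered key -> team -> organization."""
    for scope, spend, max_budget in limits:
        if spend + request_cost > max_budget:
            raise RuntimeError(f"{scope} budget exceeded")
```

A key with ample remaining budget is still rejected if its team or organization is over its limit, which is what makes the hierarchy an effective top-down cost control.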