Implementation:BerriAI Litellm Model Prices Database

Overview

Description

The model_prices_and_context_window.json file is the source of truth for model pricing, context window sizes, and feature capabilities across the 100+ LLM providers supported by LiteLLM. It lives at the repository root, is hosted on GitHub, and is fetched at runtime by LiteLLM clients so that they pick up the latest pricing data.

The file contains thousands of model entries covering providers such as OpenAI, Anthropic, Google (Vertex AI/Gemini), AWS Bedrock, Azure, Cohere, AI21, and many more. Each entry defines per-token costs, maximum token limits, operational mode (chat, embedding, image generation, etc.), and boolean capability flags.

This is the primary copy of the data. It is updated automatically by a script run from GitHub Actions (.github/workflows/auto_update_price_and_context_window_file.py). A backup copy at litellm/model_prices_and_context_window_backup.json is bundled with the Python package and kept in sync via CI checks.

Usage

At LiteLLM startup, the get_model_cost_map() function fetches this file from GitHub:

# Default remote URL:
https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

The fetched data is used for:

Cost calculation: computing per-request costs from token usage
Model validation: checking context window limits before sending requests
Feature detection: determining whether a model supports function calling, vision, audio, etc.
Provider routing: mapping model names to the correct provider implementation

Data Schema

Top-Level Structure

The file is a single JSON object. Each key is a model identifier and each value is a model specification object.

{
    "sample_spec": { ... },                    // Schema documentation entry
    "gpt-4": {                                 // Model identifier as key
        "litellm_provider": "openai",
        "input_cost_per_token": 0.00003,
        "output_cost_per_token": 0.00006,
        "max_input_tokens": 8192,
        "max_output_tokens": 4096,
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": false,
        ...
    },
    "1024-x-1024/dall-e-2": {                  // Image models use resolution-prefixed keys
        "litellm_provider": "openai",
        "mode": "image_generation",
        "input_cost_per_pixel": 1.9e-08,
        ...
    },
    ...
}
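
Because the file is one flat JSON object, a consumer can load it and iterate over the keys directly. The sketch below uses a small inline sample shaped like the excerpt above rather than the real file; `models_by_provider` is a hypothetical helper, and the `sample_spec` value shown is a placeholder, not the real documentation entry.

```python
import json
from collections import defaultdict

# Inline sample shaped like the real file. The "sample_spec" body is a placeholder.
SAMPLE = """
{
    "sample_spec": {"mode": "placeholder documentation entry"},
    "gpt-4": {"litellm_provider": "openai", "mode": "chat",
              "input_cost_per_token": 0.00003},
    "1024-x-1024/dall-e-2": {"litellm_provider": "openai",
                             "mode": "image_generation",
                             "input_cost_per_pixel": 1.9e-08}
}
"""


def models_by_provider(cost_map: dict) -> dict:
    """Group model keys by litellm_provider, skipping the sample_spec entry."""
    grouped = defaultdict(list)
    for key, spec in cost_map.items():
        if key == "sample_spec":  # documentation entry, not a real model
            continue
        grouped[spec.get("litellm_provider", "unknown")].append(key)
    return dict(grouped)


cost_map = json.loads(SAMPLE)
grouped = models_by_provider(cost_map)
print(grouped)
```

Skipping `sample_spec` explicitly matters for any aggregate (model counts, price averages), since that key documents the schema rather than describing a model.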

Model Key Naming Conventions

Pattern	Example	Description
`{model_name}`	`"gpt-4"`	Standard model name
`{provider}/{model_name}`	`"bedrock/amazon.nova-canvas-v1:0"`	Provider-prefixed model
`{resolution}/{model_name}`	`"1024-x-1024/dall-e-2"`	Resolution-prefixed image model
`{resolution}/{steps}/{model}`	`"512-x-512/50-steps/stability.stable-diffusion-xl-v0"`	Resolution and step count for diffusion models
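
The four patterns in the table can be told apart by inspecting the slash-separated segments of a key. The following is a minimal sketch that handles only those four patterns; `parse_model_key` and `ParsedKey` are hypothetical names, and any multi-segment key that does not start with a resolution is simply treated as provider-prefixed.

```python
import re
from typing import NamedTuple, Optional

_RESOLUTION = re.compile(r"^\d+-x-\d+$")   # e.g. "1024-x-1024"
_STEPS = re.compile(r"^\d+-steps$")        # e.g. "50-steps"


class ParsedKey(NamedTuple):
    resolution: Optional[str]
    steps: Optional[str]
    provider: Optional[str]
    model: str


def parse_model_key(key: str) -> ParsedKey:
    """Split a model key into the parts named in the table above."""
    parts = key.split("/")
    resolution = steps = provider = None
    if len(parts) > 1 and _RESOLUTION.match(parts[0]):
        resolution = parts.pop(0)
        # An optional step-count segment may follow the resolution.
        if len(parts) > 1 and _STEPS.match(parts[0]):
            steps = parts.pop(0)
    elif len(parts) > 1:
        provider = parts.pop(0)
    return ParsedKey(resolution, steps, provider, "/".join(parts))


for k in ("gpt-4",
          "bedrock/amazon.nova-canvas-v1:0",
          "1024-x-1024/dall-e-2",
          "512-x-512/50-steps/stability.stable-diffusion-xl-v0"):
    print(parse_model_key(k))
```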

Schema Fields

Core Identification

Field	Type	Required	Description
`litellm_provider`	String	Yes	Provider slug matching LiteLLM's provider registry (e.g., `"openai"`, `"anthropic"`, `"bedrock"`, `"vertex_ai"`)
`mode`	String	Yes	Operational mode. One of: `"chat"`, `"embedding"`, `"completion"`, `"image_generation"`, `"audio_transcription"`, `"audio_speech"`, `"moderation"`, `"rerank"`, `"search"`

Token Limits

Field	Type	Description
`max_input_tokens`	Integer	Maximum number of input tokens the model accepts
`max_output_tokens`	Integer	Maximum number of output tokens the model can generate
`max_tokens`	Integer	LEGACY field. Set to `max_output_tokens` if provider specifies it, otherwise `max_input_tokens`
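
The rule for the legacy `max_tokens` field and a basic context-window check can be written out as follows. This is a sketch under the definitions in the table only; `legacy_max_tokens` and `fits_context` are hypothetical helpers, and the limits in the sample entry are illustrative.

```python
from typing import Optional


def legacy_max_tokens(spec: dict) -> Optional[int]:
    """Derive max_tokens as described above: output limit if given, else input limit."""
    if spec.get("max_output_tokens") is not None:
        return spec["max_output_tokens"]
    return spec.get("max_input_tokens")


def fits_context(spec: dict, prompt_tokens: int) -> bool:
    """Return True if the prompt fits, or if no input limit is recorded."""
    limit = spec.get("max_input_tokens")
    return limit is None or prompt_tokens <= limit


# Illustrative entry (values are assumptions for the example).
entry = {"max_input_tokens": 8192, "max_output_tokens": 4096}
print(legacy_max_tokens(entry))
print(fits_context(entry, 8000), fits_context(entry, 9000))
```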

Cost Per Token

Field	Type	Description
`input_cost_per_token`	Float	USD cost per input token
`output_cost_per_token`	Float	USD cost per output token
`output_cost_per_reasoning_token`	Float	USD cost per reasoning/thinking token (for models like o1, o3)
`input_cost_per_audio_token`	Float	USD cost per audio input token
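
Since costs are stored per single token, a request's cost is a weighted sum of its token counts. The sketch below covers only the two basic fields and treats a missing field as zero cost, which is an assumption of this example; `token_cost` is a hypothetical helper, not LiteLLM's cost calculator.

```python
def token_cost(spec: dict, prompt_tokens: int, completion_tokens: int) -> float:
    """USD cost of one request from input and output token counts."""
    input_cost = prompt_tokens * spec.get("input_cost_per_token", 0.0)
    output_cost = completion_tokens * spec.get("output_cost_per_token", 0.0)
    return input_cost + output_cost


# Prices taken from the gpt-4 excerpt in this page.
gpt4 = {"input_cost_per_token": 0.00003, "output_cost_per_token": 0.00006}

# 1,000 prompt tokens and 500 completion tokens: roughly 0.03 + 0.03 USD.
cost = token_cost(gpt4, prompt_tokens=1000, completion_tokens=500)
print(f"${cost:.4f}")
```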

Cost Per Unit (Non-Token Models)

Field	Type	Description
`output_cost_per_image`	Float	USD cost per generated image
`input_cost_per_pixel`	Float	USD cost per input pixel (image models)
`output_cost_per_pixel`	Float	USD cost per output pixel
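
For image models the unit is an image or a pixel rather than a token. The sketch below shows one plausible way to combine these fields: it prefers a flat per-image price when present and otherwise multiplies the per-pixel price by the image area. That precedence is an assumption of this example, and `image_cost` is a hypothetical helper.

```python
def image_cost(spec: dict, width: int, height: int, n: int = 1) -> float:
    """USD cost of generating n images of the given size."""
    if "output_cost_per_image" in spec:
        # Flat price per generated image (assumed to take precedence).
        return spec["output_cost_per_image"] * n
    per_pixel = spec.get("input_cost_per_pixel", 0.0) + spec.get("output_cost_per_pixel", 0.0)
    return per_pixel * width * height * n


# Per-pixel price taken from the dall-e-2 excerpt in this page.
dalle2 = {"input_cost_per_pixel": 1.9e-08}
one_image = image_cost(dalle2, 1024, 1024)
print(f"${one_image:.4f}")
```

At 1024 x 1024 pixels this works out to just under two cents per image.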

Platform Feature Costs

Field	Type	Description
`code_interpreter_cost_per_session`	Float	Cost per code interpreter session
`file_search_cost_per_1k_calls`	Float	Cost per 1,000 file search API calls
`file_search_cost_per_gb_per_day`	Float	File search storage cost per GB per day
`vector_store_cost_per_gb_per_day`	Float	Vector store storage cost per GB per day
`computer_use_input_cost_per_1k_tokens`	Float	Cost per 1K tokens for computer use input
`computer_use_output_cost_per_1k_tokens`	Float	Cost per 1K tokens for computer use output

Search Context Costs

Field	Type	Description
`search_context_cost_per_query`	Object	Tiered pricing for search context
`.search_context_size_low`	Float	Cost per query at the low search context size
`.search_context_size_medium`	Float	Cost per query at the medium search context size
`.search_context_size_high`	Float	Cost per query at the high search context size
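
Looking up a tier is a nested dictionary access. In the sketch below the prices are made-up illustrative numbers, `search_context_cost` is a hypothetical helper, and defaulting to the medium tier is an assumption of this example.

```python
def search_context_cost(spec: dict, size: str = "medium") -> float:
    """Per-query search cost for the given context size ("low", "medium", "high")."""
    tiers = spec.get("search_context_cost_per_query") or {}
    return tiers.get(f"search_context_size_{size}", 0.0)


# Illustrative prices only; not taken from the real file.
entry = {
    "search_context_cost_per_query": {
        "search_context_size_low": 0.03,
        "search_context_size_medium": 0.035,
        "search_context_size_high": 0.05,
    }
}
print(search_context_cost(entry, "high"))
```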

Capability Flags

Field	Type	Description
`supports_function_calling`	Boolean	Model supports function/tool calling
`supports_parallel_function_calling`	Boolean	Model supports parallel function execution
`supports_vision`	Boolean	Model accepts image inputs
`supports_audio_input`	Boolean	Model accepts audio inputs
`supports_audio_output`	Boolean	Model produces audio outputs
`supports_prompt_caching`	Boolean	Model supports prompt prefix caching
`supports_reasoning`	Boolean	Model supports extended reasoning/thinking
`supports_response_schema`	Boolean	Model supports structured JSON response schemas
`supports_system_messages`	Boolean	Model supports system-level messages
`supports_web_search`	Boolean	Model supports integrated web search
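
Capability flags make feature detection a simple filter over the map. The sketch below treats a missing flag as unsupported, which is an assumption of this example; `models_supporting` and the model names are hypothetical.

```python
def models_supporting(cost_map: dict, *flags: str) -> list:
    """Return sorted model keys whose entries set every given flag to true."""
    return sorted(
        key
        for key, spec in cost_map.items()
        if all(spec.get(flag) is True for flag in flags)
    )


# Hypothetical entries for illustration.
cost_map = {
    "model-a": {"supports_function_calling": True, "supports_vision": True},
    "model-b": {"supports_function_calling": True, "supports_vision": False},
    "model-c": {"supports_function_calling": True},
}
print(models_supporting(cost_map, "supports_function_calling", "supports_vision"))
```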

Metadata Fields

Field	Type	Description
`deprecation_date`	String	Date on which the model is deprecated (`YYYY-MM-DD` format)
`supported_regions`	Array[String]	Regions where the model is available (e.g., `"us-west-2"`, `"eu-west-1"`, `"global"`)
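
Because `deprecation_date` is an ISO-style date string, it can be parsed with the standard library and compared against the current date. The sketch below counts a model as deprecated on the stated date itself, which is an assumption of this example; `is_deprecated` is a hypothetical helper.

```python
from datetime import date
from typing import Optional


def is_deprecated(spec: dict, today: Optional[date] = None) -> bool:
    """True if the entry has a deprecation_date on or before `today`."""
    raw = spec.get("deprecation_date")
    if not raw:
        return False  # no date recorded, so nothing to report
    return (today or date.today()) >= date.fromisoformat(raw)


# Hypothetical entry for illustration.
entry = {"deprecation_date": "2025-01-01"}
print(is_deprecated(entry, today=date(2025, 6, 1)))
```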

Usage Examples

Remote Fetch at Startup

# From litellm/litellm_core_utils/get_model_cost_map.py
# (excerpt; GetModelCostMap is a helper class assumed to live in the same module):
import os

def get_model_cost_map(url: str) -> dict:
    """
    1. If LITELLM_LOCAL_MODEL_COST_MAP is set, uses local backup only.
    2. Otherwise fetches from url, validates integrity, and falls back
       to local backup on any failure.
    """
    if os.getenv("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return GetModelCostMap.load_local_model_cost_map()

    try:
        content = GetModelCostMap.fetch_remote_model_cost_map(url)
    except Exception:
        return GetModelCostMap.load_local_model_cost_map()

    if not GetModelCostMap.validate_model_cost_map(
        fetched_map=content,
        backup_model_count=GetModelCostMap._get_backup_model_count(),
    ):
        return GetModelCostMap.load_local_model_cost_map()

    return content
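
The control flow above (try the remote copy, validate it, fall back to the bundled backup) can be exercised in isolation. The sketch below is not LiteLLM's implementation: `load_cost_map`, its injected callables, and the `min_ratio` size check are hypothetical stand-ins for the fetch, backup, and integrity steps.

```python
from typing import Callable


def load_cost_map(
    fetch_remote: Callable[[], dict],
    load_backup: Callable[[], dict],
    min_ratio: float = 0.5,  # assumed threshold, not LiteLLM's actual rule
) -> dict:
    """Return the remote map if it looks sane, otherwise the local backup."""
    backup = load_backup()
    try:
        fetched = fetch_remote()
    except Exception:
        return backup  # network or parsing failure
    # Hypothetical integrity check: must be a dict and not far smaller than the backup.
    if not isinstance(fetched, dict) or len(fetched) < min_ratio * len(backup):
        return backup
    return fetched


def failing_fetch() -> dict:
    raise ConnectionError("simulated network failure")


backup = {"model-a": {}, "model-b": {}, "model-c": {}, "model-d": {}}
remote = {**backup, "model-e": {}}

print(len(load_cost_map(lambda: remote, lambda: backup)))
print(len(load_cost_map(failing_fetch, lambda: backup)))
```

Injecting the fetch and backup loaders as callables keeps the fallback logic testable without network access.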

Registering a Custom Model Cost URL

Users can point LiteLLM to a custom model cost file:

# From tests/local_testing/test_register_model.py:
import litellm

litellm.register_model(
    model_cost="https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
)

Validation in Unit Tests

# From tests/test_litellm/test_utils.py:
def test_aaamodel_prices_and_context_window_json_is_valid():
    """Validates the model_prices_and_context_window.json file."""
    prod_json = "./model_prices_and_context_window.json"
    # Validates JSON structure, field types, required fields, etc.

Auto-Update Workflow

A GitHub Actions workflow automatically updates the file with the latest pricing from providers:

# From .github/workflows/auto_update_price_and_context_window_file.py:
local_file_path = "model_prices_and_context_window.json"
# Fetches latest pricing data and updates the file

Related Pages

Model Prices Backup - Bundled backup copy included in the Python package
CircleCI Config - CI pipeline that validates both copies stay in sync
Provider Endpoints Support - Provider capability matrix (complementary data)
