Implementation:BerriAI Litellm Model Prices Database
Overview
Description
The model_prices_and_context_window.json file is the authoritative source of truth for model pricing, context window sizes, and feature capabilities across the 100+ LLM providers supported by LiteLLM. It lives at the repository root and is hosted on GitHub. LiteLLM clients fetch it remotely at runtime so that they pick up the latest pricing data, and fall back to a bundled backup copy if the fetch fails.
The file contains thousands of model entries covering providers such as OpenAI, Anthropic, Google (Vertex AI/Gemini), AWS Bedrock, Azure, Cohere, AI21, and many more. Each entry defines per-token costs, maximum token limits, operational mode (chat, embedding, image generation, etc.), and boolean capability flags.
This is the primary copy of the file. It is updated automatically by a GitHub Actions workflow (update script: .github/workflows/auto_update_price_and_context_window_file.py). A backup copy is maintained at litellm/model_prices_and_context_window_backup.json and kept in sync by CI checks.
Usage
At LiteLLM startup, the get_model_cost_map() function fetches this file from GitHub:
# Default remote URL: https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json
The fetched data is used for:
- Cost calculation -- Computing per-request costs based on token usage
- Model validation -- Checking context window limits before sending requests
- Feature detection -- Determining if a model supports function calling, vision, audio, etc.
- Provider routing -- Mapping model names to their correct provider implementations
Data Schema
Top-Level Structure
The file is a single JSON object. Each key is a model identifier and each value is a model specification object.
{
    "sample_spec": { ... },          // Schema documentation entry
    "gpt-4": {                       // Model identifier as key
        "litellm_provider": "openai",
        "input_cost_per_token": 0.00003,
        "output_cost_per_token": 0.00006,
        "max_input_tokens": 8192,
        "max_output_tokens": 8192,
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": false,
        ...
    },
    "1024-x-1024/dall-e-2": {        // Image models use resolution-prefixed keys
        "litellm_provider": "openai",
        "mode": "image_generation",
        "input_cost_per_pixel": 1.9e-08,
        ...
    },
    ...
}
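Because the file is one flat object keyed by model identifier, it can be walked like any dictionary. The following is a minimal sketch, assuming the map has already been loaded into memory; SAMPLE_MAP and count_models_by_mode are hypothetical names used only for illustration, and the entries are trimmed to the fields the example needs.

```python
from collections import Counter

# Hypothetical in-memory sample shaped like the file; not real file contents.
SAMPLE_MAP = {
    "sample_spec": {"mode": "chat"},  # schema documentation entry, not a model
    "gpt-4": {"litellm_provider": "openai", "mode": "chat"},
    "1024-x-1024/dall-e-2": {"litellm_provider": "openai", "mode": "image_generation"},
}


def count_models_by_mode(cost_map: dict) -> Counter:
    """Count model entries per mode, skipping the schema documentation entry."""
    return Counter(
        spec.get("mode", "unknown")
        for key, spec in cost_map.items()
        if key != "sample_spec"
    )


counts = count_models_by_mode(SAMPLE_MAP)
print(dict(counts))
```

Skipping "sample_spec" matters for any aggregate, since that key documents the schema rather than describing a model.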
Model Key Naming Conventions
| Pattern | Example | Description |
|---|---|---|
| {model_name} | "gpt-4" | Standard model name |
| {provider}/{model_name} | "bedrock/amazon.nova-canvas-v1:0" | Provider-prefixed model |
| {resolution}/{model_name} | "1024-x-1024/dall-e-2" | Resolution-prefixed image model |
| {resolution}/{steps}/{model} | "512-x-512/50-steps/stability.stable-diffusion-xl-v0" | Resolution and step count for diffusion models |
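The four key patterns can be told apart by inspecting the leading path segments. The sketch below is one possible parser, not LiteLLM's own logic: ParsedKey and parse_model_key are hypothetical names, and the code handles only the four patterns listed in the table. Keys in the real file may combine prefixes in ways it does not cover.

```python
import re
from typing import NamedTuple, Optional


class ParsedKey(NamedTuple):
    resolution: Optional[str]
    steps: Optional[int]
    provider: Optional[str]
    model: str


_RESOLUTION = re.compile(r"^\d+-x-\d+$")   # e.g. "1024-x-1024"
_STEPS = re.compile(r"^(\d+)-steps$")      # e.g. "50-steps"


def parse_model_key(key: str) -> ParsedKey:
    """Split a model key into resolution, step count, provider, and model name."""
    parts = key.split("/")
    resolution = steps = provider = None
    # A leading resolution segment marks an image model key.
    if len(parts) > 1 and _RESOLUTION.match(parts[0]):
        resolution = parts.pop(0)
        step_match = _STEPS.match(parts[0]) if len(parts) > 1 else None
        if step_match:
            steps = int(step_match.group(1))
            parts.pop(0)
    # Any remaining leading segment is treated as the provider prefix.
    if len(parts) > 1:
        provider = parts.pop(0)
    return ParsedKey(resolution, steps, provider, "/".join(parts))


print(parse_model_key("512-x-512/50-steps/stability.stable-diffusion-xl-v0"))
```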
Schema Fields
Core Identification
| Field | Type | Required | Description |
|---|---|---|---|
| litellm_provider | String | Yes | Provider slug matching LiteLLM's provider registry (e.g., "openai", "anthropic", "bedrock", "vertex_ai") |
| mode | String | Yes | Operational mode. One of: "chat", "embedding", "completion", "image_generation", "audio_transcription", "audio_speech", "moderation", "rerank", "search" |
Token Limits
| Field | Type | Description |
|---|---|---|
| max_input_tokens | Integer | Maximum number of input tokens the model accepts |
| max_output_tokens | Integer | Maximum number of output tokens the model can generate |
| max_tokens | Integer | LEGACY field. Set to max_output_tokens if provider specifies it, otherwise max_input_tokens |
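A consumer of these fields typically needs two things: an output limit that tolerates entries carrying only the legacy field, and a pre-flight check against the input limit. The helpers below are a minimal sketch with hypothetical names (effective_output_limit, fits_context_window). Treating a missing max_input_tokens as "no limit" is an assumption of this example, not documented LiteLLM behavior.

```python
from typing import Optional


def effective_output_limit(spec: dict) -> Optional[int]:
    """Prefer max_output_tokens; fall back to the legacy max_tokens field."""
    limit = spec.get("max_output_tokens")
    return limit if limit is not None else spec.get("max_tokens")


def fits_context_window(spec: dict, prompt_tokens: int) -> bool:
    """Return True if the prompt fits within the model's recorded input limit."""
    max_input = spec.get("max_input_tokens")
    if max_input is None:
        return True  # assumption: no recorded limit means unconstrained
    return prompt_tokens <= max_input


# Illustrative entry; values are examples, not authoritative limits.
entry = {"max_input_tokens": 8192, "max_tokens": 4096}
print(effective_output_limit(entry), fits_context_window(entry, 9000))
```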
Cost Per Token
| Field | Type | Description |
|---|---|---|
| input_cost_per_token | Float | USD cost per input token |
| output_cost_per_token | Float | USD cost per output token |
| output_cost_per_reasoning_token | Float | USD cost per reasoning/thinking token (for models like o1, o3) |
| input_cost_per_audio_token | Float | USD cost per audio input token |
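Per-token fields combine with usage counts by simple multiplication. The sketch below shows the arithmetic only. token_cost_usd is a hypothetical helper, and it assumes that reasoning tokens are reported as a subset of completion tokens and are billed at output_cost_per_reasoning_token when that field is present. LiteLLM's actual cost calculator may handle more cases.

```python
def token_cost_usd(
    spec: dict,
    prompt_tokens: int,
    completion_tokens: int,
    reasoning_tokens: int = 0,
) -> float:
    """Estimate request cost in USD from per-token rates in a model entry."""
    input_rate = spec.get("input_cost_per_token", 0.0)
    output_rate = spec.get("output_cost_per_token", 0.0)
    # Assumption: without a dedicated rate, reasoning tokens cost the output rate.
    reasoning_rate = spec.get("output_cost_per_reasoning_token", output_rate)
    regular_output = completion_tokens - reasoning_tokens
    return (
        prompt_tokens * input_rate
        + regular_output * output_rate
        + reasoning_tokens * reasoning_rate
    )


# Rates taken from the "gpt-4" example entry earlier in this page.
gpt4 = {"input_cost_per_token": 0.00003, "output_cost_per_token": 0.00006}
cost = token_cost_usd(gpt4, prompt_tokens=1000, completion_tokens=500)
print(f"{cost:.4f} USD")
```

With these rates, 1,000 prompt tokens and 500 completion tokens each contribute 0.03 USD.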
Cost Per Unit (Non-Token Models)
| Field | Type | Description |
|---|---|---|
| output_cost_per_image | Float | USD cost per generated image |
| input_cost_per_pixel | Float | USD cost per input pixel (image models) |
| output_cost_per_pixel | Float | USD cost per output pixel |
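For image models the unit of billing is the image or the pixel rather than the token. The following sketch shows how the fields could be combined. image_cost_usd is a hypothetical helper, and giving output_cost_per_image precedence over the per-pixel fields is an assumption made for this example.

```python
def image_cost_usd(spec: dict, width: int, height: int, n_images: int = 1) -> float:
    """Estimate image generation cost in USD from per-image or per-pixel rates."""
    # Assumption: a flat per-image rate, when present, overrides per-pixel rates.
    if "output_cost_per_image" in spec:
        return spec["output_cost_per_image"] * n_images
    per_pixel = spec.get("input_cost_per_pixel", 0.0) + spec.get("output_cost_per_pixel", 0.0)
    return per_pixel * width * height * n_images


# Rate taken from the "1024-x-1024/dall-e-2" example entry earlier in this page.
dalle2 = {"input_cost_per_pixel": 1.9e-08}
image_cost = image_cost_usd(dalle2, width=1024, height=1024)
print(f"{image_cost:.6f} USD")
```

A 1024 x 1024 image has 1,048,576 pixels, so the per-pixel rate of 1.9e-08 works out to roughly 0.02 USD per image.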
Platform Feature Costs
| Field | Type | Description |
|---|---|---|
| code_interpreter_cost_per_session | Float | Cost per code interpreter session |
| file_search_cost_per_1k_calls | Float | Cost per 1,000 file search API calls |
| file_search_cost_per_gb_per_day | Float | File search storage cost per GB per day |
| vector_store_cost_per_gb_per_day | Float | Vector store storage cost per GB per day |
| computer_use_input_cost_per_1k_tokens | Float | Cost per 1K tokens for computer use input |
| computer_use_output_cost_per_1k_tokens | Float | Cost per 1K tokens for computer use output |
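Several of these fields are quoted per 1,000 units or per GB per day, so they need scaling before they can be added to a request cost. A minimal sketch of that unit handling follows. file_search_cost_usd is a hypothetical helper and the rates in the example are made-up placeholders, not real prices.

```python
def file_search_cost_usd(spec: dict, calls: int, stored_gb: float, days: float) -> float:
    """Combine per-1K-call and per-GB-per-day file search rates into one USD figure."""
    per_call = spec.get("file_search_cost_per_1k_calls", 0.0) / 1000
    storage = spec.get("file_search_cost_per_gb_per_day", 0.0) * stored_gb * days
    return per_call * calls + storage


# Placeholder rates for illustration only.
placeholder = {
    "file_search_cost_per_1k_calls": 2.5,
    "file_search_cost_per_gb_per_day": 0.1,
}
fs_cost = file_search_cost_usd(placeholder, calls=400, stored_gb=2, days=5)
print(f"{fs_cost:.2f} USD")
```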
Search Context Costs
| Field | Type | Description |
|---|---|---|
| search_context_cost_per_query | Object | Tiered pricing for search context |
| .search_context_size_low | Float | Cost for low-sized search context |
| .search_context_size_medium | Float | Cost for medium-sized search context |
| .search_context_size_high | Float | Cost for high-sized search context |
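Because search_context_cost_per_query is a nested object, a lookup has to go one level deeper than for the flat cost fields. The sketch below illustrates this with a hypothetical helper, search_context_cost_usd. The tier values are placeholders, and returning 0.0 for a missing tier is an assumption of this example.

```python
def search_context_cost_usd(spec: dict, size: str = "medium") -> float:
    """Look up the per-query search cost for a context size tier."""
    tiers = spec.get("search_context_cost_per_query") or {}
    # Assumption: an absent tier is treated as free rather than as an error.
    return tiers.get(f"search_context_size_{size}", 0.0)


# Placeholder tier prices for illustration only.
search_entry = {
    "search_context_cost_per_query": {
        "search_context_size_low": 0.03,
        "search_context_size_medium": 0.035,
        "search_context_size_high": 0.05,
    }
}
print(search_context_cost_usd(search_entry, "high"))
```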
Capability Flags
| Field | Type | Description |
|---|---|---|
| supports_function_calling | Boolean | Model supports function/tool calling |
| supports_parallel_function_calling | Boolean | Model supports parallel function execution |
| supports_vision | Boolean | Model accepts image inputs |
| supports_audio_input | Boolean | Model accepts audio inputs |
| supports_audio_output | Boolean | Model produces audio outputs |
| supports_prompt_caching | Boolean | Model supports prompt prefix caching |
| supports_reasoning | Boolean | Model supports extended reasoning/thinking |
| supports_response_schema | Boolean | Model supports structured JSON response schemas |
| supports_system_messages | Boolean | Model supports system-level messages |
| supports_web_search | Boolean | Model supports integrated web search |
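Capability flags lend themselves to filtering the map for models that meet a set of requirements. The following is a minimal sketch with a hypothetical helper, models_supporting. It assumes that a flag missing from an entry means the capability is unsupported. The sample entries are invented for illustration.

```python
def models_supporting(cost_map: dict, *flags: str) -> list:
    """Return sorted model keys whose entries set every requested flag to true."""
    return sorted(
        key
        for key, spec in cost_map.items()
        if key != "sample_spec"
        # Assumption: a missing flag counts as "not supported".
        and all(spec.get(flag) is True for flag in flags)
    )


# Invented entries for illustration only.
capability_map = {
    "sample_spec": {"supports_vision": True},
    "model-a": {"supports_function_calling": True, "supports_vision": True},
    "model-b": {"supports_function_calling": True, "supports_vision": False},
    "model-c": {},
}
print(models_supporting(capability_map, "supports_function_calling", "supports_vision"))
```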
Metadata Fields
| Field | Type | Description |
|---|---|---|
| deprecation_date | String | Date model becomes deprecated (YYYY-MM-DD format) |
| supported_regions | Array[String] | Regions where the model is available (e.g., "us-west-2", "eu-west-1", "global") |
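Since deprecation_date uses the YYYY-MM-DD format, it can be parsed with the standard library and compared against the current date. The sketch below uses a hypothetical helper, is_deprecated, and assumes a model counts as deprecated on and after the listed date. The date in the example is a placeholder.

```python
from datetime import date
from typing import Optional


def is_deprecated(spec: dict, today: Optional[date] = None) -> bool:
    """Return True if the entry's deprecation_date is today or in the past."""
    raw = spec.get("deprecation_date")
    if not raw:
        return False  # no date recorded, so the model is not marked deprecated
    today = today or date.today()
    return today >= date.fromisoformat(raw)


# Placeholder date for illustration only.
old_entry = {"deprecation_date": "2025-06-06"}
print(is_deprecated(old_entry, today=date(2025, 6, 6)))
```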
Usage Examples
Remote Fetch at Startup
# From litellm/litellm_core_utils/get_model_cost_map.py:
def get_model_cost_map(url: str) -> dict:
    """
    1. If LITELLM_LOCAL_MODEL_COST_MAP is set, uses local backup only.
    2. Otherwise fetches from url, validates integrity, and falls back
       to local backup on any failure.
    """
    if os.getenv("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return GetModelCostMap.load_local_model_cost_map()
    try:
        content = GetModelCostMap.fetch_remote_model_cost_map(url)
    except Exception:
        return GetModelCostMap.load_local_model_cost_map()
    if not GetModelCostMap.validate_model_cost_map(
        fetched_map=content,
        backup_model_count=GetModelCostMap._get_backup_model_count(),
    ):
        return GetModelCostMap.load_local_model_cost_map()
    return content
Registering a Custom Model Cost URL
Users can point LiteLLM to a custom model cost file:
# From tests/local_testing/test_register_model.py:
import litellm

litellm.register_model(
    model_cost="https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
)
Validation in Unit Tests
# From tests/test_litellm/test_utils.py:
def test_aaamodel_prices_and_context_window_json_is_valid():
    """Validates the model_prices_and_context_window.json file."""
    prod_json = "./model_prices_and_context_window.json"
    # Validates JSON structure, field types, required fields, etc.
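To give a sense of what such validation involves, the sketch below checks a loaded map against the rules stated in the Schema Fields tables: required string fields, integer token limits, and boolean capability flags. It is not the logic of the LiteLLM test. find_schema_problems and the sample entries are hypothetical.

```python
# Hypothetical checker; rules come from the schema tables in this page,
# not from LiteLLM's actual test suite.
REQUIRED_STRING_FIELDS = ("litellm_provider", "mode")
TOKEN_LIMIT_FIELDS = ("max_tokens", "max_input_tokens", "max_output_tokens")


def find_schema_problems(cost_map: dict) -> list:
    """Return human-readable problems found in a model cost map."""
    problems = []
    for key, spec in cost_map.items():
        if key == "sample_spec":
            continue  # documentation entry, exempt from type checks
        for field in REQUIRED_STRING_FIELDS:
            if not isinstance(spec.get(field), str):
                problems.append(f"{key}: {field} must be a string")
        for field in TOKEN_LIMIT_FIELDS:
            value = spec.get(field)
            # bool is a subclass of int in Python, so reject it explicitly.
            if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
                problems.append(f"{key}: {field} must be an integer")
        for field, value in spec.items():
            if field.startswith("supports_") and not isinstance(value, bool):
                problems.append(f"{key}: {field} must be a boolean")
    return problems


good = {"gpt-4": {"litellm_provider": "openai", "mode": "chat", "max_tokens": 4096}}
bad = {"bad-model": {"mode": "chat", "max_tokens": "4096", "supports_vision": "yes"}}
print(find_schema_problems(good))
print(find_schema_problems(bad))
```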
Auto-Update Workflow
A GitHub Actions workflow automatically updates the file with the latest pricing from providers:
# From .github/workflows/auto_update_price_and_context_window_file.py:
local_file_path = "model_prices_and_context_window.json"
# Fetches latest pricing data and updates the file
Related Pages
- Model Prices Backup - Bundled backup copy included in the Python package
- CircleCI Config - CI pipeline that validates both copies stay in sync
- Provider Endpoints Support - Provider capability matrix (complementary data)