Implementation:BerriAI Litellm Model Prices Database

Overview

Description

The model_prices_and_context_window.json file is LiteLLM's source of truth for model pricing, context window sizes, and feature capabilities across the 100+ LLM providers it supports. The file lives at the repository root and is hosted on GitHub. By default, LiteLLM clients fetch it remotely at startup so that they use the latest pricing data, and they fall back to a local backup copy if the fetch fails.

The file contains thousands of model entries covering providers such as OpenAI, Anthropic, Google (Vertex AI/Gemini), AWS Bedrock, Azure, Cohere, AI21, and many more. Each entry defines per-token costs, maximum token limits, operational mode (chat, embedding, image generation, etc.), and boolean capability flags.

This primary file is updated automatically by a script, .github/workflows/auto_update_price_and_context_window_file.py, that is run from a GitHub Actions workflow. A backup copy at litellm/model_prices_and_context_window_backup.json is kept in sync via CI checks.

Usage

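At LiteLLM startup, the get_model_cost_map() function fetches this file from GitHub (or loads the local backup when the LITELLM_LOCAL_MODEL_COST_MAP environment variable is set to true):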

# Default remote URL:
https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json

The fetched data is used for:

  • Cost calculation -- Computing per-request costs based on token usage
  • Model validation -- Checking context window limits before sending requests
  • Feature detection -- Determining if a model supports function calling, vision, audio, etc.
  • Provider routing -- Mapping model names to their correct provider implementations
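
As an illustration of the cost calculation item, the following is a minimal sketch that uses only the two basic per-token fields. The function name estimate_cost and the token counts are hypothetical; the prices are the ones shown for "gpt-4" in the schema example on this page. LiteLLM's own cost logic handles many more fields, so this should not be read as its implementation.

```python
# Hypothetical entry; prices copied from the "gpt-4" example on this page.
entry = {
    "input_cost_per_token": 0.00003,
    "output_cost_per_token": 0.00006,
}


def estimate_cost(entry: dict, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one request from per-token prices only."""
    return (
        prompt_tokens * entry.get("input_cost_per_token", 0.0)
        + completion_tokens * entry.get("output_cost_per_token", 0.0)
    )


cost = estimate_cost(entry, prompt_tokens=1000, completion_tokens=500)
print(f"{cost:.4f}")  # prints 0.0600
```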

Data Schema

Top-Level Structure

The file is a single JSON object. Each key is a model identifier and each value is a model specification object.

{
    "sample_spec": { ... },                    // Schema documentation entry
    "gpt-4": {                                 // Model identifier as key
        "litellm_provider": "openai",
        "input_cost_per_token": 0.00003,
        "output_cost_per_token": 0.00006,
        "max_input_tokens": 8192,
        "max_output_tokens": 8192,
        "mode": "chat",
        "supports_function_calling": true,
        "supports_vision": false,
        ...
    },
    "1024-x-1024/dall-e-2": {                  // Image models use resolution-prefixed keys
        "litellm_provider": "openai",
        "mode": "image_generation",
        "input_cost_per_pixel": 1.9e-08,
        ...
    },
    ...
}
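
The structure above can be explored with the standard json module. This sketch parses a tiny inline stand-in for the file (the real file is plain JSON, so the // annotations in the listing above would not appear in it) and skips the "sample_spec" documentation entry. The variable names are illustrative.

```python
import json

# Tiny stand-in for the real file, shaped like the example above.
raw = """
{
    "sample_spec": {"mode": "documentation entry, not a model"},
    "gpt-4": {"litellm_provider": "openai", "mode": "chat"},
    "1024-x-1024/dall-e-2": {"litellm_provider": "openai", "mode": "image_generation"}
}
"""
cost_map = json.loads(raw)

# "sample_spec" documents the schema rather than a model, so drop it.
models = {key: spec for key, spec in cost_map.items() if key != "sample_spec"}

# Select model identifiers by operational mode.
chat_models = [key for key, spec in models.items() if spec.get("mode") == "chat"]
print(chat_models)  # prints ['gpt-4']
```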

Model Key Naming Conventions

Pattern | Example | Description
{model_name} | "gpt-4" | Standard model name
{provider}/{model_name} | "bedrock/amazon.nova-canvas-v1:0" | Provider-prefixed model
{resolution}/{model_name} | "1024-x-1024/dall-e-2" | Resolution-prefixed image model
{resolution}/{steps}/{model} | "512-x-512/50-steps/stability.stable-diffusion-xl-v0" | Resolution and step count for diffusion models
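
A key that follows these patterns can be split into its parts with a small helper. The function split_model_key and its regular expressions are hypothetical and only cover the four patterns listed above; LiteLLM may resolve keys differently.

```python
import re

RESOLUTION = re.compile(r"^\d+-x-\d+$")  # e.g. "1024-x-1024"
STEPS = re.compile(r"^\d+-steps$")       # e.g. "50-steps"


def split_model_key(key: str) -> dict:
    """Split a model key into resolution, steps, provider, and model parts."""
    parts = key.split("/")
    info = {"resolution": None, "steps": None, "provider": None}
    if RESOLUTION.match(parts[0]):
        info["resolution"] = parts.pop(0)
        # A step count may follow the resolution for diffusion models.
        if len(parts) > 1 and STEPS.match(parts[0]):
            info["steps"] = parts.pop(0)
    elif len(parts) > 1:
        # Any other leading segment is treated as a provider prefix.
        info["provider"] = parts.pop(0)
    info["model"] = "/".join(parts)
    return info


print(split_model_key("512-x-512/50-steps/stability.stable-diffusion-xl-v0"))
```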

Schema Fields

Core Identification

Field | Type | Required | Description
litellm_provider | String | Yes | Provider slug matching LiteLLM's provider registry (e.g., "openai", "anthropic", "bedrock", "vertex_ai")
mode | String | Yes | Operational mode. One of: "chat", "embedding", "completion", "image_generation", "audio_transcription", "audio_speech", "moderation", "rerank", "search"

Token Limits

Field | Type | Description
max_input_tokens | Integer | Maximum number of input tokens the model accepts
max_output_tokens | Integer | Maximum number of output tokens the model can generate
max_tokens | Integer | Legacy field. Set to max_output_tokens if the provider specifies an output limit, otherwise to max_input_tokens
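
A pre-flight context window check based on these fields might look like the sketch below. The helper names are hypothetical. Falling back to the legacy max_tokens field is an assumption made for illustration, and it is only approximate because that field may hold the output limit.

```python
from typing import Optional


def input_limit(entry: dict) -> Optional[int]:
    """Prefer max_input_tokens; fall back to the legacy max_tokens field."""
    return entry.get("max_input_tokens") or entry.get("max_tokens")


def fits_context(entry: dict, prompt_tokens: int) -> bool:
    """Return True when the prompt fits, or when no limit is recorded."""
    limit = input_limit(entry)
    return limit is None or prompt_tokens <= limit


gpt4 = {"max_input_tokens": 8192, "max_output_tokens": 8192}
print(fits_context(gpt4, 8000), fits_context(gpt4, 9000))  # prints True False
```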

Cost Per Token

Field | Type | Description
input_cost_per_token | Float | USD cost per input token
output_cost_per_token | Float | USD cost per output token
output_cost_per_reasoning_token | Float | USD cost per reasoning/thinking token (for models like o1, o3)
input_cost_per_audio_token | Float | USD cost per audio input token

Cost Per Unit (Non-Token Models)

Field | Type | Description
output_cost_per_image | Float | USD cost per generated image
input_cost_per_pixel | Float | USD cost per input pixel (image models)
output_cost_per_pixel | Float | USD cost per output pixel
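
For pixel-priced image models, a cost estimate can be sketched as width times height times the per-pixel price, multiplied by the number of images. That formula and the function name image_cost_from_pixels are assumptions for illustration; the price is the one shown for "1024-x-1024/dall-e-2" earlier on this page.

```python
def image_cost_from_pixels(entry: dict, width: int, height: int, n: int = 1) -> float:
    """Estimate USD cost for n images of the given size from a per-pixel price."""
    return width * height * entry.get("input_cost_per_pixel", 0.0) * n


dalle2 = {"mode": "image_generation", "input_cost_per_pixel": 1.9e-08}
image_cost = image_cost_from_pixels(dalle2, width=1024, height=1024)
print(f"{image_cost:.6f}")
```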

Platform Feature Costs

Field | Type | Description
code_interpreter_cost_per_session | Float | Cost per code interpreter session
file_search_cost_per_1k_calls | Float | Cost per 1,000 file search API calls
file_search_cost_per_gb_per_day | Float | File search storage cost per GB per day
vector_store_cost_per_gb_per_day | Float | Vector store storage cost per GB per day
computer_use_input_cost_per_1k_tokens | Float | Cost per 1K tokens for computer use input
computer_use_output_cost_per_1k_tokens | Float | Cost per 1K tokens for computer use output

Search Context Costs

Field | Type | Description
search_context_cost_per_query | Object | Tiered per-query pricing, keyed by search context size
.search_context_size_low | Float | Cost per query with a low search context size
.search_context_size_medium | Float | Cost per query with a medium search context size
.search_context_size_high | Float | Cost per query with a high search context size
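
Looking up a tier in this nested object can be sketched as follows. The helper search_query_cost and the prices in the sample entry are made up for illustration and are not taken from the real file.

```python
def search_query_cost(entry: dict, size: str = "medium") -> float:
    """Return the per-query search cost for a tier: 'low', 'medium', or 'high'."""
    tiers = entry.get("search_context_cost_per_query", {})
    return tiers.get(f"search_context_size_{size}", 0.0)


# Hypothetical entry with invented prices.
sample = {
    "search_context_cost_per_query": {
        "search_context_size_low": 0.01,
        "search_context_size_medium": 0.02,
        "search_context_size_high": 0.03,
    }
}
print(search_query_cost(sample, "high"))  # prints 0.03
```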

Capability Flags

Field | Type | Description
supports_function_calling | Boolean | Model supports function/tool calling
supports_parallel_function_calling | Boolean | Model supports parallel function execution
supports_vision | Boolean | Model accepts image inputs
supports_audio_input | Boolean | Model accepts audio inputs
supports_audio_output | Boolean | Model produces audio outputs
supports_prompt_caching | Boolean | Model supports prompt prefix caching
supports_reasoning | Boolean | Model supports extended reasoning/thinking
supports_response_schema | Boolean | Model supports structured JSON response schemas
supports_system_messages | Boolean | Model supports system-level messages
supports_web_search | Boolean | Model supports integrated web search
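
Feature detection from these flags can be sketched with a small helper that treats a missing flag as unsupported. That default, and the helper name supports, are assumptions made for this example. The flag values are taken from the "gpt-4" example on this page.

```python
def supports(entry: dict, feature: str) -> bool:
    """Return True only if the supports_<feature> flag is present and true."""
    return entry.get(f"supports_{feature}") is True


gpt4_flags = {"supports_function_calling": True, "supports_vision": False}
print(supports(gpt4_flags, "function_calling"))  # prints True
print(supports(gpt4_flags, "web_search"))        # prints False (flag absent)
```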

Metadata Fields

Field | Type | Description
deprecation_date | String | Date the model becomes deprecated (YYYY-MM-DD format)
supported_regions | Array[String] | Regions where the model is available (e.g., "us-west-2", "eu-west-1", "global")
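
Because deprecation_date uses the YYYY-MM-DD format, it can be parsed with date.fromisoformat from the standard library. The helper below and the dates in the sample entry are hypothetical.

```python
from datetime import date


def is_deprecated(entry: dict, today: date) -> bool:
    """Return True if the entry has a deprecation_date on or before today."""
    raw = entry.get("deprecation_date")
    return raw is not None and date.fromisoformat(raw) <= today


# Hypothetical entry with an invented date.
old_model = {"deprecation_date": "2024-06-30", "supported_regions": ["global"]}
print(is_deprecated(old_model, today=date(2025, 1, 1)))  # prints True
```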

Usage Examples

Remote Fetch at Startup

# From litellm/litellm_core_utils/get_model_cost_map.py:
import os


def get_model_cost_map(url: str) -> dict:
    """
    1. If LITELLM_LOCAL_MODEL_COST_MAP is set, uses local backup only.
    2. Otherwise fetches from url, validates integrity, and falls back
       to local backup on any failure.
    """
    if os.getenv("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return GetModelCostMap.load_local_model_cost_map()

    try:
        content = GetModelCostMap.fetch_remote_model_cost_map(url)
    except Exception:
        return GetModelCostMap.load_local_model_cost_map()

    if not GetModelCostMap.validate_model_cost_map(
        fetched_map=content,
        backup_model_count=GetModelCostMap._get_backup_model_count(),
    ):
        return GetModelCostMap.load_local_model_cost_map()

    return content
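
The fetch, validate, and fall back flow above can be tried without network access by injecting the fetch and backup loaders as callables. Everything in this sketch is hypothetical, including the load_cost_map name and the min_ratio rule; the integrity check that LiteLLM actually applies is not shown on this page.

```python
from typing import Callable


def load_cost_map(
    fetch: Callable[[], dict],
    load_backup: Callable[[], dict],
    min_ratio: float = 0.5,  # assumed threshold, not LiteLLM's real rule
) -> dict:
    """Return the fetched map, or the backup if fetching or validation fails."""
    backup = load_backup()
    try:
        fetched = fetch()
    except Exception:
        return backup
    # Assumed integrity check: a dict that is not far smaller than the backup.
    if not isinstance(fetched, dict) or len(fetched) < min_ratio * len(backup):
        return backup
    return fetched


def broken_fetch() -> dict:
    raise ConnectionError("simulated network failure")


backup_map = {"gpt-4": {}, "gpt-3.5-turbo": {}}
print(load_cost_map(broken_fetch, lambda: backup_map) is backup_map)  # prints True
```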

Registering a Custom Model Cost URL

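Users can point LiteLLM at a different model cost file by passing its URL to litellm.register_model(). The test excerpt below passes the default URL; substitute the URL of a custom file:

# From tests/local_testing/test_register_model.py:
import litellm

litellm.register_model(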
    model_cost="https://raw.githubusercontent.com/BerriAI/litellm/main/model_prices_and_context_window.json"
)

Validation in Unit Tests

# From tests/test_litellm/test_utils.py:
def test_aaamodel_prices_and_context_window_json_is_valid():
    """Validates the model_prices_and_context_window.json file."""
    prod_json = "./model_prices_and_context_window.json"
    # Validates JSON structure, field types, required fields, etc.
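
The kind of per-entry check such a test might perform can be sketched as follows. The validate_entry helper is hypothetical, and the allowed modes are only those listed under Core Identification on this page; the real test may enforce a different or larger schema.

```python
# Modes as listed under "Core Identification" on this page.
ALLOWED_MODES = {
    "chat", "embedding", "completion", "image_generation",
    "audio_transcription", "audio_speech", "moderation", "rerank", "search",
}


def validate_entry(name: str, entry: dict) -> list:
    """Return a list of human-readable problems found in one model entry."""
    errors = []
    if not isinstance(entry.get("litellm_provider"), str):
        errors.append(f"{name}: litellm_provider must be a string")
    if entry.get("mode") not in ALLOWED_MODES:
        errors.append(f"{name}: unexpected mode {entry.get('mode')!r}")
    for field in ("max_input_tokens", "max_output_tokens", "max_tokens"):
        value = entry.get(field)
        if value is not None and not isinstance(value, int):
            errors.append(f"{name}: {field} must be an integer")
    return errors


good = {"litellm_provider": "openai", "mode": "chat", "max_input_tokens": 8192}
bad = {"mode": "chatty", "max_tokens": "8192"}
print(validate_entry("gpt-4", good))  # prints []
print(len(validate_entry("broken", bad)))  # prints 3
```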

Auto-Update Workflow

A GitHub Actions workflow runs the following script to update the file with the latest pricing from providers:

# From .github/workflows/auto_update_price_and_context_window_file.py:
local_file_path = "model_prices_and_context_window.json"
# Fetches latest pricing data and updates the file
