Implementation:BerriAI Litellm Model Prices Backup

Overview

Description

The litellm/model_prices_and_context_window_backup.json file is a bundled backup copy of the primary model pricing database (model_prices_and_context_window.json at the repository root). This file is included in the LiteLLM Python package distribution and serves as a local fallback when the remote model cost map hosted on GitHub cannot be fetched.

The file contains pricing data, context window limits, capability flags, and provider routing information for thousands of models across 100+ LLM providers. It is identical in structure and content to the root-level model_prices_and_context_window.json file; a CI check (ci_cd/check_files_match.py) verifies the two files remain synchronized.

Usage

This backup file is loaded by the GetModelCostMap class in litellm/litellm_core_utils/get_model_cost_map.py. It is used in three scenarios:

When the environment variable LITELLM_LOCAL_MODEL_COST_MAP=True is set, forcing local-only mode
When the remote fetch from GitHub fails (network error, timeout, etc.)
When the fetched remote data fails integrity validation (e.g., unexpectedly small model count)

# Loading the backup (from get_model_cost_map.py):
import json
from importlib.resources import files

# Read the JSON file bundled inside the installed litellm package
content = json.loads(
    files("litellm")
    .joinpath("model_prices_and_context_window_backup.json")
    .read_text(encoding="utf-8")
)

Data Schema

Top-Level Structure

The file is a single JSON object where each key is a model identifier string and each value is an object describing that model's properties.

{
    "sample_spec": { ... },           // Self-documenting schema spec
    "model_identifier": {             // e.g., "gpt-4", "claude-3-opus-20240229"
        "litellm_provider": "...",
        "input_cost_per_token": 0.0,
        "output_cost_per_token": 0.0,
        "max_tokens": 8192,
        ...
    },
    ...
}

Special Entry: sample_spec

The first entry, "sample_spec", is a self-documenting schema definition that describes all possible fields and their meanings. It is not a real model entry.
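
Because "sample_spec" sits alongside real model entries, code that iterates over the file usually needs to skip it. The following is a minimal sketch, assuming a small inline dict in place of the parsed file; the model names and values are made up for illustration, and `iter_real_models` is a hypothetical helper, not part of LiteLLM.

```python
# Illustrative stand-in for the parsed JSON file; names and values are made up.
cost_map = {
    "sample_spec": {"litellm_provider": "description of the field", "mode": "description of the field"},
    "example-model-a": {"litellm_provider": "openai", "mode": "chat", "max_tokens": 8192},
    "example-model-b": {"litellm_provider": "anthropic", "mode": "chat", "max_tokens": 4096},
}


def iter_real_models(cost_map: dict):
    """Yield (model_name, entry) pairs, skipping the schema-only entry."""
    for name, entry in cost_map.items():
        if name == "sample_spec":
            continue
        yield name, entry


model_names = [name for name, _ in iter_real_models(cost_map)]
print(model_names)
```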

Schema Fields

Core Fields

Field	Type	Required	Description
`litellm_provider`	String	Yes	Provider identifier (e.g., `"openai"`, `"anthropic"`, `"bedrock"`). Must match a provider from the LiteLLM docs.
`mode`	String	Yes	Operation mode: `"chat"`, `"embedding"`, `"completion"`, `"image_generation"`, `"audio_transcription"`, `"audio_speech"`, `"moderation"`, `"rerank"`, `"search"`
`max_tokens`	Integer	No	Legacy parameter. Set to the value of `max_output_tokens` if the provider specifies one; otherwise set to `max_input_tokens`.
`max_input_tokens`	Integer	No	Maximum input tokens supported by the model
`max_output_tokens`	Integer	No	Maximum output tokens supported by the model
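
Since all three token-limit fields are optional, a consumer may need a fallback when reading them. The sketch below shows one possible approach, assuming that a missing `max_output_tokens` should fall back to the legacy `max_tokens` value. `resolve_output_limit` is a hypothetical helper and the fallback order is an assumption, not necessarily how LiteLLM resolves limits internally.

```python
from typing import Optional


def resolve_output_limit(entry: dict) -> Optional[int]:
    """Prefer max_output_tokens; fall back to the legacy max_tokens field.

    Returns None when the entry declares neither field.
    """
    limit = entry.get("max_output_tokens")
    if limit is None:
        limit = entry.get("max_tokens")
    return limit


# Entry values are illustrative only.
print(resolve_output_limit({"max_tokens": 8192}))
print(resolve_output_limit({"max_output_tokens": 4096, "max_tokens": 8192}))
```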

Pricing Fields

Field	Type	Description
`input_cost_per_token`	Float	Cost per input token in USD
`output_cost_per_token`	Float	Cost per output token in USD
`output_cost_per_reasoning_token`	Float	Cost per reasoning/thinking token in USD
`input_cost_per_audio_token`	Float	Cost per audio input token in USD
`output_cost_per_image`	Float	Cost per generated image in USD
`input_cost_per_pixel`	Float	Cost per input pixel for image models
`output_cost_per_pixel`	Float	Cost per output pixel
`code_interpreter_cost_per_session`	Float	Cost per code interpreter session
`file_search_cost_per_1k_calls`	Float	Cost per 1,000 file search calls
`file_search_cost_per_gb_per_day`	Float	File search storage cost per GB per day
`vector_store_cost_per_gb_per_day`	Float	Vector store storage cost per GB per day
`computer_use_input_cost_per_1k_tokens`	Float	Cost per 1K tokens for computer use input
`computer_use_output_cost_per_1k_tokens`	Float	Cost per 1K tokens for computer use output
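
As a worked example of how the per-token fields combine, the sketch below estimates the cost of a single text request. It is a simplified illustration that uses only `input_cost_per_token` and `output_cost_per_token` and ignores every other pricing field; `estimate_cost` is a hypothetical helper and the prices are made up.

```python
def estimate_cost(entry: dict, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate request cost in USD from the two basic per-token prices.

    Missing price fields are treated as 0.0 (an assumption of this sketch).
    """
    input_cost = entry.get("input_cost_per_token", 0.0) * prompt_tokens
    output_cost = entry.get("output_cost_per_token", 0.0) * completion_tokens
    return input_cost + output_cost


# Illustrative prices: $3 per million input tokens, $15 per million output tokens.
entry = {"input_cost_per_token": 3e-06, "output_cost_per_token": 1.5e-05}

# 1000 * 3e-06 + 200 * 1.5e-05 = 0.003 + 0.003
cost = estimate_cost(entry, prompt_tokens=1000, completion_tokens=200)
print(f"{cost:.6f}")  # prints 0.006000
```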

Search Context Cost

Field	Type	Description
`search_context_cost_per_query`	Object	Nested object with tiered search costs
`search_context_cost_per_query.search_context_size_low`	Float	Cost for low search context
`search_context_cost_per_query.search_context_size_medium`	Float	Cost for medium search context
`search_context_cost_per_query.search_context_size_high`	Float	Cost for high search context
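
The nested object can be read by building the tier key from the requested context size. This is a minimal sketch with made-up prices; `search_context_cost` is a hypothetical helper, and returning 0.0 for a missing object or tier is an assumption of the sketch.

```python
def search_context_cost(entry: dict, size: str) -> float:
    """Look up the per-query search cost for a tier: 'low', 'medium', or 'high'."""
    tiers = entry.get("search_context_cost_per_query") or {}
    return tiers.get(f"search_context_size_{size}", 0.0)


# Illustrative entry; the prices are not taken from the real file.
entry = {
    "search_context_cost_per_query": {
        "search_context_size_low": 0.03,
        "search_context_size_medium": 0.035,
        "search_context_size_high": 0.05,
    }
}

print(search_context_cost(entry, "high"))
```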

Capability Flags

Field	Type	Description
`supports_function_calling`	Boolean	Model supports function/tool calling
`supports_parallel_function_calling`	Boolean	Model supports parallel function calls
`supports_vision`	Boolean	Model supports image/vision input
`supports_audio_input`	Boolean	Model supports audio input
`supports_audio_output`	Boolean	Model supports audio output
`supports_prompt_caching`	Boolean	Model supports prompt caching
`supports_reasoning`	Boolean	Model supports reasoning/thinking tokens
`supports_response_schema`	Boolean	Model supports structured response schemas
`supports_system_messages`	Boolean	Model supports system messages
`supports_web_search`	Boolean	Model supports web search
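
Capability flags make it straightforward to filter the map for models with a given feature set. The sketch below assumes that an absent flag means the capability is unsupported, which is an assumption of this example; `models_supporting` is a hypothetical helper and the entries are made up.

```python
def models_supporting(cost_map: dict, *flags: str) -> list:
    """Return model names whose entries set every requested flag to True."""
    return [
        name
        for name, entry in cost_map.items()
        if name != "sample_spec" and all(entry.get(flag) is True for flag in flags)
    ]


# Illustrative stand-in for the parsed file.
cost_map = {
    "sample_spec": {"supports_vision": True, "supports_function_calling": True},
    "example-model-a": {"supports_vision": True, "supports_function_calling": True},
    "example-model-b": {"supports_function_calling": True},
    "example-model-c": {"supports_vision": False},
}

vision_and_tools = models_supporting(cost_map, "supports_vision", "supports_function_calling")
print(vision_and_tools)
```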

Additional Fields

Field	Type	Description
`deprecation_date`	String	Date model becomes deprecated (format: `YYYY-MM-DD`)
`supported_regions`	Array[String]	List of supported deployment regions (e.g., `"us-west-2"`, `"eu-west-1"`)
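
Because `deprecation_date` uses the `YYYY-MM-DD` format, it can be parsed with the standard library. The sketch below is one way to check whether an entry is past its deprecation date; `is_deprecated` is a hypothetical helper, and treating the deprecation date itself as already deprecated is an assumption.

```python
from datetime import date


def is_deprecated(entry: dict, today: date) -> bool:
    """Return True if the entry has a deprecation_date on or before `today`."""
    raw = entry.get("deprecation_date")
    if not raw:
        return False  # no date recorded, so nothing to flag
    return date.fromisoformat(raw) <= today


# Illustrative entry and reference date.
entry = {"deprecation_date": "2025-01-31"}
print(is_deprecated(entry, date(2025, 6, 1)))
print(is_deprecated(entry, date(2024, 12, 1)))
```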

Usage Examples

How the Backup is Loaded at Startup

During litellm.__init__, the get_model_cost_map() function is called. If the remote fetch fails or is disabled, it falls back to this file:

# From litellm/litellm_core_utils/get_model_cost_map.py:
def get_model_cost_map(url: str) -> dict:
    if os.getenv("LITELLM_LOCAL_MODEL_COST_MAP", "").lower() == "true":
        return GetModelCostMap.load_local_model_cost_map()
    try:
        content = GetModelCostMap.fetch_remote_model_cost_map(url)
    except Exception:
        return GetModelCostMap.load_local_model_cost_map()
    # Validate integrity...
    if not GetModelCostMap.validate_model_cost_map(...):
        return GetModelCostMap.load_local_model_cost_map()
    return content

Integrity Validation

The backup's model count serves as the baseline for validating remote fetches. If the remote data contains significantly fewer models than the backup, it is rejected and the backup is used instead:

# Validation checks:
# 1. Fetched map must be a non-empty dict
# 2. Model count must not drop below MODEL_COST_MAP_MIN_MODEL_COUNT
# 3. Model count must not shrink more than MODEL_COST_MAP_MAX_SHRINK_RATIO vs backup
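
The three checks listed above can be sketched as a single function. This is an illustration only: `looks_valid` is a hypothetical name, the default thresholds are placeholders rather than LiteLLM's actual values, and interpreting the shrink ratio as the maximum allowed fractional decrease relative to the backup is an assumption.

```python
def looks_valid(
    fetched,
    backup_count: int,
    min_model_count: int = 10,      # placeholder, not LiteLLM's real threshold
    max_shrink_ratio: float = 0.5,  # placeholder, not LiteLLM's real threshold
) -> bool:
    """Apply the three integrity checks to a fetched cost map."""
    # 1. Must be a non-empty dict
    if not isinstance(fetched, dict) or not fetched:
        return False
    # 2. Must contain at least the minimum number of models
    if len(fetched) < min_model_count:
        return False
    # 3. Must not have shrunk too far relative to the backup
    if backup_count > 0 and len(fetched) < backup_count * (1 - max_shrink_ratio):
        return False
    return True


full_map = {f"model-{i}": {} for i in range(100)}
truncated_map = {f"model-{i}": {} for i in range(20)}

print(looks_valid(full_map, backup_count=100))
print(looks_valid(truncated_map, backup_count=100))
```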

CI Sync Check

A CI script ensures the root-level file and this backup stay in sync:

# From ci_cd/check_files_match.py:
# "Comparing model_prices_and_context_window and
#  litellm/model_prices_and_context_window_backup.json files...
#  checking if they match."
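
A comparison of this kind can be approximated in a few lines. The sketch below compares the parsed JSON content of two files, so differences in whitespace or formatting are ignored; the actual CI script may compare the files differently, and `files_match` is a hypothetical helper.

```python
import json
from pathlib import Path


def files_match(path_a, path_b) -> bool:
    """Return True if both files parse to equal JSON content."""
    data_a = json.loads(Path(path_a).read_text(encoding="utf-8"))
    data_b = json.loads(Path(path_b).read_text(encoding="utf-8"))
    return data_a == data_b


if __name__ == "__main__":
    # Paths as named in this page, relative to the repository root.
    root_file = "model_prices_and_context_window.json"
    backup_file = "litellm/model_prices_and_context_window_backup.json"
    if Path(root_file).exists() and Path(backup_file).exists():
        print("match" if files_match(root_file, backup_file) else "files differ")
```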

Related Pages

Model Prices Database - The primary root-level version of this file
CircleCI Config - CI pipeline that validates both files stay in sync
