Heuristic:Helicone Helicone Tiered Pricing Threshold Matching

Knowledge Sources	Helicone Cost Package Internal Engineering
Domains	Cost_Calculation, Pricing
Last Updated	2026-02-14 06:00 GMT

Overview

Tiered pricing selects the tier with the highest threshold that the token count meets or exceeds, not the closest threshold. Each provider has its own threshold value function, and these functions count different token types.

Description

Many LLM providers (Google, Anthropic, XAI) offer tiered pricing where the per-token rate changes based on the total number of tokens used. The Helicone cost engine determines which tier applies by iterating through all tiers, which are expected to be in ascending threshold order, and selecting the last tier whose threshold is met or exceeded. This is a highest-match strategy, not a closest-match strategy.

Critically, different providers use different formulas to calculate the threshold value:

Vertex: Uses only `input` tokens
Google AI Studio: Uses `input + cachedInput`
Anthropic: Uses `input + cachedInput + write5m + write1h` (total prompt)
XAI: Uses `input + cachedInput`
Others: Returns 0 (no tiered pricing)
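
As a worked example of the formulas above, the following sketch computes the threshold value for one sample request under each provider. The `Usage` shape only mirrors the fields named in this page, `thresholdValue` is a hypothetical helper (not the engine's `getThresholdValueFunction`), and the provider keys `"xai"` and `"openai"` are assumptions made for illustration.

```typescript
// Assumed usage shape, based only on the fields named in this page.
interface Usage {
  input: number;
  cacheDetails?: { cachedInput?: number; write5m?: number; write1h?: number };
}

// Hypothetical helper that applies the per-provider formulas listed above.
// The keys "xai" and "openai" are assumptions, not confirmed identifiers.
function thresholdValue(provider: string, usage: Usage): number {
  const cached = usage.cacheDetails?.cachedInput ?? 0;
  const writes =
    (usage.cacheDetails?.write5m ?? 0) + (usage.cacheDetails?.write1h ?? 0);
  switch (provider) {
    case "vertex":
      return usage.input; // input tokens only
    case "google-ai-studio":
    case "xai":
      return usage.input + cached; // input + cached input
    case "anthropic":
      return usage.input + cached + writes; // total prompt
    default:
      return 0; // no tiered pricing
  }
}

// Illustrative request, not real data.
const sample: Usage = {
  input: 150000,
  cacheDetails: { cachedInput: 40000, write5m: 10000, write1h: 5000 },
};

for (const p of ["vertex", "google-ai-studio", "anthropic", "xai", "openai"]) {
  console.log(p, thresholdValue(p, sample));
}
```

For this sample the values are 150000 (Vertex), 190000 (Google AI Studio and XAI), 205000 (Anthropic), and 0 for the fallback. Against an illustrative 200000-token threshold, the same request would cross into the higher tier only under the Anthropic formula.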

Usage

Apply this heuristic when adding tiered pricing for a new provider or debugging cost calculation discrepancies. Understanding which token types contribute to the threshold is essential for correct tier selection.

The Insight (Rule of Thumb)

Action: When defining tiered pricing, sort thresholds in ascending order. The engine selects the last tier whose threshold is met.
Value: Provider-specific threshold functions: Vertex uses `input` only; Google AI Studio and XAI use `input + cachedInput`; Anthropic uses total prompt including cache writes.
Trade-off: Adding a new provider with tiered pricing requires implementing a custom threshold function in `getThresholdValueFunction`. Using the default (returns 0) means the threshold value is always 0, so every request resolves to tier 0 regardless of its token count.
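
The two points above (ascending order, and the effect of a threshold value of 0) can be checked with a small sketch. The `Tier` type, its `inputRate` field, the rates, and the helper names are hypothetical; only the selection rule is taken from the engine.

```typescript
// Hypothetical minimal tier; the real ModelPricing type has other fields.
interface Tier {
  threshold: number;
  inputRate: number; // illustrative field and values, not Helicone data
}

// Same rule as the engine: keep the last index whose threshold is met.
function pickTier(tiers: Tier[], value: number): Tier {
  let matched = 0;
  for (let i = 0; i < tiers.length; i++) {
    if (value >= tiers[i].threshold) matched = i;
  }
  return tiers[matched];
}

// Returns a copy sorted by ascending threshold; the input is not mutated.
function sortTiers(tiers: Tier[]): Tier[] {
  return [...tiers].sort((a, b) => a.threshold - b.threshold);
}

const declaredOutOfOrder: Tier[] = [
  { threshold: 200000, inputRate: 2.5 },
  { threshold: 0, inputRate: 1.25 },
];

// Unsorted input: the last matching entry is the threshold-0 tier.
const fromUnsorted = pickTier(declaredOutOfOrder, 250000);

// Sorted input: the last matching entry is the 200000 tier.
const fromSorted = pickTier(sortTiers(declaredOutOfOrder), 250000);

// A threshold value of 0 (the default function) always lands on tier 0.
const fromDefault = pickTier(sortTiers(declaredOutOfOrder), 0);

console.log(fromUnsorted.threshold, fromSorted.threshold, fromDefault.threshold);
```

With the tiers out of order, a 250000-token request falls back to the threshold-0 tier; after sorting it resolves to the 200000 tier. A threshold value of 0 resolves to the first tier in either case.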

Reasoning

Tier matching algorithm from `packages/cost/models/calculate-cost.ts:96-112`:

function getPricingTier(
  preprocessedPricing: ModelPricing[],
  value: number,
): ModelPricing {
  let matchedTierIndex = 0;
  // Find the highest threshold that the value meets
  for (let i = 0; i < preprocessedPricing.length; i++) {
    if (value >= preprocessedPricing[i].threshold) {
      matchedTierIndex = i;
      // Don't break - continue to find the highest matching threshold
    }
  }
  return preprocessedPricing[matchedTierIndex];
}

The comment `// Don't break - continue to find the highest matching threshold` is the critical tribal knowledge here. A developer unfamiliar with this pattern might add a `break` as an optimization, which would cause the engine to return the first matching tier (with ascending thresholds, usually the lowest one) instead of the highest.
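
To make the failure mode concrete, this sketch contrasts the loop with a hypothetical variant that adds `break`. The `LabeledTier` type, the labels, and the 200000 threshold are illustrative assumptions, not values from the Helicone source.

```typescript
// Hypothetical tier shape used only for this comparison.
interface LabeledTier {
  threshold: number;
  label: string;
}

// Mirrors the engine's loop: no break, so the last match wins.
function highestMatch(tiers: LabeledTier[], value: number): LabeledTier {
  let matched = 0;
  for (let i = 0; i < tiers.length; i++) {
    if (value >= tiers[i].threshold) {
      matched = i;
    }
  }
  return tiers[matched];
}

// The tempting "optimization": stops at the first threshold that is met.
function firstMatchWithBreak(tiers: LabeledTier[], value: number): LabeledTier {
  let matched = 0;
  for (let i = 0; i < tiers.length; i++) {
    if (value >= tiers[i].threshold) {
      matched = i;
      break; // bug: every value >= the first threshold stops here
    }
  }
  return tiers[matched];
}

const ascendingTiers: LabeledTier[] = [
  { threshold: 0, label: "base" },
  { threshold: 200000, label: "long-context" },
];

console.log(highestMatch(ascendingTiers, 250000).label);
console.log(firstMatchWithBreak(ascendingTiers, 250000).label);
```

For a value of 250000, `highestMatch` returns the "long-context" tier while `firstMatchWithBreak` returns "base", because the threshold-0 tier matches first and ends the loop.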

Provider-specific threshold functions from `packages/cost/models/calculate-cost.ts:114-167`:

case "vertex":
  return (usage) => usage.input;  // Only input tokens
case "google-ai-studio":
  return (usage) => usage.input + (usage.cacheDetails?.cachedInput ?? 0);  // Input + cached
case "anthropic":
  return (usage) => usage.input +
    (usage.cacheDetails?.cachedInput ?? 0) +
    (usage.cacheDetails?.write5m ?? 0) +
    (usage.cacheDetails?.write1h ?? 0);  // Total prompt
