Heuristic:Helicone Helicone Tiered Pricing Threshold Matching
| Knowledge Sources | |
|---|---|
| Domains | Cost_Calculation, Pricing |
| Last Updated | 2026-02-14 06:00 GMT |
Overview
Tiered pricing matches the highest threshold a token count meets (not the closest), with provider-specific threshold value functions that count different token types.
Description
Many LLM providers (Google, Anthropic, XAI) offer tiered pricing where the per-token rate changes based on the total number of tokens used. The Helicone cost engine determines which tier applies by iterating through all tiers in ascending threshold order and selecting the last tier whose threshold is met or exceeded. This is a highest-match strategy, not a closest-match.
Critically, different providers use different formulas to calculate the threshold value:
- Vertex: Uses only `input` tokens
- Google AI Studio: Uses `input + cachedInput`
- Anthropic: Uses `input + cachedInput + write5m + write1h` (total prompt)
- XAI: Uses `input + cachedInput`
- Others: Returns 0 (no tiered pricing)
Usage
Apply this heuristic when adding tiered pricing for a new provider or debugging cost calculation discrepancies. Understanding which token types contribute to the threshold is essential for correct tier selection.
The Insight (Rule of Thumb)
- Action: When defining tiered pricing, sort thresholds in ascending order. The engine selects the last tier whose threshold is met.
- Value: Provider-specific threshold functions: Vertex uses `input` only; Anthropic uses total prompt including cache writes.
- Trade-off: Adding a new provider with tiered pricing requires implementing a custom threshold function in `getThresholdValueFunction`. Using the default (returns 0) means all requests match tier 0.
Reasoning
Tier matching algorithm from `packages/cost/models/calculate-cost.ts:96-112`:
function getPricingTier(
preprocessedPricing: ModelPricing[],
value: number,
): ModelPricing {
let matchedTierIndex = 0;
// Find the highest threshold that the value meets
for (let i = 0; i < preprocessedPricing.length; i++) {
if (value >= preprocessedPricing[i].threshold) {
matchedTierIndex = i;
// Don't break - continue to find the highest matching threshold
}
}
return preprocessedPricing[matchedTierIndex];
}
The comment `// Don't break - continue to find the highest matching threshold` is the critical tribal knowledge here. A developer unfamiliar with this pattern might add a `break` for optimization, which would cause the engine to return the first matching tier instead of the highest.
Provider-specific threshold functions from `packages/cost/models/calculate-cost.ts:114-167`:
case "vertex":
return (usage) => usage.input; // Only input tokens
case "google-ai-studio":
return (usage) => usage.input + (usage.cacheDetails?.cachedInput ?? 0); // Input + cached
case "anthropic":
return (usage) => usage.input +
(usage.cacheDetails?.cachedInput ?? 0) +
(usage.cacheDetails?.write5m ?? 0) +
(usage.cacheDetails?.write1h ?? 0); // Total prompt