# Implementation: Helicone costOfPrompt
| Knowledge Sources | |
|---|---|
| Domains | Cost Calculation, LLM Observability, Financial Analytics |
| Last Updated | 2026-02-14 00:00 GMT |
## Overview

Concrete functions for computing USD cost from token usage and rates, provided by the `@helicone/cost` package. Includes both the legacy `costOfPrompt` flat-rate calculator and the new `modelCostBreakdownFromRegistry` tiered pricing calculator.
## Description

Helicone provides two cost computation functions representing different generations of the pricing system:
**Legacy:** `costOfPrompt` (in `packages/cost/index.ts`, lines 61-153) takes explicit token counts and a provider/model pair, looks up flat per-token rates via `costOf`, and returns a single `number | null` representing the total USD cost. It handles prompt tokens, cache write tokens (with Anthropic's 5m/1h subtraction to avoid double-counting), cache read tokens, audio tokens (input and output), completion tokens, image costs, and per-call fees. An optional `multiple` parameter applies integer rounding for ClickHouse precision.
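The flat-rate arithmetic described above can be sketched as follows. This is a simplified illustration with invented rate fields, not the library's internals: the real rates come from the registry via `costOf`, and the audio, image, and per-call terms are omitted here.

```typescript
// Illustrative flat-rate cost sketch (hypothetical rate shape, not Helicone's).
type FlatRates = {
  perPromptToken: number;
  perCompletionToken: number;
  perCacheWriteToken: number;
  perCacheWrite5mToken?: number;
  perCacheWrite1hToken?: number;
  perCacheReadToken: number;
};

function flatCostSketch(
  rates: FlatRates,
  usage: {
    promptTokens: number;
    completionTokens: number;
    promptCacheWriteTokens: number;
    promptCacheReadTokens: number;
    promptCacheWrite5m?: number;
    promptCacheWrite1h?: number;
  }
): number {
  const write5m = usage.promptCacheWrite5m ?? 0;
  const write1h = usage.promptCacheWrite1h ?? 0;
  // Anthropic reports 5m/1h cache creation tokens inside the total cache
  // write count, so subtract them before applying the base write rate to
  // avoid double-counting.
  const baseWrite = usage.promptCacheWriteTokens - write5m - write1h;
  return (
    usage.promptTokens * rates.perPromptToken +
    usage.completionTokens * rates.perCompletionToken +
    baseWrite * rates.perCacheWriteToken +
    write5m * (rates.perCacheWrite5mToken ?? rates.perCacheWriteToken) +
    write1h * (rates.perCacheWrite1hToken ?? rates.perCacheWriteToken) +
    usage.promptCacheReadTokens * rates.perCacheReadToken
  );
}
```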
**New:** `modelCostBreakdownFromRegistry` (in `packages/cost/costCalc.ts`, lines 51-65, delegating to `calculateModelCostBreakdown` in `packages/cost/models/calculate-cost.ts`, lines 169-287) takes a `ModelUsage` object, provider, and model ID, looks up the model configuration from the new registry, and returns a detailed `CostBreakdown` object. It supports tiered pricing (rates that change at token count thresholds), per-modality breakdowns (image, audio, video, file), thinking token costs, web search costs, and request-level fees. Provider-specific threshold functions determine which token count drives tier selection (e.g., Anthropic uses total prompt length, Vertex uses input tokens, Google AI Studio uses input + cached).
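The provider-specific tier selection can be illustrated with a short sketch. The tier boundaries, rates, and helper names below are invented for illustration and are not Helicone's actual tables:

```typescript
// Illustrative tiered-rate selection (hypothetical tiers and helper names).
type Tier = { upTo: number; perToken: number }; // upTo: inclusive threshold; Infinity for the last tier

// Which token count drives tier selection differs by provider, as described above.
function thresholdTokens(
  provider: "anthropic" | "vertex" | "google-ai-studio",
  usage: { inputTokens: number; cachedTokens: number; totalPromptTokens: number }
): number {
  switch (provider) {
    case "anthropic":
      return usage.totalPromptTokens; // total prompt length
    case "vertex":
      return usage.inputTokens; // fresh input tokens only
    case "google-ai-studio":
      return usage.inputTokens + usage.cachedTokens; // input + cached
  }
}

// Pick the first tier whose threshold covers the driving token count.
function pickRate(tiers: Tier[], tokens: number): number {
  for (const t of tiers) if (tokens <= t.upTo) return t.perToken;
  return tiers[tiers.length - 1].perToken;
}
```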
## Usage

Use `costOfPrompt` when working with the legacy cost pipeline or when you have pre-extracted token counts as individual numbers. Use `modelCostBreakdownFromRegistry` for new implementations that need tiered pricing support, detailed cost breakdowns, or modality-specific cost attribution.
## Code Reference

### Source Location

- Repository: Helicone
- File (legacy): `packages/cost/index.ts` (lines 61-153)
- File (new entry): `packages/cost/costCalc.ts` (lines 51-65)
- File (new core): `packages/cost/models/calculate-cost.ts` (lines 169-287)
### Signature

```typescript
// Legacy flat-rate calculator
export function costOfPrompt({
  provider,
  model,
  promptTokens,
  promptCacheWriteTokens,
  promptCacheReadTokens,
  promptAudioTokens,
  completionTokens,
  completionAudioTokens,
  promptCacheWrite5m,
  promptCacheWrite1h,
  images,
  perCall,
  multiple,
}: {
  provider: string;
  model: string;
  promptTokens: number;
  promptCacheWriteTokens: number;
  promptCacheReadTokens: number;
  promptAudioTokens: number;
  completionTokens: number;
  completionAudioTokens: number;
  promptCacheWrite5m?: number;
  promptCacheWrite1h?: number;
  images?: number; // default: 1
  perCall?: number; // default: 1
  multiple?: number;
}): number | null

// New tiered pricing calculator
export function modelCostBreakdownFromRegistry(params: {
  modelUsage: ModelUsage;
  provider: ModelProviderName;
  providerModelId: string;
  requestCount?: number;
}): CostBreakdown | null
```
### Import

```typescript
// Legacy
import { costOfPrompt } from "@helicone/cost";

// New
import { modelCostBreakdownFromRegistry } from "@helicone/cost";
import type { CostBreakdown, ModelUsage, ModelProviderName } from "@helicone/cost";
```
## I/O Contract

### Inputs (Legacy: costOfPrompt)

| Name | Type | Required | Description |
|---|---|---|---|
| provider | string | Yes | Provider name or URL |
| model | string | Yes | Model identifier |
| promptTokens | number | Yes | Fresh (uncached) prompt tokens |
| promptCacheWriteTokens | number | Yes | Cache write tokens |
| promptCacheReadTokens | number | Yes | Cache read/hit tokens |
| promptAudioTokens | number | Yes | Input audio tokens |
| completionTokens | number | Yes | Output/completion tokens |
| completionAudioTokens | number | Yes | Output audio tokens |
| promptCacheWrite5m | number | No | Anthropic 5-minute cache creation tokens |
| promptCacheWrite1h | number | No | Anthropic 1-hour cache creation tokens |
| images | number | No | Number of images (default: 1) |
| perCall | number | No | Number of calls for per-call pricing (default: 1) |
| multiple | number | No | If set, rounds the result to an integer after multiplying (used for ClickHouse precision) |
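The `multiple` rounding described in the table amounts to the following (the helper name is ours, not the library's):

```typescript
// Sketch of the `multiple` rounding step: scale the USD cost up and round
// to an integer so it can be stored at fixed precision (e.g., in ClickHouse).
function applyMultiple(costUsd: number, multiple?: number): number {
  return multiple !== undefined ? Math.round(costUsd * multiple) : costUsd;
}
```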
### Inputs (New: modelCostBreakdownFromRegistry)

| Name | Type | Required | Description |
|---|---|---|---|
| modelUsage | ModelUsage | Yes | Normalized usage object from a usage processor |
| provider | ModelProviderName | Yes | Provider key (e.g., "openai", "anthropic", "vertex") |
| providerModelId | string | Yes | The provider-specific model identifier |
| requestCount | number | No | Number of requests for per-request pricing (default: 1) |
### Outputs (Legacy)

| Name | Type | Description |
|---|---|---|
| return | number \| null | Total cost in USD, or null if the model/provider is not found in the cost registry |
### Outputs (New: CostBreakdown)

| Field | Type | Description |
|---|---|---|
| inputCost | number | Cost of input/prompt tokens |
| outputCost | number | Cost of output/completion tokens |
| cachedInputCost | number | Cost of cached input tokens |
| cacheWrite5mCost | number | Cost of Anthropic 5-minute cache creation |
| cacheWrite1hCost | number | Cost of Anthropic 1-hour cache creation |
| thinkingCost | number | Cost of thinking/reasoning tokens |
| image | ModalityCostBreakdown \| undefined | Image modality costs (inputCost, cachedInputCost, outputCost) |
| audio | ModalityCostBreakdown \| undefined | Audio modality costs |
| video | ModalityCostBreakdown \| undefined | Video modality costs |
| file | ModalityCostBreakdown \| undefined | File modality costs |
| webSearchCost | number | Cost of web search operations |
| requestCost | number | Fixed per-request cost |
| totalCost | number | Sum of all cost components (or provider-reported cost if available) |
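When no provider-reported figure is available, `totalCost` is the sum of the component fields above. A sketch of that relationship follows; the `providerReportedCost` field and the video/file terms' omission are our simplifications for illustration, not the exact `CostBreakdown` shape:

```typescript
// Illustrative summation of CostBreakdown components into a total.
type ModalityCostBreakdown = { inputCost: number; cachedInputCost: number; outputCost: number };

function sketchTotal(b: {
  inputCost: number;
  outputCost: number;
  cachedInputCost: number;
  cacheWrite5mCost: number;
  cacheWrite1hCost: number;
  thinkingCost: number;
  webSearchCost: number;
  requestCost: number;
  image?: ModalityCostBreakdown;
  audio?: ModalityCostBreakdown;
  providerReportedCost?: number; // hypothetical stand-in for a provider-reported total
}): number {
  // Prefer a provider-reported cost when one is available, as described above.
  if (b.providerReportedCost !== undefined) return b.providerReportedCost;
  const modality = (m?: ModalityCostBreakdown) =>
    m ? m.inputCost + m.cachedInputCost + m.outputCost : 0;
  return (
    b.inputCost + b.outputCost + b.cachedInputCost +
    b.cacheWrite5mCost + b.cacheWrite1hCost + b.thinkingCost +
    b.webSearchCost + b.requestCost +
    modality(b.image) + modality(b.audio)
  );
}
```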
## Usage Examples

### Basic Usage (Legacy)

```typescript
import { costOfPrompt } from "@helicone/cost";

const cost = costOfPrompt({
  provider: "OPENAI",
  model: "gpt-4o",
  promptTokens: 1000,
  promptCacheWriteTokens: 0,
  promptCacheReadTokens: 0,
  promptAudioTokens: 0,
  completionTokens: 500,
  completionAudioTokens: 0,
});

if (cost !== null) {
  console.log(`Total cost: $${cost.toFixed(6)}`);
}
```
### Detailed Breakdown (New)

```typescript
import { modelCostBreakdownFromRegistry, getUsageProcessor } from "@helicone/cost";
import type { ModelProviderName } from "@helicone/cost";

const provider: ModelProviderName = "anthropic";
const processor = getUsageProcessor(provider);

if (processor) {
  // `response` is the raw provider response body from your API call
  const usageResult = await processor.parse({
    responseBody: JSON.stringify(response),
    model: "claude-sonnet-4-20250514",
    isStream: false,
  });

  if (usageResult.data) {
    const breakdown = modelCostBreakdownFromRegistry({
      modelUsage: usageResult.data,
      provider,
      providerModelId: "claude-sonnet-4-20250514",
    });

    if (breakdown) {
      console.log("Input cost:", breakdown.inputCost);
      console.log("Output cost:", breakdown.outputCost);
      console.log("Cached input cost:", breakdown.cachedInputCost);
      console.log("Thinking cost:", breakdown.thinkingCost);
      console.log("Total cost:", breakdown.totalCost);
    }
  }
}
```
### With Cache Tokens (Legacy)

```typescript
const cost = costOfPrompt({
  provider: "ANTHROPIC",
  model: "claude-sonnet-4-20250514",
  promptTokens: 500,
  promptCacheWriteTokens: 2000,
  promptCacheReadTokens: 1500,
  promptAudioTokens: 0,
  completionTokens: 300,
  completionAudioTokens: 0,
  promptCacheWrite5m: 1000,
  promptCacheWrite1h: 500,
});

// Cache write tokens are split: 1000 at the 5m rate, 500 at the 1h rate,
// and the remaining 500 (2000 - 1000 - 500) at the base cache write rate.
```
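The split in that comment can be checked with plain arithmetic (standalone numbers mirroring the example, not a library call):

```typescript
// Worked arithmetic for the cache-write split in the example above.
const promptCacheWriteTokens = 2000; // total cache write tokens reported
const promptCacheWrite5m = 1000;     // billed at the 5-minute cache write rate
const promptCacheWrite1h = 500;      // billed at the 1-hour cache write rate
// Remainder billed at the base cache write rate:
const baseRateCacheWriteTokens =
  promptCacheWriteTokens - promptCacheWrite5m - promptCacheWrite1h;
```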