# Implementation: Helicone costOfPrompt
| Knowledge Sources | |
|---|---|
| Domains | Cost Calculation, LLM Observability, Financial Analytics |
| Last Updated | 2026-02-14 00:00 GMT |
## Overview

Concrete functions for computing USD cost from token usage and rates, provided by the `@helicone/cost` package. Includes both the legacy `costOfPrompt` flat-rate calculator and the new `modelCostBreakdownFromRegistry` tiered pricing calculator.
## Description

Helicone provides two cost computation functions representing different generations of the pricing system:
**Legacy:** `costOfPrompt` (in `packages/cost/index.ts`, lines 61-153) takes explicit token counts and a provider/model pair, looks up flat per-token rates via `costOf`, and returns a single `number | null` representing the total USD cost. It handles prompt tokens, cache write tokens (with Anthropic's 5m/1h subtraction to avoid double-counting), cache read tokens, audio tokens (input and output), completion tokens, image costs, and per-call fees. An optional `multiple` parameter applies integer rounding for ClickHouse precision.
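The flat-rate arithmetic described above can be sketched as follows. This is a simplified illustration with invented rate fields, not the library's internals: the real rates come from the registry via `costOf`, and the audio, image, and per-call terms are omitted here.

```typescript
// Illustrative flat-rate cost sketch (hypothetical rate shape, not Helicone's).
type FlatRates = {
  perPromptToken: number;
  perCompletionToken: number;
  perCacheWriteToken: number;
  perCacheWrite5mToken?: number;
  perCacheWrite1hToken?: number;
  perCacheReadToken: number;
};

function flatCostSketch(
  rates: FlatRates,
  usage: {
    promptTokens: number;
    completionTokens: number;
    promptCacheWriteTokens: number;
    promptCacheReadTokens: number;
    promptCacheWrite5m?: number;
    promptCacheWrite1h?: number;
  }
): number {
  const write5m = usage.promptCacheWrite5m ?? 0;
  const write1h = usage.promptCacheWrite1h ?? 0;
  // Anthropic reports 5m/1h cache creation tokens inside the total cache
  // write count, so subtract them before applying the base write rate to
  // avoid double-counting.
  const baseWrite = usage.promptCacheWriteTokens - write5m - write1h;
  return (
    usage.promptTokens * rates.perPromptToken +
    usage.completionTokens * rates.perCompletionToken +
    baseWrite * rates.perCacheWriteToken +
    write5m * (rates.perCacheWrite5mToken ?? rates.perCacheWriteToken) +
    write1h * (rates.perCacheWrite1hToken ?? rates.perCacheWriteToken) +
    usage.promptCacheReadTokens * rates.perCacheReadToken
  );
}
```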
**New:** `modelCostBreakdownFromRegistry` (in `packages/cost/costCalc.ts`, lines 51-65, delegating to `calculateModelCostBreakdown` in `packages/cost/models/calculate-cost.ts`, lines 169-287) takes a `ModelUsage` object, provider, and model ID, looks up the model configuration from the new registry, and returns a detailed `CostBreakdown` object. It supports tiered pricing (rates that change at token count thresholds), per-modality breakdowns (image, audio, video, file), thinking token costs, web search costs, and request-level fees. Provider-specific threshold functions determine which token count drives tier selection (e.g., Anthropic uses total prompt length, Vertex uses input tokens, Google AI Studio uses input + cached).
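The provider-specific tier selection can be illustrated with a short sketch. The tier boundaries, rates, and helper names below are invented for illustration and are not Helicone's actual tables:

```typescript
// Illustrative tiered-rate selection (hypothetical tiers and helper names).
type Tier = { upTo: number; perToken: number }; // upTo: inclusive threshold; Infinity for the last tier

// Which token count drives tier selection differs by provider, as described above.
function thresholdTokens(
  provider: "anthropic" | "vertex" | "google-ai-studio",
  usage: { inputTokens: number; cachedTokens: number; totalPromptTokens: number }
): number {
  switch (provider) {
    case "anthropic":
      return usage.totalPromptTokens; // total prompt length
    case "vertex":
      return usage.inputTokens; // fresh input tokens only
    case "google-ai-studio":
      return usage.inputTokens + usage.cachedTokens; // input + cached
  }
}

// Pick the first tier whose threshold covers the driving token count.
function pickRate(tiers: Tier[], tokens: number): number {
  for (const t of tiers) if (tokens <= t.upTo) return t.perToken;
  return tiers[tiers.length - 1].perToken;
}
```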
## Usage

Use `costOfPrompt` when working with the legacy cost pipeline or when you have pre-extracted token counts as individual numbers. Use `modelCostBreakdownFromRegistry` for new implementations that need tiered pricing support, detailed cost breakdowns, or modality-specific cost attribution.
## Code Reference

### Source Location

- Repository: Helicone
- File (legacy): `packages/cost/index.ts` (lines 61-153)
- File (new entry): `packages/cost/costCalc.ts` (lines 51-65)
- File (new core): `packages/cost/models/calculate-cost.ts` (lines 169-287)
### Signature

```typescript
// Legacy flat-rate calculator
export function costOfPrompt({
  provider,
  model,
  promptTokens,
  promptCacheWriteTokens,
  promptCacheReadTokens,
  promptAudioTokens,
  completionTokens,
  completionAudioTokens,
  promptCacheWrite5m,
  promptCacheWrite1h,
  images,
  perCall,
  multiple,
}: {
  provider: string;
  model: string;
  promptTokens: number;
  promptCacheWriteTokens: number;
  promptCacheReadTokens: number;
  promptAudioTokens: number;
  completionTokens: number;
  completionAudioTokens: number;
  promptCacheWrite5m?: number;
  promptCacheWrite1h?: number;
  images?: number; // default: 1
  perCall?: number; // default: 1
  multiple?: number;
}): number | null

// New tiered pricing calculator
export function modelCostBreakdownFromRegistry(params: {
  modelUsage: ModelUsage;
  provider: ModelProviderName;
  providerModelId: string;
  requestCount?: number;
}): CostBreakdown | null
```
### Import

```typescript
// Legacy
import { costOfPrompt } from "@helicone/cost";

// New
import { modelCostBreakdownFromRegistry } from "@helicone/cost";
import type { CostBreakdown, ModelUsage, ModelProviderName } from "@helicone/cost";
```
## I/O Contract

### Inputs (Legacy: costOfPrompt)

| Name | Type | Required | Description |
|---|---|---|---|
| provider | string | Yes | Provider name or URL |
| model | string | Yes | Model identifier |
| promptTokens | number | Yes | Fresh (uncached) prompt tokens |
| promptCacheWriteTokens | number | Yes | Cache write tokens |
| promptCacheReadTokens | number | Yes | Cache read/hit tokens |
| promptAudioTokens | number | Yes | Input audio tokens |
| completionTokens | number | Yes | Output/completion tokens |
| completionAudioTokens | number | Yes | Output audio tokens |
| promptCacheWrite5m | number | No | Anthropic 5-minute cache creation tokens |
| promptCacheWrite1h | number | No | Anthropic 1-hour cache creation tokens |
| images | number | No | Number of images (default: 1) |
| perCall | number | No | Number of calls for per-call pricing (default: 1) |
| multiple | number | No | If set, rounds the result to an integer after multiplying (used for ClickHouse precision) |
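The `multiple` rounding described in the table amounts to the following (the helper name is ours, not the library's):

```typescript
// Sketch of the `multiple` rounding step: scale the USD cost up and round
// to an integer so it can be stored at fixed precision (e.g., in ClickHouse).
function applyMultiple(costUsd: number, multiple?: number): number {
  return multiple !== undefined ? Math.round(costUsd * multiple) : costUsd;
}
```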
### Inputs (New: modelCostBreakdownFromRegistry)

| Name | Type | Required | Description |
|---|---|---|---|
| modelUsage | ModelUsage | Yes | Normalized usage object from a usage processor |
| provider | ModelProviderName | Yes | Provider key (e.g., "openai", "anthropic", "vertex") |
| providerModelId | string | Yes | The provider-specific model identifier |
| requestCount | number | No | Number of requests for per-request pricing (default: 1) |
### Outputs (Legacy)

| Name | Type | Description |
|---|---|---|
| return | number \| null | Total cost in USD, or null if the model/provider is not found in the cost registry |
### Outputs (New: CostBreakdown)

| Field | Type | Description |
|---|---|---|
| inputCost | number | Cost of input/prompt tokens |
| outputCost | number | Cost of output/completion tokens |
| cachedInputCost | number | Cost of cached input tokens |
| cacheWrite5mCost | number | Cost of Anthropic 5-minute cache creation |
| cacheWrite1hCost | number | Cost of Anthropic 1-hour cache creation |
| thinkingCost | number | Cost of thinking/reasoning tokens |
| image | ModalityCostBreakdown \| undefined | Image modality costs (inputCost, cachedInputCost, outputCost) |
| audio | ModalityCostBreakdown \| undefined | Audio modality costs |
| video | ModalityCostBreakdown \| undefined | Video modality costs |
| file | ModalityCostBreakdown \| undefined | File modality costs |
| webSearchCost | number | Cost of web search operations |
| requestCost | number | Fixed per-request cost |
| totalCost | number | Sum of all cost components (or provider-reported cost if available) |
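When no provider-reported figure is available, `totalCost` is the sum of the component fields above. A sketch of that relationship follows; the `providerReportedCost` field and the video/file terms' omission are our simplifications for illustration, not the exact `CostBreakdown` shape:

```typescript
// Illustrative summation of CostBreakdown components into a total.
type ModalityCostBreakdown = { inputCost: number; cachedInputCost: number; outputCost: number };

function sketchTotal(b: {
  inputCost: number;
  outputCost: number;
  cachedInputCost: number;
  cacheWrite5mCost: number;
  cacheWrite1hCost: number;
  thinkingCost: number;
  webSearchCost: number;
  requestCost: number;
  image?: ModalityCostBreakdown;
  audio?: ModalityCostBreakdown;
  providerReportedCost?: number; // hypothetical stand-in for a provider-reported total
}): number {
  // Prefer a provider-reported cost when one is available, as described above.
  if (b.providerReportedCost !== undefined) return b.providerReportedCost;
  const modality = (m?: ModalityCostBreakdown) =>
    m ? m.inputCost + m.cachedInputCost + m.outputCost : 0;
  return (
    b.inputCost + b.outputCost + b.cachedInputCost +
    b.cacheWrite5mCost + b.cacheWrite1hCost + b.thinkingCost +
    b.webSearchCost + b.requestCost +
    modality(b.image) + modality(b.audio)
  );
}
```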
## Usage Examples

### Basic Usage (Legacy)

```typescript
import { costOfPrompt } from "@helicone/cost";

const cost = costOfPrompt({
  provider: "OPENAI",
  model: "gpt-4o",
  promptTokens: 1000,
  promptCacheWriteTokens: 0,
  promptCacheReadTokens: 0,
  promptAudioTokens: 0,
  completionTokens: 500,
  completionAudioTokens: 0,
});

if (cost !== null) {
  console.log(`Total cost: $${cost.toFixed(6)}`);
}
```
### Detailed Breakdown (New)

```typescript
import { modelCostBreakdownFromRegistry, getUsageProcessor } from "@helicone/cost";
import type { ModelProviderName } from "@helicone/cost";

const provider: ModelProviderName = "anthropic";
const processor = getUsageProcessor(provider);

if (processor) {
  // `response` is the raw provider response body from your API call
  const usageResult = await processor.parse({
    responseBody: JSON.stringify(response),
    model: "claude-sonnet-4-20250514",
    isStream: false,
  });

  if (usageResult.data) {
    const breakdown = modelCostBreakdownFromRegistry({
      modelUsage: usageResult.data,
      provider,
      providerModelId: "claude-sonnet-4-20250514",
    });

    if (breakdown) {
      console.log("Input cost:", breakdown.inputCost);
      console.log("Output cost:", breakdown.outputCost);
      console.log("Cached input cost:", breakdown.cachedInputCost);
      console.log("Thinking cost:", breakdown.thinkingCost);
      console.log("Total cost:", breakdown.totalCost);
    }
  }
}
```
### With Cache Tokens (Legacy)

```typescript
const cost = costOfPrompt({
  provider: "ANTHROPIC",
  model: "claude-sonnet-4-20250514",
  promptTokens: 500,
  promptCacheWriteTokens: 2000,
  promptCacheReadTokens: 1500,
  promptAudioTokens: 0,
  completionTokens: 300,
  completionAudioTokens: 0,
  promptCacheWrite5m: 1000,
  promptCacheWrite1h: 500,
});

// Cache write tokens are split: 1000 at the 5m rate, 500 at the 1h rate,
// and the remaining 500 (2000 - 1000 - 500) at the base cache write rate.
```
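The split in that comment can be checked with plain arithmetic (standalone numbers mirroring the example, not a library call):

```typescript
// Worked arithmetic for the cache-write split in the example above.
const promptCacheWriteTokens = 2000; // total cache write tokens reported
const promptCacheWrite5m = 1000;     // billed at the 5-minute cache write rate
const promptCacheWrite1h = 500;      // billed at the 1-hour cache write rate
// Remainder billed at the base cache write rate:
const baseRateCacheWriteTokens =
  promptCacheWriteTokens - promptCacheWrite5m - promptCacheWrite1h;
```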