
Implementation:Helicone CostOfPrompt

From Leeroopedia
Domains Cost Calculation, LLM Observability, Financial Analytics
Last Updated 2026-02-14 00:00 GMT

Overview

Concrete functions for computing USD cost from token usage and rates, provided by the @helicone/cost package. Includes both the legacy costOfPrompt flat-rate calculator and the new modelCostBreakdownFromRegistry tiered pricing calculator.

Description

Helicone provides two cost computation functions representing different generations of the pricing system:

Legacy: costOfPrompt (in packages/cost/index.ts, lines 61-153) takes explicit token counts and a provider/model pair, looks up flat per-token rates via costOf, and returns a single number | null representing total USD cost. It handles prompt tokens, cache write tokens (with Anthropic's 5m/1h subtraction to avoid double-counting), cache read tokens, audio tokens (input and output), completion tokens, image costs, and per-call fees. An optional multiple parameter multiplies the result and rounds it to an integer so ClickHouse can store it at fixed precision.
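The flat-rate arithmetic described above can be sketched as follows. The rate-table shape, field names, and helper below are illustrative assumptions, not the package's actual cost-table types:

```typescript
// Sketch of the flat-rate arithmetic costOfPrompt performs internally.
// FlatRates/FlatUsage are illustrative assumptions, not Helicone's types.
type FlatRates = {
  promptToken: number;     // USD per fresh prompt token
  completionToken: number; // USD per completion token
  cacheReadToken?: number; // USD per cached prompt token read
  cacheWriteToken?: number; // USD per cache-write token
};

type FlatUsage = {
  promptTokens: number;
  completionTokens: number;
  cacheReadTokens?: number;
  cacheWriteTokens?: number;
};

function flatCost(rates: FlatRates, usage: FlatUsage, multiple?: number): number {
  const raw =
    usage.promptTokens * rates.promptToken +
    usage.completionTokens * rates.completionToken +
    (usage.cacheReadTokens ?? 0) * (rates.cacheReadToken ?? rates.promptToken) +
    (usage.cacheWriteTokens ?? 0) * (rates.cacheWriteToken ?? rates.promptToken);
  // `multiple` scales the USD value and rounds it to an integer, so the
  // result can be stored at fixed precision (e.g. in ClickHouse).
  return multiple !== undefined ? Math.round(raw * multiple) : raw;
}
```

The key idea is that every token category is a simple count-times-rate product summed into one number, which is why the legacy function can only return a scalar rather than a breakdown.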

New: modelCostBreakdownFromRegistry (in packages/cost/costCalc.ts, lines 51-65, delegating to calculateModelCostBreakdown in packages/cost/models/calculate-cost.ts, lines 169-287) takes a ModelUsage object, provider, and model ID, looks up the model configuration from the new registry, and returns a detailed CostBreakdown object. It supports tiered pricing (rates that change at token count thresholds), per-modality breakdowns (image, audio, video, file), thinking token costs, web search costs, and request-level fees. Provider-specific threshold functions determine which token count drives tier selection (e.g., Anthropic uses total prompt length, Vertex uses input tokens, Google AI Studio uses input + cached).
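The tier-selection mechanic can be sketched as below. The PricingTier shape and the rate values are illustrative assumptions, not the registry's actual pricing types:

```typescript
// Sketch of threshold-based tier selection: rates change once the driving
// token count crosses a threshold. Types and values are illustrative.
type PricingTier = {
  upTo: number | null; // tier applies while drivingTokens <= upTo; null = unbounded
  inputRate: number;   // USD per input token within this tier
};

function rateForTier(tiers: PricingTier[], drivingTokens: number): number {
  // `drivingTokens` is whatever the provider's threshold function selects:
  // total prompt length (Anthropic), input tokens (Vertex),
  // or input + cached tokens (Google AI Studio).
  for (const tier of tiers) {
    if (tier.upTo === null || drivingTokens <= tier.upTo) return tier.inputRate;
  }
  throw new Error("no pricing tier matched");
}

// Illustrative two-tier table: cheaper rate up to 200k driving tokens.
const tiers: PricingTier[] = [
  { upTo: 200_000, inputRate: 1.25e-6 },
  { upTo: null, inputRate: 2.5e-6 },
];
```

Because the driving count differs per provider, the same usage object can land in different tiers depending on which provider served the request.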

Usage

Use costOfPrompt when working with the legacy cost pipeline or when you have pre-extracted token counts as individual numbers. Use modelCostBreakdownFromRegistry for new implementations that need tiered pricing support, detailed cost breakdowns, or modality-specific cost attribution.

Code Reference

Source Location

  • Repository: Helicone
  • File (legacy): packages/cost/index.ts (lines 61-153)
  • File (new entry): packages/cost/costCalc.ts (lines 51-65)
  • File (new core): packages/cost/models/calculate-cost.ts (lines 169-287)

Signature

// Legacy flat-rate calculator
export function costOfPrompt({
  provider,
  model,
  promptTokens,
  promptCacheWriteTokens,
  promptCacheReadTokens,
  promptAudioTokens,
  completionTokens,
  completionAudioTokens,
  promptCacheWrite5m,
  promptCacheWrite1h,
  images,
  perCall,
  multiple,
}: {
  provider: string;
  model: string;
  promptTokens: number;
  promptCacheWriteTokens: number;
  promptCacheReadTokens: number;
  promptAudioTokens: number;
  completionTokens: number;
  completionAudioTokens: number;
  promptCacheWrite5m?: number;
  promptCacheWrite1h?: number;
  images?: number;       // default: 1
  perCall?: number;      // default: 1
  multiple?: number;
}): number | null

// New tiered pricing calculator
export function modelCostBreakdownFromRegistry(params: {
  modelUsage: ModelUsage;
  provider: ModelProviderName;
  providerModelId: string;
  requestCount?: number;
}): CostBreakdown | null

Import

// Legacy
import { costOfPrompt } from "@helicone/cost";

// New
import { modelCostBreakdownFromRegistry } from "@helicone/cost";
import type { CostBreakdown, ModelUsage, ModelProviderName } from "@helicone/cost";

I/O Contract

Inputs (Legacy: costOfPrompt)

Name (type, required?): description

provider (string, required): Provider name or URL
model (string, required): Model identifier
promptTokens (number, required): Fresh (uncached) prompt tokens
promptCacheWriteTokens (number, required): Cache write tokens
promptCacheReadTokens (number, required): Cache read/hit tokens
promptAudioTokens (number, required): Input audio tokens
completionTokens (number, required): Output/completion tokens
completionAudioTokens (number, required): Output audio tokens
promptCacheWrite5m (number, optional): Anthropic 5-minute cache creation tokens
promptCacheWrite1h (number, optional): Anthropic 1-hour cache creation tokens
images (number, optional): Number of images (default: 1)
perCall (number, optional): Number of calls for per-call pricing (default: 1)
multiple (number, optional): If set, the result is multiplied by this value and rounded to an integer (used for ClickHouse precision)

Inputs (New: modelCostBreakdownFromRegistry)

Name (type, required?): description

modelUsage (ModelUsage, required): Normalized usage object from a usage processor
provider (ModelProviderName, required): Provider key (e.g., "openai", "anthropic", "vertex")
providerModelId (string, required): The provider-specific model identifier
requestCount (number, optional): Number of requests for per-request pricing (default: 1)

Outputs (Legacy)

return (number | null): Total cost in USD, or null if the model/provider is not found in the cost registry

Outputs (New: CostBreakdown)

Field (type): description

inputCost (number): Cost of input/prompt tokens
outputCost (number): Cost of output/completion tokens
cachedInputCost (number): Cost of cached input tokens
cacheWrite5mCost (number): Cost of Anthropic 5-minute cache creation
cacheWrite1hCost (number): Cost of Anthropic 1-hour cache creation
thinkingCost (number): Cost of thinking/reasoning tokens
image (ModalityCostBreakdown | undefined): Image modality costs (inputCost, cachedInputCost, outputCost)
audio (ModalityCostBreakdown | undefined): Audio modality costs
video (ModalityCostBreakdown | undefined): Video modality costs
file (ModalityCostBreakdown | undefined): File modality costs
webSearchCost (number): Cost of web search operations
requestCost (number): Fixed per-request cost
totalCost (number): Sum of all cost components (or provider-reported cost, if available)
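How totalCost aggregates the other fields can be sketched as below; the summing helper is an illustrative assumption (per the table, the real calculator may instead use a provider-reported cost when one is available):

```typescript
// Sketch: recomputing totalCost from the CostBreakdown fields listed above.
// Illustrative helper, not the package's actual aggregation code.
interface ModalityCostBreakdown {
  inputCost: number;
  cachedInputCost: number;
  outputCost: number;
}

interface BreakdownFields {
  inputCost: number;
  outputCost: number;
  cachedInputCost: number;
  cacheWrite5mCost: number;
  cacheWrite1hCost: number;
  thinkingCost: number;
  webSearchCost: number;
  requestCost: number;
  image?: ModalityCostBreakdown;
  audio?: ModalityCostBreakdown;
  video?: ModalityCostBreakdown;
  file?: ModalityCostBreakdown;
}

function sumBreakdown(b: BreakdownFields): number {
  // Each optional modality contributes its own input/cached/output costs.
  const modality = (m?: ModalityCostBreakdown) =>
    m ? m.inputCost + m.cachedInputCost + m.outputCost : 0;
  return (
    b.inputCost + b.outputCost + b.cachedInputCost +
    b.cacheWrite5mCost + b.cacheWrite1hCost + b.thinkingCost +
    b.webSearchCost + b.requestCost +
    modality(b.image) + modality(b.audio) + modality(b.video) + modality(b.file)
  );
}
```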

Usage Examples

Basic Usage (Legacy)

import { costOfPrompt } from "@helicone/cost";

const cost = costOfPrompt({
  provider: "OPENAI",
  model: "gpt-4o",
  promptTokens: 1000,
  promptCacheWriteTokens: 0,
  promptCacheReadTokens: 0,
  promptAudioTokens: 0,
  completionTokens: 500,
  completionAudioTokens: 0,
});

if (cost !== null) {
  console.log(`Total cost: $${cost.toFixed(6)}`);
}

Detailed Breakdown (New)

import { modelCostBreakdownFromRegistry, getUsageProcessor } from "@helicone/cost";
import type { ModelProviderName } from "@helicone/cost";

const provider: ModelProviderName = "anthropic";
const processor = getUsageProcessor(provider);

if (processor) {
  const usageResult = await processor.parse({
    responseBody: JSON.stringify(response),
    model: "claude-sonnet-4-20250514",
    isStream: false,
  });

  if (usageResult.data) {
    const breakdown = modelCostBreakdownFromRegistry({
      modelUsage: usageResult.data,
      provider,
      providerModelId: "claude-sonnet-4-20250514",
    });

    if (breakdown) {
      console.log("Input cost:", breakdown.inputCost);
      console.log("Output cost:", breakdown.outputCost);
      console.log("Cached input cost:", breakdown.cachedInputCost);
      console.log("Thinking cost:", breakdown.thinkingCost);
      console.log("Total cost:", breakdown.totalCost);
    }
  }
}

With Cache Tokens (Legacy)

const cost = costOfPrompt({
  provider: "ANTHROPIC",
  model: "claude-sonnet-4-20250514",
  promptTokens: 500,
  promptCacheWriteTokens: 2000,
  promptCacheReadTokens: 1500,
  promptAudioTokens: 0,
  completionTokens: 300,
  completionAudioTokens: 0,
  promptCacheWrite5m: 1000,
  promptCacheWrite1h: 500,
});
// Cache write tokens are split: 1000 at 5m rate, 500 at 1h rate,
// remaining 500 (2000 - 1000 - 500) at base cache write rate
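The split in the comment above works out as follows; the rate constants are illustrative assumptions, not Anthropic's actual prices:

```typescript
// Worked version of the cache-write split from the example above.
// Rate values are illustrative, not real Anthropic pricing.
const promptCacheWriteTokens = 2000;
const promptCacheWrite5m = 1000;
const promptCacheWrite1h = 500;

// Subtract the 5m/1h buckets so those tokens are not double-counted.
const baseCacheWrites =
  promptCacheWriteTokens - promptCacheWrite5m - promptCacheWrite1h; // 500

const rate5m = 3.75e-6;   // illustrative USD per 5-minute cache-write token
const rate1h = 6.0e-6;    // illustrative USD per 1-hour cache-write token
const rateBase = 3.75e-6; // illustrative USD per base cache-write token

const cacheWriteCost =
  promptCacheWrite5m * rate5m +
  promptCacheWrite1h * rate1h +
  baseCacheWrites * rateBase;
```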
