Heuristic:Helicone Helicone Provider URL Regex Priority
| Knowledge Sources | |
|---|---|
| Domains | Provider_Detection, URL_Routing |
| Last Updated | 2026-02-14 06:00 GMT |
Overview
Provider detection uses regex patterns matched sequentially against target URLs; more specific patterns must be listed before broader ones to prevent mismatches.
Description
When a request passes through the Helicone proxy, the system must identify which LLM provider is being targeted. This is done by matching the target URL against a list of regex patterns defined in `packages/cost/providers/mappings.ts`. The patterns are evaluated in definition order, and the first match wins. This means that more specific patterns (e.g., Azure OpenAI with multiple domain variants) must appear before broader patterns (e.g., generic googleapis.com) to prevent false matches.
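The first-match-wins behaviour described above can be sketched as follows. This is a minimal illustration, not the actual contents of `mappings.ts`: the `ProviderPattern` type, the `providerPatterns` array, the `detectProvider` function, and the provider labels are all hypothetical names; only the three regexes are taken from the code excerpt in the Reasoning section.

```typescript
// Hypothetical shape for one entry in the ordered pattern list.
type ProviderPattern = { provider: string; pattern: RegExp };

// Order matters: entries are tried top to bottom.
// The regexes come from the source excerpt; the labels are illustrative.
const providerPatterns: ProviderPattern[] = [
  { provider: "OPENAI", pattern: /^https:\/\/(us\.)?api\.openai\.com(\/|$)/ },
  { provider: "ANTHROPIC", pattern: /^https:\/\/api\.anthropic\.com/ },
  // Broad pattern, deliberately listed last in this sketch.
  { provider: "GOOGLE", pattern: /^https:\/\/(.*\.)?googleapis\.com/ },
];

// Returns the provider of the first pattern that matches, or undefined.
function detectProvider(url: string): string | undefined {
  for (const { provider, pattern } of providerPatterns) {
    if (pattern.test(url)) {
      return provider; // first match wins; later entries are never consulted
    }
  }
  return undefined;
}
```

Because the loop returns on the first hit, any entry placed below a broader pattern that also matches the same URL is never reached.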
Usage
Apply this heuristic when adding a new LLM provider to Helicone. New provider URL patterns must be placed in the correct priority position in the regex list. Incorrect ordering can cause requests to one provider to be attributed to another, leading to wrong cost calculations and model detection.
The Insight (Rule of Thumb)
- Action: When adding a new provider URL pattern, place it above any broader pattern that could also match the same URL.
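- Value: Matching is first-match-wins. The Azure pattern covers four domain variants (`openai.azure.com`, `azure-api.net`, `cognitiveservices.azure.com`, `services.ai.azure.com`), and the OpenAI pattern also accepts the US data residency host (`us.api.openai.com`).
- Trade-off: Appending a provider at the bottom of the list is safe when its domain is unique, but risky when its URL is a subdomain that an existing broader pattern already matches.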
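One way to check placement before adding a pattern is to list every entry that matches a sample URL, in list order. The sketch below assumes a hypothetical `Entry` type and `matchingEntries` helper, and the Vertex-specific regex is invented purely for illustration; only the broad `googleapis` regex comes from the source.

```typescript
// Hypothetical entry shape used only for this illustration.
type Entry = { name: string; pattern: RegExp };

// Returns the names of all entries matching the URL, preserving list order.
// The first element is what a first-match-wins lookup would pick.
function matchingEntries(url: string, entries: Entry[]): string[] {
  return entries.filter((e) => e.pattern.test(url)).map((e) => e.name);
}

// Broad pattern taken from the source excerpt.
const broadGoogle: Entry = {
  name: "googleapis",
  pattern: /^https:\/\/(.*\.)?googleapis\.com/,
};

// Hypothetical narrower pattern (not in the source), for demonstration only.
const vertexOnly: Entry = {
  name: "vertex (hypothetical)",
  pattern: /^https:\/\/[a-z0-9-]+-aiplatform\.googleapis\.com/,
};

const sampleUrl = "https://us-central1-aiplatform.googleapis.com/v1/projects/demo";

// Broad pattern first: the narrower entry is shadowed.
const wrongOrder = matchingEntries(sampleUrl, [broadGoogle, vertexOnly]);
// Narrower pattern first: it wins, and the broad one remains a fallback.
const rightOrder = matchingEntries(sampleUrl, [vertexOnly, broadGoogle]);
```

If the helper returns more than one name for a URL that belongs to the new provider, the new pattern needs to sit above every other name in the result.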
Known priority traps:
- AWS Bedrock and AWS Nova share the same regex (matching `bedrock-runtime.<region>.amazonaws.com`), so list order cannot separate them. They are distinguished by provider name, not URL.
- The Google APIs pattern (`googleapis.com`) is broad: it catches both Vertex AI and Google AI Studio.
- OpenAI pattern must handle US data residency: `^https:\/\/(us\.)?api\.openai\.com(\/|$)`.
- The local proxy pattern (`127.0.0.1`) must not shadow real provider patterns.
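The traps listed above can be checked directly against the regexes quoted in the Reasoning section. The sketch below copies those four regexes verbatim; the `trapChecks` object, the sample URLs, and the port `8787` are assumptions chosen for illustration.

```typescript
// Regexes copied from the source excerpt.
const openAiPattern = /^https:\/\/(us\.)?api\.openai\.com(\/|$)/;
const localProxyPattern = /^http:\/\/127\.0\.0\.1:\d+\/v\d+\/?$/;
const awsBedrock = /^https:\/\/bedrock-runtime\.[a-z0-9-]+\.amazonaws\.com\/.*/;
const awsNova = /^https:\/\/bedrock-runtime\.[a-z0-9-]+\.amazonaws\.com\/.*/;

// Each property is true when the pattern behaves as the list above describes.
const trapChecks = {
  // Both the standard host and the US data residency host are accepted.
  openAiStandard: openAiPattern.test("https://api.openai.com/v1/chat/completions"),
  openAiUsResidency: openAiPattern.test("https://us.api.openai.com/v1/chat/completions"),
  // The trailing (\/|$) rejects hosts that merely start with api.openai.com.
  openAiLookalikeRejected: !openAiPattern.test("https://api.openai.com.example.net/v1"),
  // Bedrock and Nova are indistinguishable by URL alone.
  bedrockNovaIdentical: awsBedrock.source === awsNova.source,
  // The local proxy pattern is anchored to http://127.0.0.1:<port>/v<N>.
  localProxyMatches: localProxyPattern.test("http://127.0.0.1:8787/v1"),
  localProxyIgnoresRealHost: !localProxyPattern.test("https://api.openai.com/v1"),
};

// True only if every check above passed.
const allTrapChecksPass = Object.values(trapChecks).every(Boolean);
```

Checks like these are cheap to keep next to the pattern list, so a reordering or a loosened anchor shows up as a failed check rather than as misattributed cost data.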
Reasoning
Code evidence from `packages/cost/providers/mappings.ts:31-75`:
// Matches both standard api.openai.com and US data residency us.api.openai.com
const openAiPattern = /^https:\/\/(us\.)?api\.openai\.com(\/|$)/;
const anthropicPattern = /^https:\/\/api\.anthropic\.com/;
export const azurePattern =
/^(https?:\/\/)?([^.]*\.)?(openai\.azure\.com|azure-api\.net|cognitiveservices\.azure\.com|services\.ai\.azure\.com)(\/.*)?$/;
const llamaApiPattern = /^https:\/\/api\.llama\.com/;
const nvidiaApiPattern = /^https:\/\/integrate\.api\.nvidia\.com/;
const localProxyPattern = /^http:\/\/127\.0\.0\.1:\d+\/v\d+\/?$/;
// Broad patterns that could shadow specific ones
const googleapis = /^https:\/\/(.*\.)?googleapis\.com/;
// AWS Bedrock and Nova share same regex
const awsBedrock = /^https:\/\/bedrock-runtime\.[a-z0-9-]+\.amazonaws\.com\/.*/;
const awsNova = /^https:\/\/bedrock-runtime\.[a-z0-9-]+\.amazonaws\.com\/.*/;
The comment `// Matches both standard api.openai.com and US data residency us.api.openai.com` indicates that the optional `(us\.)?` prefix was added deliberately. It suggests that US-residency requests were not being matched before the change, although the code itself does not record that history.