Workflow: Helicone LLM Request Proxy Logging
| Knowledge Sources | |
|---|---|
| Domains | LLM_Ops, Observability, Proxy |
| Last Updated | 2026-02-14 06:00 GMT |
Overview
End-to-end process for intercepting LLM API requests through the Helicone proxy, forwarding them to any supported provider, and logging the full request/response lifecycle into the observability pipeline.
Description
This workflow describes the core data flow of the Helicone platform. A client application sends an LLM request to a Helicone proxy worker (deployed as a Cloudflare Worker) instead of directly to the LLM provider. The proxy transparently forwards the request, captures the response, and asynchronously logs the full interaction through a queue into the Jawn backend. Jawn processes the log by normalizing the provider format, calculating costs, storing request/response bodies in object storage, and inserting analytics data into ClickHouse. The logged data is then viewable through the Next.js web dashboard with filtering, search, and analytics capabilities.
The workflow supports all major LLM providers (OpenAI, Anthropic, Google, AWS Bedrock, Azure, and 25+ others) through a single proxy endpoint. It handles both streaming and non-streaming responses, caching, rate limiting, session tracking, and custom property tagging via Helicone-specific HTTP headers.
Usage
Execute this workflow whenever you need to observe, monitor, or debug LLM API calls from any application. This is the primary integration path for Helicone users: change the base URL of your LLM client to point to the Helicone proxy (or AI Gateway), add your Helicone API key, and all requests are automatically intercepted and logged without any other code changes.
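As a sketch of this integration path, the snippet below builds the client configuration for routing OpenAI-style requests through the Helicone proxy. The `heliconeProxyConfig` helper is illustrative (not part of any SDK); the base URL and header names follow Helicone's documented conventions, and the key values are placeholders.

```typescript
// Minimal sketch: point an OpenAI-compatible client at the Helicone proxy.
// Only the base URL and one extra header change; the rest of the client
// code stays as-is.

interface ProxyConfig {
  baseURL: string;
  headers: Record<string, string>;
}

function heliconeProxyConfig(
  heliconeApiKey: string,
  providerApiKey: string
): ProxyConfig {
  return {
    // Requests go to the Helicone proxy instead of api.openai.com
    baseURL: "https://oai.helicone.ai/v1",
    headers: {
      // Provider auth is passed through to the upstream provider unchanged
      Authorization: `Bearer ${providerApiKey}`,
      // Helicone auth enables interception and logging (see Step 1)
      "Helicone-Auth": `Bearer ${heliconeApiKey}`,
    },
  };
}

const cfg = heliconeProxyConfig("<helicone-key>", "<provider-key>");
```

Any OpenAI SDK that accepts a custom base URL and default headers can consume this configuration directly.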
Execution Steps
Step 1: Client Request Interception
The client application sends an LLM API request to a Helicone proxy endpoint instead of directly to the provider. The Cloudflare Worker receives the request, wraps it in a RequestWrapper that extracts Helicone-specific headers (authentication, target URL, session ID, custom properties, cache control, rate limit settings), and determines which router to use based on the worker type and hostname.
Key considerations:
- The proxy supports multiple worker types: OPENAI_PROXY, ANTHROPIC_PROXY, GATEWAY_API, AI_GATEWAY_API, and HELICONE_API
- Authentication is provided via the Helicone-Auth header with a Bearer token
- For the universal gateway, the target provider is specified via the Helicone-Target-Url header
- The AI Gateway uses model-based routing with automatic provider selection
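The header extraction performed by the RequestWrapper can be sketched as follows. The field names and the `HeliconeContext` shape are illustrative assumptions, not the actual worker types; the pattern of pulling known `Helicone-*` headers plus collecting `Helicone-Property-*` prefixed headers into a property map reflects the behavior described above.

```typescript
// Illustrative sketch of Helicone header extraction on an incoming request.
interface HeliconeContext {
  auth?: string;
  targetUrl?: string;
  sessionId?: string;
  properties: Record<string, string>;
}

function extractHeliconeHeaders(headers: Headers): HeliconeContext {
  const properties: Record<string, string> = {};
  headers.forEach((value, key) => {
    // Header names are case-insensitive; Headers normalizes to lowercase
    if (key.startsWith("helicone-property-")) {
      properties[key.slice("helicone-property-".length)] = value;
    }
  });
  return {
    auth: headers.get("helicone-auth") ?? undefined,
    targetUrl: headers.get("helicone-target-url") ?? undefined,
    sessionId: headers.get("helicone-session-id") ?? undefined,
    properties,
  };
}
```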
Step 2: Request Routing and Validation
The router factory selects the appropriate handler based on the worker type. The handler validates the Helicone API key against the database, checks rate limits if configured, and looks up cached responses if caching is enabled. If a valid cached response exists, it is returned immediately without forwarding to the provider.
Key considerations:
- Rate limiting uses bucket-based token tracking
- Cache keys are derived from the request body hash and configured cache parameters
- The AI Gateway routing resolves model names to specific provider endpoints using the model registry
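A hypothetical cache-key derivation along these lines is shown below. The exact key composition in Helicone may differ; the principle shown — a stable hash over the request body plus the configured cache parameters — is what the step describes.

```typescript
import { createHash } from "node:crypto";

// Hypothetical cache-key derivation: hash of the request body combined
// with cache parameters, so different cache configurations (or an optional
// per-user seed) never collide.
function cacheKey(body: string, bucketMaxSize: number, seed = ""): string {
  return createHash("sha256")
    .update(seed)
    .update(String(bucketMaxSize))
    .update(body)
    .digest("hex");
}
```

Because the key is deterministic, two identical requests under the same cache settings resolve to the same stored response.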
Step 3: Provider Communication
The ProxyForwarder constructs the outbound request to the LLM provider. For the AI Gateway, this involves transforming the request format if the target provider uses a different API format (e.g., converting OpenAI format to Anthropic format). The request is forwarded to the provider, and the response is streamed back to the client in real time.
Key considerations:
- Streaming responses are processed chunk-by-chunk through body processors
- The transform layer handles cross-provider format conversion (OpenAI to Anthropic, OpenAI to Google, etc.)
- Response metadata (status code, headers, timing) is captured during forwarding
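The cross-provider transform can be illustrated with a simplified OpenAI-to-Anthropic conversion. The interfaces below capture only a fragment of each format, and the default `max_tokens` value is an assumption; the real transform layer handles many more fields.

```typescript
// Simplified cross-provider transform: OpenAI chat format to the shape of
// Anthropic's Messages API. Field handling is illustrative, not exhaustive.
interface OpenAIChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  max_tokens?: number;
}

interface AnthropicRequest {
  model: string;
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
  max_tokens: number;
}

function toAnthropic(req: OpenAIChatRequest): AnthropicRequest {
  // Anthropic takes the system prompt as a top-level field, not a message
  const system = req.messages.find((m) => m.role === "system")?.content;
  return {
    model: req.model,
    system,
    messages: req.messages
      .filter((m) => m.role !== "system")
      .map((m) => ({
        role: m.role as "user" | "assistant",
        content: m.content,
      })),
    // Anthropic requires max_tokens; fall back to a default when absent
    max_tokens: req.max_tokens ?? 1024,
  };
}
```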
Step 4: Asynchronous Log Queuing
After the response is fully delivered to the client, the worker asynchronously queues a log message containing the full request/response data. The message is sent to an Upstash Redis queue (or Kafka in some deployments) with the complete request body, response body, timing metadata, Helicone headers, and authentication context.
Key considerations:
- Logging is asynchronous to avoid adding latency to the client response
- The message format includes request/response bodies, timing data, and authentication context
- Failed log deliveries are retried with exponential backoff
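The retry behavior can be sketched as below. The `enqueue` callback stands in for the actual Redis or Kafka producer call, and the attempt count and base delay are illustrative defaults, not Helicone's configured values.

```typescript
// Sketch of fire-and-forget log delivery with exponential backoff.
// Failure here never affects the response already sent to the client.
async function queueLogWithRetry(
  enqueue: (msg: object) => Promise<void>,
  msg: object,
  maxAttempts = 3,
  baseDelayMs = 100
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await enqueue(msg);
      return true;
    } catch {
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  return false; // exhausted retries; surface to error monitoring
}
```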
Step 5: Backend Log Processing
The Jawn backend (Express.js server) consumes messages from the queue and processes each log entry. Processing involves: authenticating the organization, extracting token usage from the provider-specific response format using the appropriate UsageProcessor, normalizing the response using the LLM Mapper, calculating the request cost using the cost registry, and storing the full request/response bodies in MinIO (S3-compatible object storage).
Key considerations:
- Each provider has a dedicated UsageProcessor for extracting token counts from different response formats
- The LLM Mapper normalizes responses into a unified LlmSchema for consistent storage and display
- Cost calculation uses the model registry with tiered pricing and cache token multipliers
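A minimal cost calculation in this spirit is shown below. The `ModelPricing` shape, rates, and the single cache-read multiplier are simplifying assumptions; real registry entries carry tiered pricing and more token categories.

```typescript
// Hypothetical per-model pricing entry; real registry entries are richer.
interface ModelPricing {
  promptPerMTok: number; // USD per 1M prompt tokens
  completionPerMTok: number; // USD per 1M completion tokens
  cacheReadMultiplier: number; // e.g. 0.1 => cached reads cost 10% of prompt rate
}

function requestCostUSD(
  p: ModelPricing,
  promptTokens: number,
  cachedTokens: number,
  completionTokens: number
): number {
  // Cached prompt tokens are billed at a discounted rate
  const freshPrompt = promptTokens - cachedTokens;
  return (
    (freshPrompt * p.promptPerMTok +
      cachedTokens * p.promptPerMTok * p.cacheReadMultiplier +
      completionTokens * p.completionPerMTok) /
    1_000_000
  );
}
```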
Step 6: Analytics Storage
The processed log entry metadata is inserted into both PostgreSQL (for application data, user lookups, and API key management) and ClickHouse (for high-performance analytics queries). ClickHouse stores the denormalized request/response metadata table that powers the dashboard analytics, including model, provider, token counts, cost, latency, status, and custom properties.
Key considerations:
- ClickHouse uses a cost precision multiplier of 1 billion to avoid floating-point issues
- Custom properties are stored as JSONB and indexed for efficient filtering
- PostgreSQL handles relational data like organizations, API keys, and user accounts
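The cost precision multiplier mentioned above amounts to fixed-point storage: dollar amounts are scaled to integers before insertion and scaled back on read, so ClickHouse aggregations (sums over millions of rows) never accumulate floating-point drift. A sketch, with function names of our own choosing:

```typescript
// Costs are stored in ClickHouse as integers scaled by 1e9 (the cost
// precision multiplier), avoiding floating-point error in aggregations.
const COST_PRECISION_MULTIPLIER = 1_000_000_000;

function costToStored(usd: number): number {
  return Math.round(usd * COST_PRECISION_MULTIPLIER);
}

function storedToCost(stored: number): number {
  return stored / COST_PRECISION_MULTIPLIER;
}
```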
Step 7: Dashboard Visualization
Users access the Next.js web dashboard to view, filter, search, and analyze their logged LLM requests. The dashboard queries the Jawn API, which compiles user-defined filters into ClickHouse SQL using the filter system. Results are displayed with the LLM Mapper for human-readable formatting of request/response content across all provider formats.
Key considerations:
- The filter system supports complex queries across request metadata, properties, sessions, and custom fields
- Dashboard exports are available in Excel format
- The playground allows re-running requests with modified parameters
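A toy version of the filter-to-SQL compilation is sketched below. The column names, operators, and `Filter` union are illustrative only; Helicone's actual filter system supports a far richer tree of conditions. Note the use of parameter placeholders rather than string interpolation, which keeps the generated clause safe from SQL injection.

```typescript
// Toy compilation of user-defined filters into a ClickHouse WHERE clause.
// Column names and operators are illustrative, not the real schema.
type Filter =
  | { column: "model" | "status" | "provider"; op: "=" | "!="; value: string }
  | { column: "latency_ms" | "cost"; op: ">" | "<"; value: number };

function compileFilters(filters: Filter[]): {
  sql: string;
  params: (string | number)[];
} {
  // An empty filter set matches everything
  if (filters.length === 0) return { sql: "1 = 1", params: [] };
  const clauses = filters.map((f) => `${f.column} ${f.op} ?`);
  return { sql: clauses.join(" AND "), params: filters.map((f) => f.value) };
}
```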