Workflow:Helicone Helicone LLM Response Normalization

From Leeroopedia
Knowledge Sources
Domains LLM_Ops, Data_Normalization, Multi_Provider
Last Updated 2026-02-14 06:00 GMT

Overview

End-to-end process for normalizing heterogeneous LLM provider request and response formats into a unified schema using the llm-mapper package, and for transforming requests between provider formats in the AI Gateway.

Description

This workflow describes how Helicone handles the diversity of LLM API formats across 20+ providers. The llm-mapper package provides two complementary subsystems: mappers that convert provider-specific responses into a standardized LlmSchema for storage and display, and transforms that convert between provider request/response formats to enable cross-provider routing in the AI Gateway.

The mapper subsystem supports OpenAI Chat Completions, Anthropic Messages, Google Gemini, DALL-E image generation, OpenAI Embeddings, OpenAI Realtime (WebSocket), OpenAI Assistants, OpenAI Responses API, Black Forest Labs (FLUX) image generation, Vercel AI SDK, Meta Llama, vector database operations, and custom tool/data events. Each mapper extracts messages, token usage, model information, and metadata into the unified LlmSchema type.

The transform subsystem enables the AI Gateway to route requests to any provider regardless of the client's request format. For example, a client sending an OpenAI-formatted request can be routed to Anthropic or Google by transforming the request format before forwarding, and transforming the response back before returning it to the client. Transforms support both streaming and non-streaming responses.

The package has two mapper generations: v1 uses imperative per-provider implementations, while v2 uses a declarative MapperBuilder/PathMapper system that defines field mappings through a fluent API.

Usage

This workflow executes automatically within the Helicone platform at multiple points: during log processing (to normalize responses for storage and display), in the AI Gateway (to transform requests between provider formats for cross-provider routing), and in the web dashboard (to render request/response content in a human-readable format regardless of the original provider).

Execution Steps

Step 1: Mapper Type Detection

When a request/response pair needs to be processed, the system first determines which mapper to use. The getMapperType utility examines the request URL path, target provider, and response structure to identify the provider and API type. The result is one of 20+ mapper types, including openai-chat, anthropic-chat, gemini-chat, openai-dalle, openai-embedding, openai-realtime, openai-assistant, openai-responses, black-forest-labs-image, vector-db, tool, and data.

Key considerations:

  • URL pattern matching is the primary detection method
  • Some providers share the OpenAI-compatible format and use the openai-chat mapper
  • The detection handles edge cases like custom Helicone endpoints for tool and data logging
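
The URL-first detection described above can be sketched roughly as follows. This is a minimal illustration, not the actual getMapperType implementation: the function name detectMapperType, the rule table, and the OpenAI-compatible fallback are assumptions, and the sketch looks only at the URL path (the real utility also considers the target provider and the response structure). The path patterns are the providers' public REST paths.

```typescript
// Subset of the mapper types named above; the real union is larger.
type MapperType =
  | "openai-chat"
  | "anthropic-chat"
  | "gemini-chat"
  | "openai-embedding"
  | "openai-responses";

interface DetectionRule {
  pattern: RegExp; // tested against the request URL path
  mapper: MapperType;
}

// Ordered rules: the first pattern that matches wins.
// The rule table itself is an assumption made for this sketch.
const DETECTION_RULES: DetectionRule[] = [
  { pattern: /\/v1\/messages$/, mapper: "anthropic-chat" },
  { pattern: /:(streamGenerateContent|generateContent)$/, mapper: "gemini-chat" },
  { pattern: /\/embeddings$/, mapper: "openai-embedding" },
  { pattern: /\/responses$/, mapper: "openai-responses" },
  { pattern: /\/chat\/completions$/, mapper: "openai-chat" },
];

// Hypothetical stand-in for getMapperType: URL path only, no response inspection.
function detectMapperType(path: string): MapperType {
  for (const rule of DETECTION_RULES) {
    if (rule.pattern.test(path)) return rule.mapper;
  }
  // Assumed fallback: unknown providers are treated as OpenAI-compatible.
  return "openai-chat";
}
```

Keeping the rules in an ordered table makes precedence explicit, which matters once several providers expose similar-looking paths.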

Step 2: Request Parsing

The selected mapper's request parser extracts structured information from the raw request body. This produces a normalized representation of the input including messages (with role, content, and media attachments), model name, temperature, max tokens, and other parameters. Each provider has its own request structure that must be correctly parsed.

Key considerations:

  • Anthropic uses a system prompt separate from messages, while OpenAI includes it as a system role message
  • Google Gemini uses a contents array with parts, different from OpenAI's messages array
  • Image and audio content must be extracted and tagged with appropriate media types
  • Tool/function call definitions are parsed into a unified tool representation
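
The structural differences listed above can be illustrated with a small text-only sketch. The type and function names (NormalizedMessage, fromOpenAI, fromAnthropic, fromGemini) are hypothetical and are not taken from llm-mapper; media attachments and tool definitions are omitted, and the request shapes are reduced to the fields needed to show where each provider puts the system prompt and the conversation turns.

```typescript
type Role = "system" | "user" | "assistant";

interface NormalizedMessage {
  role: Role;
  content: string;
}

// Minimal, text-only request shapes; real payloads carry many more fields.
interface OpenAIChatRequest {
  messages: { role: Role; content: string }[];
}

interface AnthropicRequest {
  system?: string; // system prompt lives outside the messages array
  messages: { role: "user" | "assistant"; content: string }[];
}

interface GeminiRequest {
  systemInstruction?: { parts: { text: string }[] };
  contents: { role: "user" | "model"; parts: { text: string }[] }[];
}

// OpenAI already carries the system prompt as a system-role message.
function fromOpenAI(req: OpenAIChatRequest): NormalizedMessage[] {
  return req.messages.map((m) => ({ role: m.role, content: m.content }));
}

// Anthropic: lift the separate system prompt into a leading system message.
function fromAnthropic(req: AnthropicRequest): NormalizedMessage[] {
  const out: NormalizedMessage[] = [];
  if (req.system) out.push({ role: "system", content: req.system });
  for (const m of req.messages) out.push({ role: m.role, content: m.content });
  return out;
}

// Gemini: join text parts and rename the "model" role to "assistant".
function fromGemini(req: GeminiRequest): NormalizedMessage[] {
  const joinParts = (parts: { text: string }[]): string =>
    parts.map((p) => p.text).join("");
  const out: NormalizedMessage[] = [];
  if (req.systemInstruction) {
    out.push({ role: "system", content: joinParts(req.systemInstruction.parts) });
  }
  for (const c of req.contents) {
    out.push({
      role: c.role === "model" ? "assistant" : "user",
      content: joinParts(c.parts),
    });
  }
  return out;
}
```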

Step 3: Response Parsing

The mapper's response parser processes the provider's response body to extract the output messages, token usage statistics, finish reason, and any error information. For streaming responses, the accumulated chunks are reassembled into a complete response before parsing.
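
As a rough illustration of the reassembly step, the sketch below merges an OpenAI-style SSE body into a single message. It assumes the common "data: {json}" line framing terminated by "data: [DONE]" and reads only the first choice; mergeSseChunks and the result shape are hypothetical names, not the package's API.

```typescript
// One OpenAI-style streaming chunk, reduced to the fields used here.
interface StreamChunk {
  choices?: { delta?: { content?: string }; finish_reason?: string | null }[];
}

interface MergedResponse {
  content: string;
  finishReason: string | null;
}

// Merge a raw SSE body ("data: {...}" lines, ending with "data: [DONE]").
function mergeSseChunks(raw: string): MergedResponse {
  const merged: MergedResponse = { content: "", finishReason: null };
  for (const line of raw.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines and other SSE fields
    const payload = line.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;

    let chunk: StreamChunk;
    try {
      chunk = JSON.parse(payload) as StreamChunk;
    } catch {
      continue; // tolerate a malformed or truncated chunk
    }

    const choice = chunk.choices ? chunk.choices[0] : undefined;
    if (!choice) continue;
    if (choice.delta && choice.delta.content) merged.content += choice.delta.content;
    if (choice.finish_reason) merged.finishReason = choice.finish_reason;
  }
  return merged;
}
```

Other providers frame their streams differently (Anthropic, for instance, uses named events), so a real merger would need one such routine per stream format.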

Key considerations:

  • Streaming responses store accumulated Server-Sent Event (SSE) chunks that must be merged
  • Anthropic returns usage in a top-level usage object with input_tokens/output_tokens
  • OpenAI returns usage in usage.prompt_tokens/completion_tokens
  • Google Gemini returns usage in usageMetadata with promptTokenCount/candidatesTokenCount
  • Error responses from each provider have different structures that must be normalized
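
The three usage layouts listed above can be reconciled with a small switch. The provider field names come from the list; NormalizedUsage, UsageProvider, and normalizeUsage are hypothetical names chosen for this sketch, and missing counts are assumed to default to zero.

```typescript
interface NormalizedUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

type UsageProvider = "openai" | "anthropic" | "gemini";

// Response bodies are treated as untyped JSON; only the usage fields are read.
type ResponseBody = Record<string, any>;

function normalizeUsage(provider: UsageProvider, body: ResponseBody): NormalizedUsage {
  let promptTokens = 0;
  let completionTokens = 0;

  switch (provider) {
    case "openai":
      promptTokens = body.usage?.prompt_tokens ?? 0;
      completionTokens = body.usage?.completion_tokens ?? 0;
      break;
    case "anthropic":
      promptTokens = body.usage?.input_tokens ?? 0;
      completionTokens = body.usage?.output_tokens ?? 0;
      break;
    case "gemini":
      promptTokens = body.usageMetadata?.promptTokenCount ?? 0;
      completionTokens = body.usageMetadata?.candidatesTokenCount ?? 0;
      break;
  }

  // Assumption: the total is recomputed rather than read from the provider.
  return { promptTokens, completionTokens, totalTokens: promptTokens + completionTokens };
}
```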

Step 4: Schema Mapping

The parsed request and response data are combined into the unified LlmSchema type. This schema provides a consistent structure regardless of the original provider: a messages array with standardized roles (user, assistant, system, tool), token usage counts, model identification, latency metrics, and metadata. The v2 MapperBuilder system uses declarative path mappings to perform this transformation, while v1 mappers use imperative code.

Key considerations:

  • The LlmSchema type includes: request messages, response messages, model, usage, error, and metadata
  • v2 PathMapper defines bidirectional mappings using dot-notation paths
  • Content types are normalized: text, image_url, tool_call, tool_result across all providers
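
To make the idea of bidirectional dot-notation mappings concrete, here is a minimal sketch. It does not reproduce the MapperBuilder or PathMapper API: getPath, setPath, toInternal, toExternal, and the OPENAI_PATHS table are hypothetical, and the sketch only copies values between paths without any per-field transformation.

```typescript
type Json = Record<string, unknown>;

// Read a value at a dot-notation path such as "usage.prompt_tokens".
function getPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Json)[key];
  }
  return current;
}

// Write a value at a dot-notation path, creating intermediate objects as needed.
function setPath(target: Json, path: string, value: unknown): void {
  const keys = path.split(".");
  const leaf = keys.pop() as string;
  let current = target;
  for (const key of keys) {
    if (typeof current[key] !== "object" || current[key] === null) {
      current[key] = {};
    }
    current = current[key] as Json;
  }
  current[leaf] = value;
}

// Declarative mapping: each pair is [provider path, unified-schema path].
type PathPair = [string, string];

const OPENAI_PATHS: PathPair[] = [
  ["model", "model"],
  ["usage.prompt_tokens", "usage.promptTokens"],
  ["usage.completion_tokens", "usage.completionTokens"],
];

// Provider format -> unified format.
function toInternal(external: Json, pairs: PathPair[]): Json {
  const out: Json = {};
  for (const [from, to] of pairs) {
    const value = getPath(external, from);
    if (value !== undefined) setPath(out, to, value);
  }
  return out;
}

// Unified format -> provider format, using the same table in reverse.
function toExternal(internal: Json, pairs: PathPair[]): Json {
  const out: Json = {};
  for (const [to, from] of pairs) {
    const value = getPath(internal, from);
    if (value !== undefined) setPath(out, to, value);
  }
  return out;
}
```

Because one table drives both directions, a field added to the mapping is available for reading and writing at once, which is the main appeal of a declarative approach over hand-written per-provider code.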

Step 5: Cross-Provider Transform

When the AI Gateway routes a request to a different provider than the client expects, the transform layer converts the request format. For example, converting an OpenAI Chat Completions request to an Anthropic Messages request involves restructuring the messages array, converting tool definitions, mapping response_format to Anthropic equivalents, and handling streaming protocol differences. The response is then transformed back to the client's expected format.

Key considerations:

  • Supported transforms: OpenAI to Anthropic, OpenAI to Google, and their reverses
  • Streaming transforms must convert between SSE chunk formats in real time, chunk by chunk, without buffering the full response
  • Tool/function calling schemas differ significantly between providers
  • The Responses API format requires a separate conversion layer to/from Chat Completions
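
A non-streaming, text-only version of the OpenAI-to-Anthropic request conversion might look like the sketch below. It covers only message restructuring and tool definitions; response_format mapping and streaming are left out. The function name openAIToAnthropic, the targetModel parameter, and the DEFAULT_MAX_TOKENS fallback are assumptions for illustration, while the field names follow the two providers' public request formats.

```typescript
interface OpenAITool {
  type: "function";
  function: { name: string; description?: string; parameters?: object };
}

interface OpenAIRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  max_tokens?: number;
  stream?: boolean;
  tools?: OpenAITool[];
}

interface AnthropicTool {
  name: string;
  description?: string;
  input_schema: object;
}

interface AnthropicMessagesRequest {
  model: string;
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
  max_tokens: number;
  stream?: boolean;
  tools?: AnthropicTool[];
}

// Anthropic requires max_tokens; this fallback value is an arbitrary assumption.
const DEFAULT_MAX_TOKENS = 1024;

function openAIToAnthropic(req: OpenAIRequest, targetModel: string): AnthropicMessagesRequest {
  const out: AnthropicMessagesRequest = {
    model: targetModel,
    messages: [],
    max_tokens: req.max_tokens ?? DEFAULT_MAX_TOKENS,
  };

  // System-role messages move to the top-level system field.
  const systemParts: string[] = [];
  for (const m of req.messages) {
    if (m.role === "system") systemParts.push(m.content);
    else out.messages.push({ role: m.role, content: m.content });
  }
  if (systemParts.length > 0) out.system = systemParts.join("\n");

  if (req.stream !== undefined) out.stream = req.stream;

  // Tool definitions: unwrap the "function" envelope; parameters becomes input_schema.
  if (req.tools) {
    out.tools = req.tools.map((t) => {
      const tool: AnthropicTool = {
        name: t.function.name,
        input_schema: t.function.parameters ?? { type: "object", properties: {} },
      };
      if (t.function.description !== undefined) tool.description = t.function.description;
      return tool;
    });
  }
  return out;
}
```

The response path would apply the inverse mapping so that the client still sees a Chat Completions-shaped result.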
