Overview
OpenAPI 3.0 specification defining the REST API contract for the Helicone AI Gateway, a Rust-based LLM proxy and router service.
Description
This JSON file is the machine-readable OpenAPI 3.0 specification for the AI Gateway API, served at https://ai-gateway.helicone.ai. The spec is derived from Zod schemas in the Rust-based AI Gateway service and defines three primary endpoints for chat completions, responses, and model listing. It describes the full request/response schemas including message types (developer, system, user, assistant, tool), tool definitions, structured output schemas, and streaming configuration. The specification is consumed by Mintlify to auto-generate interactive API documentation on the Helicone docs site.
Usage
Use this specification to understand the AI Gateway's REST API contract. It is referenced by docs/docs.json as one of two OpenAPI specs powering the auto-generated API reference documentation. Client SDKs and integration tools can also consume this spec for code generation.
Code Reference
Source Location
Spec Structure
| Section |
Description
|
info |
Title: "Helicone AI Gateway API", version 1.0.0
|
servers |
Base URL: https://ai-gateway.helicone.ai
|
paths |
Three endpoint definitions (see below)
|
Endpoints
| Method |
Path |
Summary
|
| POST |
/v1/chat/completions |
Create Chat Completion - OpenAI-compatible chat completions endpoint supporting messages, tools, structured outputs, streaming, and provider-specific parameters
|
| POST |
/v1/responses |
Create Response - Responses API endpoint for generating LLM responses
|
| GET |
/v1/models |
List Models - Retrieve available models from the AI Gateway
|
Request Schema: Chat Completions
The /v1/chat/completions endpoint accepts a rich request body with the following key properties:
{
"messages": [ ... ], // Array of message objects (developer, system, user, assistant, tool)
"model": "string", // Required model identifier
"temperature": 0.7, // Sampling temperature
"top_p": 1.0, // Nucleus sampling
"top_k": null, // Top-K sampling
"stream": false, // Enable streaming
"tools": [ ... ], // Tool/function definitions
"tool_choice": "auto", // Tool selection strategy
"response_format": { ... },// Structured output format (json_object, json_schema, text)
"metadata": { ... }, // Custom metadata
"service_tier": "auto", // Service tier (auto, default, flex, scale, priority)
"reasoning": { ... }, // Reasoning/thinking configuration
"max_tokens": 4096, // Maximum output tokens
"stop": [ ... ] // Stop sequences
}
Message Types
| Role |
Description |
Content Types
|
developer |
Developer/system instructions |
text, text array
|
system |
System prompt |
text, text array
|
user |
User input |
text, image_url, document
|
assistant |
Assistant response |
text, tool_call, refusal
|
tool |
Tool/function results |
text
|
I/O Contract
Inputs
| Name |
Type |
Required |
Description
|
| messages |
array |
Yes |
Array of message objects with role and content
|
| model |
string |
Yes |
Model identifier (e.g., "gpt-4o", "claude-3-sonnet")
|
| temperature |
number |
No |
Sampling temperature (0-2)
|
| top_p |
number |
No |
Nucleus sampling parameter
|
| stream |
boolean |
No |
Whether to stream the response
|
| tools |
array |
No |
Tool/function definitions for function calling
|
| response_format |
object |
No |
Structured output format specification
|
Outputs
| Name |
Type |
Description
|
| choices |
array |
Array of completion choices with message content
|
| usage |
object |
Token usage statistics (prompt_tokens, completion_tokens, total_tokens)
|
| model |
string |
Model used for the completion
|
| id |
string |
Unique completion identifier
|
Spec Details
- OpenAPI Version: 3.0.0
- Total Lines: 4,068
- Server:
https://ai-gateway.helicone.ai
- Format: JSON, derived from Zod schemas
- Consumer: Mintlify documentation system via
docs/docs.json
Related Pages
Page Connections
Double-click a node to navigate. Hold to expand connections.