Jump to content

Connect SuperML | Leeroopedia MCP: Equip your AI agents with best practices, code verification, and debugging knowledge. Powered by Leeroo — building Organizational Superintelligence. Contact us at founders@leeroo.com.

Implementation:Helicone Helicone AI Gateway OpenAPI Spec

From Leeroopedia
Revision as of 12:56, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Helicone_Helicone_AI_Gateway_OpenAPI_Spec.md)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Knowledge Sources
Domains API Specification, AI Gateway
Last Updated 2026-02-14 06:32 GMT

Overview

OpenAPI 3.0 specification defining the REST API contract for the Helicone AI Gateway, a Rust-based LLM proxy and router service.

Description

This JSON file is the machine-readable OpenAPI 3.0 specification for the AI Gateway API, served at https://ai-gateway.helicone.ai. The spec is derived from Zod schemas in the Rust-based AI Gateway service and defines three primary endpoints for chat completions, responses, and model listing. It describes the full request/response schemas including message types (developer, system, user, assistant, tool), tool definitions, structured output schemas, and streaming configuration. The specification is consumed by Mintlify to auto-generate interactive API documentation on the Helicone docs site.

Usage

Use this specification to understand the AI Gateway's REST API contract. It is referenced by docs/docs.json as one of two OpenAPI specs powering the auto-generated API reference documentation. Client SDKs and integration tools can also consume this spec for code generation.

Code Reference

Source Location

Spec Structure

Section Description
info Title: "Helicone AI Gateway API", version 1.0.0
servers Base URL: https://ai-gateway.helicone.ai
paths Three endpoint definitions (see below)

Endpoints

Method Path Summary
POST /v1/chat/completions Create Chat Completion - OpenAI-compatible chat completions endpoint supporting messages, tools, structured outputs, streaming, and provider-specific parameters
POST /v1/responses Create Response - Responses API endpoint for generating LLM responses
GET /v1/models List Models - Retrieve available models from the AI Gateway

Request Schema: Chat Completions

The /v1/chat/completions endpoint accepts a rich request body with the following key properties:

{
  "messages": [ ... ],       // Array of message objects (developer, system, user, assistant, tool)
  "model": "string",         // Required model identifier
  "temperature": 0.7,        // Sampling temperature
  "top_p": 1.0,              // Nucleus sampling
  "top_k": null,             // Top-K sampling
  "stream": false,           // Enable streaming
  "tools": [ ... ],          // Tool/function definitions
  "tool_choice": "auto",     // Tool selection strategy
  "response_format": { ... },// Structured output format (json_object, json_schema, text)
  "metadata": { ... },       // Custom metadata
  "service_tier": "auto",    // Service tier (auto, default, flex, scale, priority)
  "reasoning": { ... },      // Reasoning/thinking configuration
  "max_tokens": 4096,        // Maximum output tokens
  "stop": [ ... ]            // Stop sequences
}

Message Types

Role Description Content Types
developer Developer/system instructions text, text array
system System prompt text, text array
user User input text, image_url, document
assistant Assistant response text, tool_call, refusal
tool Tool/function results text

I/O Contract

Inputs

Name Type Required Description
messages array Yes Array of message objects with role and content
model string Yes Model identifier (e.g., "gpt-4o", "claude-3-sonnet")
temperature number No Sampling temperature (0-2)
top_p number No Nucleus sampling parameter
stream boolean No Whether to stream the response
tools array No Tool/function definitions for function calling
response_format object No Structured output format specification

Outputs

Name Type Description
choices array Array of completion choices with message content
usage object Token usage statistics (prompt_tokens, completion_tokens, total_tokens)
model string Model used for the completion
id string Unique completion identifier

Spec Details

  • OpenAPI Version: 3.0.0
  • Total Lines: 4,068
  • Server: https://ai-gateway.helicone.ai
  • Format: JSON, derived from Zod schemas
  • Consumer: Mintlify documentation system via docs/docs.json

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.
Principle
Implementation
Heuristic
Environment