Implementation: SeldonIO Seldon Core Open Inference Protocol V2 OpenAPI
| Knowledge Sources | Details |
|---|---|
| Domains | Inference, REST_API, ML_Serving |
| Last Updated | 2026-02-13 14:00 GMT |
Overview
OpenAPI 3.0.3 specification defining the V2 Inference Protocol REST endpoints for Seldon Core 2 inference servers.
Description
The Open Inference Protocol V2 OpenAPI specification is the formal contract for all REST interactions with Seldon Core 2 inference servers. It defines health check endpoints (server live, server ready, model ready), metadata retrieval endpoints (server metadata, model metadata), and inference endpoints (model inference with optional version pinning). The specification uses standard OpenAPI 3.0.3 and includes typed request/response schemas for InferenceRequest, InferenceResponse, MetadataServerResponse, MetadataModelResponse, and associated tensor types.
This specification implements the Open Inference Protocol (also known as the KServe V2 Inference Protocol), which provides a standardized interface across heterogeneous model servers such as MLServer and NVIDIA Triton.
Usage
Reference this specification when building HTTP clients for Seldon Core 2 inference servers, when validating request/response payloads, or when generating client SDKs for the inference data plane. The Seldon CLI also uses this spec internally when constructing inference requests.
Code Reference
Source Location
- Repository: SeldonIO_Seldon_core
- File: docs-gb/apis/inference/open-inference-protocol-v2.openapi.yaml
- Lines: 1-743
Signature
openapi: 3.0.3
info:
title: Data Plane
version: '2.0'
description: REST protocol to interact with inference servers.
paths:
/v2/health/live:
get:
operationId: server-live
/v2/health/ready:
get:
operationId: server-ready
/v2/models/{model_name}/ready:
get:
operationId: model-ready
/v2:
get:
operationId: server-metadata
/v2/models/{model_name}:
get:
operationId: model-metadata
/v2/models/{model_name}/versions/{model_version}:
get:
operationId: model-version-metadata
/v2/models/{model_name}/infer:
post:
operationId: model-inference
/v2/models/{model_name}/versions/{model_version}/infer:
post:
operationId: model-version-inference
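The operation IDs above map one-to-one onto URL path templates. As a minimal sketch (the helper name `v2_url` and the dictionary layout are illustrative, not part of the spec), the mapping can be captured like this:

```python
# Path templates keyed by the operationId values from the OpenAPI spec.
V2_PATHS = {
    "server-live": "/v2/health/live",
    "server-ready": "/v2/health/ready",
    "model-ready": "/v2/models/{model_name}/ready",
    "server-metadata": "/v2",
    "model-metadata": "/v2/models/{model_name}",
    "model-version-metadata": "/v2/models/{model_name}/versions/{model_version}",
    "model-inference": "/v2/models/{model_name}/infer",
    "model-version-inference": "/v2/models/{model_name}/versions/{model_version}/infer",
}

def v2_url(base_url: str, operation: str, **params: str) -> str:
    """Build a full URL for a V2 operation from its path template."""
    return base_url.rstrip("/") + V2_PATHS[operation].format(**params)
```

For example, `v2_url("http://localhost:9000", "model-inference", model_name="iris")` yields `http://localhost:9000/v2/models/iris/infer`.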
Import
# No code import; reference as OpenAPI spec file:
# docs-gb/apis/inference/open-inference-protocol-v2.openapi.yaml
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| id | string | No | Optional request identifier returned in response |
| inputs | array of RequestInput | Yes | Tensor inputs with name, shape, datatype, and data |
| outputs | array of RequestOutput | No | Optional filter for which outputs to return |
| parameters | Parameters | No | Optional key-value parameters (e.g., content_type, headers) |
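A request body following this contract can be assembled with a small helper; this is a sketch (the function name `build_inference_request` is illustrative), showing that only `inputs` is required while `id`, `outputs`, and `parameters` are omitted when unset:

```python
import json

def build_inference_request(inputs, request_id=None, outputs=None, parameters=None):
    """Assemble a V2 InferenceRequest body; only 'inputs' is required."""
    body = {"inputs": inputs}
    if request_id is not None:
        body["id"] = request_id
    if outputs is not None:
        body["outputs"] = outputs
    if parameters is not None:
        body["parameters"] = parameters
    return body

# One FP32 tensor with a flattened data list; the spec's tensor_data
# schema also permits nested lists.
payload = build_inference_request(
    inputs=[{"name": "predict", "shape": [1, 4], "datatype": "FP32",
             "data": [5.1, 3.5, 1.4, 0.2]}],
    request_id="req-1",
)
print(json.dumps(payload))
```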
Outputs
| Name | Type | Description |
|---|---|---|
| model_name | string | Name of the model that produced the response |
| model_version | string | Version of the model (optional) |
| id | string | Request identifier echoed back |
| outputs | array of ResponseOutput | Tensor outputs with name, shape, datatype, and data |
| parameters | Parameters | Optional response parameters |
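On the response side, clients typically look up an output tensor by name. A minimal sketch (the helper name `extract_output` and the sample response values are made up for illustration):

```python
def extract_output(response, name):
    """Return (shape, data) for the named tensor in an InferenceResponse."""
    for out in response.get("outputs", []):
        if out["name"] == name:
            return out["shape"], out["data"]
    raise KeyError(f"no output named {name!r}")

# Illustrative response shaped per the table above (values are invented).
resp = {"model_name": "iris", "id": "req-1",
        "outputs": [{"name": "predict", "shape": [1],
                     "datatype": "INT64", "data": [1]}]}
shape, data = extract_output(resp, "predict")
```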
Usage Examples
Health Check
# Check if server is live
curl -s http://localhost:9000/v2/health/live
# Check if server is ready
curl -s http://localhost:9000/v2/health/ready
# Check if a specific model is ready
curl -s http://localhost:9000/v2/models/iris/ready
Server and Model Metadata
# Get server metadata
curl -s http://localhost:9000/v2 | jq .
# Get model metadata
curl -s http://localhost:9000/v2/models/iris | jq .
Model Inference
# Inference request
curl -s -X POST http://localhost:9000/v2/models/iris/infer \
-H "Content-Type: application/json" \
-d '{
"inputs": [
{
"name": "predict",
"shape": [1, 4],
"datatype": "FP32",
"data": [[5.1, 3.5, 1.4, 0.2]]
}
]
}' | jq .
# Versioned inference request
curl -s -X POST http://localhost:9000/v2/models/iris/versions/1/infer \
-H "Content-Type: application/json" \
-d '{
"inputs": [
{
"name": "predict",
"shape": [1, 4],
"datatype": "FP32",
"data": [[5.1, 3.5, 1.4, 0.2]]
}
]
}' | jq .
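The curl calls above can be mirrored in a standard-library-only Python client. This is a sketch under stated assumptions: the function names are illustrative, the server is assumed reachable directly (e.g. http://localhost:9000), and a meshed Seldon deployment may require additional routing headers not shown here:

```python
import json
from urllib import request as urlrequest

def infer_path(model_name, version=None):
    """Build the inference path, optionally pinned to a model version."""
    if version is None:
        return f"/v2/models/{model_name}/infer"
    return f"/v2/models/{model_name}/versions/{version}/infer"

def infer(base_url, model_name, payload, version=None, timeout=5.0):
    """POST an InferenceRequest and return the parsed InferenceResponse."""
    req = urlrequest.Request(
        base_url.rstrip("/") + infer_path(model_name, version),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlrequest.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())
```

Using only `urllib` keeps the example dependency-free; in practice a client library such as `requests` or a generated SDK from this OpenAPI spec would serve the same role.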