
Implementation: SeldonIO Seldon Core Open Inference Protocol V2 OpenAPI

From Leeroopedia
Knowledge Sources
Domains Inference, REST_API, ML_Serving
Last Updated 2026-02-13 14:00 GMT

Overview

An OpenAPI 3.0.3 specification defining the V2 Inference Protocol REST endpoints for Seldon Core 2 inference servers.

Description

The Open Inference Protocol V2 OpenAPI specification is the formal contract for all REST interactions with Seldon Core 2 inference servers. It defines health check endpoints (server live, server ready, model ready), metadata retrieval endpoints (server metadata, model metadata), and inference endpoints (model inference with optional version pinning). The specification uses standard OpenAPI 3.0.3 and includes typed request/response schemas for InferenceRequest, InferenceResponse, MetadataServerResponse, MetadataModelResponse, and associated tensor types.

This specification implements the KServe V2 Inference Protocol (also known as the Open Inference Protocol), which provides a standardized interface across heterogeneous model servers such as MLServer and NVIDIA Triton.
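As a sketch of what the InferenceRequest schema implies, the hypothetical helper below (not part of the spec or any Seldon SDK) builds a request body and checks the basic invariant that the element count of `data` matches the product of `shape`:

```python
import math

def make_infer_request(name, shape, datatype, data, request_id=None):
    """Build a V2 InferenceRequest body and sanity-check the tensor shape."""
    flat = data
    # Flatten nested lists so the element count can be compared to the shape.
    while flat and isinstance(flat[0], list):
        flat = [x for row in flat for x in row]
    if len(flat) != math.prod(shape):
        raise ValueError("data length does not match shape")
    body = {"inputs": [{"name": name, "shape": shape,
                        "datatype": datatype, "data": data}]}
    if request_id is not None:
        body["id"] = request_id
    return body

req = make_infer_request("predict", [1, 4], "FP32", [[5.1, 3.5, 1.4, 0.2]])
```

The optional `id` field, when supplied, is echoed back in the InferenceResponse.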

Usage

Reference this specification when building HTTP clients that interact with Seldon Core 2 inference servers, when validating request/response payloads, or when generating client SDKs for the inference data plane. The spec is also used by the Seldon CLI internally for constructing inference requests.

Code Reference

Source Location

Signature

openapi: 3.0.3
info:
  title: Data Plane
  version: '2.0'
  description: REST protocol to interact with inference servers.

paths:
  /v2/health/live:
    get:
      operationId: server-live
  /v2/health/ready:
    get:
      operationId: server-ready
  /v2/models/{model_name}/ready:
    get:
      operationId: model-ready
  /v2:
    get:
      operationId: server-metadata
  /v2/models/{model_name}:
    get:
      operationId: model-metadata
  /v2/models/{model_name}/versions/{model_version}:
    get:
      operationId: model-version-metadata
  /v2/models/{model_name}/infer:
    post:
      operationId: model-inference
  /v2/models/{model_name}/versions/{model_version}/infer:
    post:
      operationId: model-version-inference

Import

# No code import; reference as OpenAPI spec file:
# docs-gb/apis/inference/open-inference-protocol-v2.openapi.yaml

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| id | string | No | Optional request identifier returned in the response |
| inputs | array of RequestInput | Yes | Tensor inputs with name, shape, datatype, and data |
| outputs | array of RequestOutput | No | Optional filter for which outputs to return |
| parameters | Parameters | No | Optional key-value parameters (e.g., content_type, headers) |

Outputs

| Name | Type | Description |
|---|---|---|
| model_name | string | Name of the model that produced the response |
| model_version | string | Version of the model (optional) |
| id | string | Request identifier echoed back |
| outputs | array of ResponseOutput | Tensor outputs with name, shape, datatype, and data |
| parameters | Parameters | Optional response parameters |
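On the response side, a client typically selects a ResponseOutput tensor by name. The sketch below is illustrative: `extract_output` is a hypothetical helper, and the `resp` dict is a hand-written example shaped like an InferenceResponse, not output captured from a real server.

```python
def extract_output(response, name):
    """Return the named ResponseOutput tensor from a V2 InferenceResponse body."""
    for out in response.get("outputs", []):
        if out["name"] == name:
            return out
    raise KeyError(f"no output named {name!r} from model "
                   f"{response.get('model_name')}")

resp = {
    "model_name": "iris",
    "model_version": "1",
    "id": "abc-123",
    "outputs": [
        {"name": "predict", "shape": [1], "datatype": "INT64", "data": [0]},
    ],
}
tensor = extract_output(resp, "predict")
```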

Usage Examples

Health Check

# Check if server is live
curl -s http://localhost:9000/v2/health/live

# Check if server is ready
curl -s http://localhost:9000/v2/health/ready

# Check if a specific model is ready
curl -s http://localhost:9000/v2/models/iris/ready
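In automation, these health endpoints are commonly polled until the server or model reports ready. A minimal polling sketch, with the probe injected as a callable (in practice it would issue a GET to `/v2/health/ready` or `/v2/models/{model_name}/ready` and return whether the status was 200):

```python
import time

def wait_until_ready(probe, attempts=5, delay=0.1):
    """Call `probe` until it returns True or the attempts run out.

    `probe` is any zero-argument callable returning a bool, e.g. a
    wrapper around GET /v2/health/ready that checks for HTTP 200.
    """
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```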

Server and Model Metadata

# Get server metadata
curl -s http://localhost:9000/v2 | jq .

# Get model metadata
curl -s http://localhost:9000/v2/models/iris | jq .

Model Inference

# Inference request
curl -s -X POST http://localhost:9000/v2/models/iris/infer \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      {
        "name": "predict",
        "shape": [1, 4],
        "datatype": "FP32",
        "data": [[5.1, 3.5, 1.4, 0.2]]
      }
    ]
  }' | jq .

# Versioned inference request
curl -s -X POST http://localhost:9000/v2/models/iris/versions/1/infer \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      {
        "name": "predict",
        "shape": [1, 4],
        "datatype": "FP32",
        "data": [[5.1, 3.5, 1.4, 0.2]]
      }
    ]
  }' | jq .
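The same requests can be issued from Python's standard library. This is a hedged sketch (the base URL, port, and model name are placeholders for whatever your deployment exposes), splitting out the path construction so the version-pinning rule is explicit:

```python
import json
from urllib import request as urlrequest

def infer_path(model_name, model_version=None):
    """Build the V2 inference path, pinning a version when one is given."""
    if model_version is not None:
        return f"/v2/models/{model_name}/versions/{model_version}/infer"
    return f"/v2/models/{model_name}/infer"

def infer(base_url, model_name, payload, model_version=None):
    """POST a V2 InferenceRequest body and return the decoded response."""
    req = urlrequest.Request(
        base_url + infer_path(model_name, model_version),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlrequest.urlopen(req) as resp:
        return json.load(resp)

# Example (assumes a server at localhost:9000 serving a model named "iris"):
# result = infer("http://localhost:9000", "iris",
#                {"inputs": [{"name": "predict", "shape": [1, 4],
#                             "datatype": "FP32",
#                             "data": [5.1, 3.5, 1.4, 0.2]}]})
```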
