
Implementation: SeldonIO Seldon Core Open Inference Protocol V2 OpenAPI

From Leeroopedia
Knowledge Sources
Domains Inference, REST_API, ML_Serving
Last Updated 2026-02-13 14:00 GMT

Overview

An OpenAPI 3.0.3 specification defining the V2 Inference Protocol REST endpoints for Seldon Core 2 inference servers.

Description

The Open Inference Protocol V2 OpenAPI specification is the formal contract for all REST interactions with Seldon Core 2 inference servers. It defines health check endpoints (server live, server ready, model ready), metadata retrieval endpoints (server metadata, model metadata), and inference endpoints (model inference with optional version pinning). The specification uses standard OpenAPI 3.0.3 and includes typed request/response schemas for InferenceRequest, InferenceResponse, MetadataServerResponse, MetadataModelResponse, and associated tensor types.

This specification implements the KServe V2 Inference Protocol (also known as the Open Inference Protocol), which provides a standardized interface across heterogeneous model servers such as MLServer and NVIDIA Triton.
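As a sketch of what the InferenceRequest schema implies, the hypothetical helper below (not part of the spec or any Seldon SDK) builds a request body and checks the basic invariant that the element count of `data` matches the product of `shape`:

```python
import math

def make_infer_request(name, shape, datatype, data, request_id=None):
    """Build a V2 InferenceRequest body and sanity-check the tensor shape."""
    flat = data
    # Flatten nested lists so the element count can be compared to the shape.
    while flat and isinstance(flat[0], list):
        flat = [x for row in flat for x in row]
    if len(flat) != math.prod(shape):
        raise ValueError("data length does not match shape")
    body = {"inputs": [{"name": name, "shape": shape,
                        "datatype": datatype, "data": data}]}
    if request_id is not None:
        body["id"] = request_id
    return body

req = make_infer_request("predict", [1, 4], "FP32", [[5.1, 3.5, 1.4, 0.2]])
```

The optional `id` field, when supplied, is echoed back in the InferenceResponse.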

Usage

Reference this specification when building HTTP clients that interact with Seldon Core 2 inference servers, when validating request/response payloads, or when generating client SDKs for the inference data plane. The spec is also used by the Seldon CLI internally for constructing inference requests.

Code Reference

Source Location

Signature

openapi: 3.0.3
info:
  title: Data Plane
  version: '2.0'
  description: REST protocol to interact with inference servers.

paths:
  /v2/health/live:
    get:
      operationId: server-live
  /v2/health/ready:
    get:
      operationId: server-ready
  /v2/models/{model_name}/ready:
    get:
      operationId: model-ready
  /v2:
    get:
      operationId: server-metadata
  /v2/models/{model_name}:
    get:
      operationId: model-metadata
  /v2/models/{model_name}/versions/{model_version}:
    get:
      operationId: model-version-metadata
  /v2/models/{model_name}/infer:
    post:
      operationId: model-inference
  /v2/models/{model_name}/versions/{model_version}/infer:
    post:
      operationId: model-version-inference

Import

# No code import; reference as OpenAPI spec file:
# docs-gb/apis/inference/open-inference-protocol-v2.openapi.yaml

I/O Contract

Inputs

| Name | Type | Required | Description |
|---|---|---|---|
| id | string | No | Optional request identifier returned in the response |
| inputs | array of RequestInput | Yes | Tensor inputs with name, shape, datatype, and data |
| outputs | array of RequestOutput | No | Optional filter for which outputs to return |
| parameters | Parameters | No | Optional key-value parameters (e.g., content_type, headers) |

Outputs

| Name | Type | Description |
|---|---|---|
| model_name | string | Name of the model that produced the response |
| model_version | string | Version of the model (optional) |
| id | string | Request identifier echoed back |
| outputs | array of ResponseOutput | Tensor outputs with name, shape, datatype, and data |
| parameters | Parameters | Optional response parameters |
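On the response side, a client typically selects a ResponseOutput tensor by name. The sketch below is illustrative: `extract_output` is a hypothetical helper, and the `resp` dict is a hand-written example shaped like an InferenceResponse, not output captured from a real server.

```python
def extract_output(response, name):
    """Return the named ResponseOutput tensor from a V2 InferenceResponse body."""
    for out in response.get("outputs", []):
        if out["name"] == name:
            return out
    raise KeyError(f"no output named {name!r} from model "
                   f"{response.get('model_name')}")

resp = {
    "model_name": "iris",
    "model_version": "1",
    "id": "abc-123",
    "outputs": [
        {"name": "predict", "shape": [1], "datatype": "INT64", "data": [0]},
    ],
}
tensor = extract_output(resp, "predict")
```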

Usage Examples

Health Check

# Check if server is live
curl -s http://localhost:9000/v2/health/live

# Check if server is ready
curl -s http://localhost:9000/v2/health/ready

# Check if a specific model is ready
curl -s http://localhost:9000/v2/models/iris/ready
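In automation, these health endpoints are commonly polled until the server or model reports ready. A minimal polling sketch, with the probe injected as a callable (in practice it would issue a GET to `/v2/health/ready` or `/v2/models/{model_name}/ready` and return whether the status was 200):

```python
import time

def wait_until_ready(probe, attempts=5, delay=0.1):
    """Call `probe` until it returns True or the attempts run out.

    `probe` is any zero-argument callable returning a bool, e.g. a
    wrapper around GET /v2/health/ready that checks for HTTP 200.
    """
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```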

Server and Model Metadata

# Get server metadata
curl -s http://localhost:9000/v2 | jq .

# Get model metadata
curl -s http://localhost:9000/v2/models/iris | jq .

Model Inference

# Inference request
curl -s -X POST http://localhost:9000/v2/models/iris/infer \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      {
        "name": "predict",
        "shape": [1, 4],
        "datatype": "FP32",
        "data": [[5.1, 3.5, 1.4, 0.2]]
      }
    ]
  }' | jq .

# Versioned inference request
curl -s -X POST http://localhost:9000/v2/models/iris/versions/1/infer \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [
      {
        "name": "predict",
        "shape": [1, 4],
        "datatype": "FP32",
        "data": [[5.1, 3.5, 1.4, 0.2]]
      }
    ]
  }' | jq .
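The same requests can be issued from Python's standard library. This is a hedged sketch (the base URL, port, and model name are placeholders for whatever your deployment exposes), splitting out the path construction so the version-pinning rule is explicit:

```python
import json
from urllib import request as urlrequest

def infer_path(model_name, model_version=None):
    """Build the V2 inference path, pinning a version when one is given."""
    if model_version is not None:
        return f"/v2/models/{model_name}/versions/{model_version}/infer"
    return f"/v2/models/{model_name}/infer"

def infer(base_url, model_name, payload, model_version=None):
    """POST a V2 InferenceRequest body and return the decoded response."""
    req = urlrequest.Request(
        base_url + infer_path(model_name, model_version),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlrequest.urlopen(req) as resp:
        return json.load(resp)

# Example (assumes a server at localhost:9000 serving a model named "iris"):
# result = infer("http://localhost:9000", "iris",
#                {"inputs": [{"name": "predict", "shape": [1, 4],
#                             "datatype": "FP32",
#                             "data": [5.1, 3.5, 1.4, 0.2]}]})
```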
