
Implementation:SeldonIO Seldon core Seldon Model Infer

From Leeroopedia
Property Value
Implementation Name Seldon_Model_Infer
Type External Tool Doc
Overview Concrete CLI tool for sending V2-protocol inference requests to Seldon Core 2 models.
Implements Principle SeldonIO_Seldon_core_V2_Inference_Protocol
Workflow Model_Deployment
Domains MLOps, Inference
Source docs-gb/cli/seldon_model_infer.md:L1-35
External Dependencies seldon CLI, curl, grpcurl, V2 inference protocol
Last Updated 2026-02-13 00:00 GMT

Description

The seldon model infer command is the primary CLI interface for sending inference requests to deployed Seldon Core 2 models. It formats and transmits V2 Inference Protocol payloads via REST or gRPC, routing requests through the Seldon Envoy proxy to the appropriate inference server. The command supports both single-shot and repeated inference (via the -i/--iterations flag), custom headers, and configurable inference modes.

Equivalent functionality can be accessed directly through curl (REST) or grpcurl (gRPC) for integration into custom tooling.

Code Reference

Source: docs-gb/cli/seldon_model_infer.md:L1-35

CLI Signature:

seldon model infer <modelName> '<V2_JSON_payload>' [--inference-mode rest|grpc] [--inference-host string]

Alternatives:

# REST via curl
curl -X POST http://<host>/v2/models/<name>/infer \
  -H "Content-Type: application/json" \
  -d '<V2_JSON_payload>'

# gRPC via grpcurl
grpcurl -d '<protobuf_JSON>' <host>:9000 inference.GRPCInferenceService/ModelInfer

Key Parameters

  • modelName (string, positional, required): Name of the deployed model to run inference against
  • data (string, positional, required): V2 JSON payload containing the inputs array
  • --inference-mode (string, default "rest"): Protocol mode, "rest" or "grpc"
  • --inference-host (string, default "0.0.0.0:9000"): Address of the Seldon inference endpoint
  • -i / --iterations (integer, default 1): Number of times to repeat the inference request
  • --header (string, repeatable, default none): Custom HTTP headers to include with the request
  • -h / --help (boolean, default false): Display help information for the command

I/O Contract

Inputs

Input Format Description
V2 JSON payload JSON string {"inputs": [{"name": str, "shape": list[int], "datatype": str, "data": list}]}

Input field definitions:

  • name: Identifier for the input tensor (model-specific, e.g., "predict", "input")
  • shape: Tensor dimensions as a list of integers (e.g., [1, 4] for a single sample with 4 features)
  • datatype: V2 data type string (FP32, FP64, INT64, BYTES, etc.)
  • data: The actual tensor data as a nested list matching the declared shape
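A common source of rejected requests is a mismatch between the declared shape and the number of data elements. The following is a minimal sketch of building and sanity-checking a V2 payload before sending it; build_v2_payload is a hypothetical helper for illustration, not part of the seldon CLI.

```python
import json
from math import prod

def build_v2_payload(name, shape, datatype, data):
    """Build a V2 inference payload string, checking that the flattened
    data has exactly as many elements as the declared shape implies."""
    flat = data
    # Flatten nested lists one level at a time to count scalar elements.
    while flat and isinstance(flat[0], list):
        flat = [x for row in flat for x in row]
    expected = prod(shape)
    if len(flat) != expected:
        raise ValueError(f"shape {shape} implies {expected} elements, got {len(flat)}")
    return json.dumps({"inputs": [{"name": name, "shape": shape,
                                   "datatype": datatype, "data": data}]})

payload = build_v2_payload("predict", [1, 4], "FP32", [[5.1, 3.5, 1.4, 0.2]])
```

The resulting string can be passed directly as the positional data argument of seldon model infer.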

Outputs

Output Format Description
V2 JSON response JSON string {"model_name": str, "model_version": str, "outputs": [{"name": str, "shape": list, "datatype": str, "data": list}]}

Output field definitions:

  • model_name: Name of the model that produced the prediction
  • model_version: Version of the model that produced the prediction
  • outputs: Array of output tensors with the same structure as inputs (name, shape, datatype, data)
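Since outputs mirror the input tensor structure, extracting predictions from a response is a small dictionary-building step. A minimal sketch, assuming the response body shown above; extract_outputs is a hypothetical helper, not part of the seldon CLI.

```python
import json

def extract_outputs(response_json):
    """Map each V2 output tensor name to its data list."""
    resp = json.loads(response_json)
    return {t["name"]: t["data"] for t in resp.get("outputs", [])}

resp = ('{"model_name": "iris", "model_version": "v0.1.0", '
        '"outputs": [{"name": "predict", "shape": [1], '
        '"datatype": "INT64", "data": [0]}]}')
predictions = extract_outputs(resp)
```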

Usage Examples

Basic REST Inference

seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'

Expected response:

{
  "model_name": "iris",
  "model_version": "v0.1.0",
  "outputs": [
    {
      "name": "predict",
      "shape": [1],
      "datatype": "INT64",
      "data": [0]
    }
  ]
}
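The same REST call can be issued from Python with only the standard library. This is a sketch under the assumption that the model is served at the CLI's default local endpoint (localhost:9000); the send step is left commented out so the request can be inspected without a running deployment.

```python
import json
import urllib.request

# Assumed endpoint: the local Seldon Envoy address the CLI targets by default.
HOST = "localhost:9000"
MODEL = "iris"

payload = {"inputs": [{"name": "predict", "shape": [1, 4],
                       "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}

req = urllib.request.Request(
    f"http://{HOST}/v2/models/{MODEL}/infer",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# With a deployed model, uncomment to send and print the V2 response:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```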

gRPC Inference

seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}' \
  --inference-mode grpc

Batch Inference with Multiple Samples

seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [3, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2], [6.2, 2.8, 4.8, 1.8], [7.0, 3.2, 4.7, 1.4]]}]}'

Repeated Inference (Load Testing)

seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}' \
  -i 100
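The -i flag simply re-sends the same request N times, which is useful for rough latency measurements. The loop it performs can be sketched as follows; repeat_infer is a hypothetical helper that accepts any zero-argument callable standing in for one inference call.

```python
import time

def repeat_infer(send, iterations):
    """Call `send` (a zero-arg callable performing one inference request)
    `iterations` times and return per-call latencies in seconds."""
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        send()
        latencies.append(time.perf_counter() - start)
    return latencies
```

For example, passing a callable that issues the curl-equivalent HTTP request with iterations=100 mirrors the -i 100 invocation above.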

Inference via curl

curl -X POST http://localhost:9000/v2/models/iris/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
