Implementation:SeldonIO Seldon core Seldon Model Infer
| Property | Value |
|---|---|
| Implementation Name | Seldon_Model_Infer |
| Type | External Tool Doc |
| Overview | Concrete CLI tool for sending V2-protocol inference requests to Seldon Core 2 models. |
| Implements Principle | SeldonIO_Seldon_core_V2_Inference_Protocol |
| Workflow | Model_Deployment |
| Domains | MLOps, Inference |
| Source | docs-gb/cli/seldon_model_infer.md:L1-35 |
| External Dependencies | seldon CLI, curl, grpcurl, V2 inference protocol |
| Last Updated | 2026-02-13 00:00 GMT |
Description
The seldon model infer command is the primary CLI interface for sending inference requests to deployed Seldon Core 2 models. It formats and transmits V2 Inference Protocol payloads over REST or gRPC, routing requests through the Seldon Envoy proxy to the appropriate inference server. The command supports single-shot and repeated inference (via the -i/--iterations flag), custom headers, and a configurable protocol mode.
Equivalent functionality can be accessed directly through curl (REST) or grpcurl (gRPC) for integration into custom tooling.
Code Reference
Source: docs-gb/cli/seldon_model_infer.md:L1-35
CLI Signature:
seldon model infer <modelName> '<V2_JSON_payload>' [--inference-mode rest|grpc] [--inference-host string]
Alternatives:
# REST via curl
curl -X POST http://<host>/v2/models/<name>/infer \
-H "Content-Type: application/json" \
-d '<V2_JSON_payload>'
# gRPC via grpcurl
grpcurl -d '<protobuf_JSON>' <host>:9000 inference.GRPCInferenceService/ModelInfer
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| modelName | string (positional) | (required) | Name of the deployed model to run inference against |
| data | string (positional) | (required) | V2 JSON payload containing the inputs array |
| --inference-mode | string | "rest" | Protocol mode: "rest" or "grpc" |
| --inference-host | string | "0.0.0.0:9000" | Address of the Seldon inference endpoint |
| -i / --iterations | integer | 1 | Number of times to repeat the inference request |
| --header | string (repeatable) | (none) | Custom HTTP headers to include with the request |
| -h / --help | boolean | false | Display help information for the command |
I/O Contract
Inputs
| Input | Format | Description |
|---|---|---|
| V2 JSON payload | JSON string | {"inputs": [{"name": str, "shape": list[int], "datatype": str, "data": list}]} |
Input field definitions:
- name: Identifier for the input tensor (model-specific, e.g., "predict", "input")
- shape: Tensor dimensions as a list of integers (e.g., [1, 4] for a single sample with 4 features)
- datatype: V2 data type string (FP32, FP64, INT64, BYTES, etc.)
- data: The actual tensor data as a nested list matching the declared shape
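A payload whose data does not match its declared shape will typically be rejected by the inference server, so a quick client-side check can catch mistakes before a request is sent. A minimal sketch in Python; the validate_input helper is illustrative, not part of the seldon CLI:

```python
import json
from functools import reduce
from operator import mul

def validate_input(tensor: dict) -> bool:
    """Return True if the flattened data length matches the declared shape.

    `tensor` is one entry of the V2 "inputs" array; this helper is
    an illustration, not part of the seldon CLI.
    """
    def flatten(x):
        if isinstance(x, list):
            for item in x:
                yield from flatten(item)
        else:
            yield x

    expected = reduce(mul, tensor["shape"], 1)        # product of dimensions
    actual = sum(1 for _ in flatten(tensor["data"]))  # count of scalars
    return expected == actual

payload = json.loads(
    '{"inputs": [{"name": "predict", "shape": [1, 4], '
    '"datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
)
ok = all(validate_input(t) for t in payload["inputs"])
```

A mismatched payload (e.g. shape [2, 4] with only four values) would fail this check.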
Outputs
| Output | Format | Description |
|---|---|---|
| V2 JSON response | JSON string | {"model_name": str, "model_version": str, "outputs": [{"name": str, "shape": list, "datatype": str, "data": list}]} |
Output field definitions:
- model_name: Name of the model that produced the prediction
- model_version: Version of the model that produced the prediction
- outputs: Array of output tensors with the same structure as inputs (name, shape, datatype, data)
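For scripting, the response can be parsed with any JSON library. A small Python sketch that maps each output tensor's name to its data; the extract_outputs helper is hypothetical, not part of Seldon tooling:

```python
import json

def extract_outputs(response_text: str) -> dict:
    """Map each output tensor's name to its data list.

    Hypothetical helper for working with V2 JSON responses;
    not part of the seldon CLI or Seldon Core itself.
    """
    response = json.loads(response_text)
    return {out["name"]: out["data"] for out in response.get("outputs", [])}

# Response shaped like the iris example in this document.
response_text = (
    '{"model_name": "iris", "model_version": "v0.1.0", '
    '"outputs": [{"name": "predict", "shape": [1], '
    '"datatype": "INT64", "data": [0]}]}'
)
predictions = extract_outputs(response_text)
```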
Usage Examples
Basic REST Inference
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
Expected response:
{
"model_name": "iris",
"model_version": "v0.1.0",
"outputs": [
{
"name": "predict",
"shape": [1],
"datatype": "INT64",
"data": [0]
}
]
}
gRPC Inference
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}' \
--inference-mode grpc
Batch Inference with Multiple Samples
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [3, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2], [6.2, 2.8, 4.8, 1.8], [7.0, 3.2, 4.7, 1.4]]}]}'
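When batching, the first shape dimension is the number of samples and data holds one row per sample. A Python sketch of building such a payload; the batch_payload helper and its defaults ("predict", FP32) are assumptions taken from the iris examples, not universal values:

```python
def batch_payload(samples, name="predict", datatype="FP32"):
    """Wrap equal-length feature vectors into a V2 request payload.

    Hypothetical helper: assumes every sample has the same number
    of features; defaults mirror the iris examples above.
    """
    n_features = len(samples[0])
    # First shape dimension is the batch size, second the feature count.
    return {"inputs": [{
        "name": name,
        "shape": [len(samples), n_features],
        "datatype": datatype,
        "data": samples,
    }]}

payload = batch_payload([
    [5.1, 3.5, 1.4, 0.2],
    [6.2, 2.8, 4.8, 1.8],
    [7.0, 3.2, 4.7, 1.4],
])
```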
Repeated Inference (Load Testing)
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}' \
-i 100
Inference via curl
curl -X POST http://localhost:9000/v2/models/iris/infer \
-H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
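The same REST call can be issued from Python with only the standard library. A sketch assuming, as in the curl example, that the iris model is reachable at localhost:9000; the urlopen call is omitted so the snippet runs without a live server:

```python
import json
import urllib.request

# Host and model name mirror the curl example; adjust for your deployment.
payload = {"inputs": [{"name": "predict", "shape": [1, 4],
                       "datatype": "FP32",
                       "data": [[5.1, 3.5, 1.4, 0.2]]}]}
req = urllib.request.Request(
    "http://localhost:9000/v2/models/iris/infer",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request and return the raw
# V2 JSON response; it is left out here so this snippet runs offline.
```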
Knowledge Sources
- Repository: https://github.com/SeldonIO/seldon-core
- Documentation: https://docs.seldon.io/projects/seldon-core/en/v2/
Related Pages
- SeldonIO_Seldon_core_Seldon_Model_Infer implements SeldonIO_Seldon_core_V2_Inference_Protocol
- SeldonIO_Seldon_core_Seldon_Model_Status precedes SeldonIO_Seldon_core_Seldon_Model_Infer
- SeldonIO_Seldon_core_Seldon_Model_Load is required by SeldonIO_Seldon_core_Seldon_Model_Infer
- Environment:SeldonIO_Seldon_core_Kubernetes_Cluster_Environment
- Environment:SeldonIO_Seldon_core_Docker_Compose_Local_Environment