Implementation:SeldonIO Seldon core Seldon Model Infer
| Property | Value |
|---|---|
| Implementation Name | Seldon_Model_Infer |
| Type | External Tool Doc |
| Overview | Concrete CLI tool for sending V2-protocol inference requests to Seldon Core 2 models. |
| Implements Principle | SeldonIO_Seldon_core_V2_Inference_Protocol |
| Workflow | Model_Deployment |
| Domains | MLOps, Inference |
| Source | docs-gb/cli/seldon_model_infer.md:L1-35 |
| External Dependencies | seldon CLI, curl, grpcurl, V2 inference protocol |
| Last Updated | 2026-02-13 00:00 GMT |
Description
The seldon model infer command is the primary CLI interface for sending inference requests to deployed Seldon Core 2 models. It formats and transmits V2 Inference Protocol payloads over REST or gRPC, routing requests through the Seldon Envoy proxy to the appropriate inference server. The command supports single-shot and repeated inference (via the -i/--iterations flag), custom headers, and a configurable protocol mode.
Equivalent functionality can be accessed directly through curl (REST) or grpcurl (gRPC) for integration into custom tooling.
Code Reference
Source: docs-gb/cli/seldon_model_infer.md:L1-35
CLI Signature:
seldon model infer <modelName> '<V2_JSON_payload>' [--inference-mode rest|grpc] [--inference-host string]
Alternatives:
# REST via curl
curl -X POST http://<host>/v2/models/<name>/infer \
-H "Content-Type: application/json" \
-d '<V2_JSON_payload>'
# gRPC via grpcurl
grpcurl -d '<protobuf_JSON>' <host>:9000 inference.GRPCInferenceService/ModelInfer
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| modelName | string (positional) | (required) | Name of the deployed model to run inference against |
| data | string (positional) | (required) | V2 JSON payload containing the inputs array |
| --inference-mode | string | "rest" | Protocol mode: "rest" or "grpc" |
| --inference-host | string | "0.0.0.0:9000" | Address of the Seldon inference endpoint |
| -i / --iterations | integer | 1 | Number of times to repeat the inference request |
| --header | string (repeatable) | (none) | Custom HTTP headers to include with the request |
| -h / --help | boolean | false | Display help information for the command |
I/O Contract
Inputs
| Input | Format | Description |
|---|---|---|
| V2 JSON payload | JSON string | {"inputs": [{"name": str, "shape": list[int], "datatype": str, "data": list}]} |
Input field definitions:
- name: Identifier for the input tensor (model-specific, e.g., "predict", "input")
- shape: Tensor dimensions as a list of integers (e.g., [1, 4] for a single sample with 4 features)
- datatype: V2 data type string (FP32, FP64, INT64, BYTES, etc.)
- data: The actual tensor data as a nested list matching the declared shape
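A payload whose data does not match its declared shape will typically be rejected by the inference server, so a quick client-side check can catch mistakes before a request is sent. A minimal sketch in Python; the validate_input helper is illustrative, not part of the seldon CLI:

```python
import json
from functools import reduce
from operator import mul

def validate_input(tensor: dict) -> bool:
    """Return True if the flattened data length matches the declared shape.

    `tensor` is one entry of the V2 "inputs" array; this helper is
    an illustration, not part of the seldon CLI.
    """
    def flatten(x):
        if isinstance(x, list):
            for item in x:
                yield from flatten(item)
        else:
            yield x

    expected = reduce(mul, tensor["shape"], 1)        # product of dimensions
    actual = sum(1 for _ in flatten(tensor["data"]))  # count of scalars
    return expected == actual

payload = json.loads(
    '{"inputs": [{"name": "predict", "shape": [1, 4], '
    '"datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
)
ok = all(validate_input(t) for t in payload["inputs"])
```

A mismatched payload (e.g. shape [2, 4] with only four values) would fail this check.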
Outputs
| Output | Format | Description |
|---|---|---|
| V2 JSON response | JSON string | {"model_name": str, "model_version": str, "outputs": [{"name": str, "shape": list, "datatype": str, "data": list}]} |
Output field definitions:
- model_name: Name of the model that produced the prediction
- model_version: Version of the model that produced the prediction
- outputs: Array of output tensors with the same structure as inputs (name, shape, datatype, data)
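For scripting, the response can be parsed with any JSON library. A small Python sketch that maps each output tensor's name to its data; the extract_outputs helper is hypothetical, not part of Seldon tooling:

```python
import json

def extract_outputs(response_text: str) -> dict:
    """Map each output tensor's name to its data list.

    Hypothetical helper for working with V2 JSON responses;
    not part of the seldon CLI or Seldon Core itself.
    """
    response = json.loads(response_text)
    return {out["name"]: out["data"] for out in response.get("outputs", [])}

# Response shaped like the iris example in this document.
response_text = (
    '{"model_name": "iris", "model_version": "v0.1.0", '
    '"outputs": [{"name": "predict", "shape": [1], '
    '"datatype": "INT64", "data": [0]}]}'
)
predictions = extract_outputs(response_text)
```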
Usage Examples
Basic REST Inference
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
Expected response:
{
"model_name": "iris",
"model_version": "v0.1.0",
"outputs": [
{
"name": "predict",
"shape": [1],
"datatype": "INT64",
"data": [0]
}
]
}
gRPC Inference
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}' \
--inference-mode grpc
Batch Inference with Multiple Samples
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [3, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2], [6.2, 2.8, 4.8, 1.8], [7.0, 3.2, 4.7, 1.4]]}]}'
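When batching, the first shape dimension is the number of samples and data holds one row per sample. A Python sketch of building such a payload; the batch_payload helper and its defaults ("predict", FP32) are assumptions taken from the iris examples, not universal values:

```python
def batch_payload(samples, name="predict", datatype="FP32"):
    """Wrap equal-length feature vectors into a V2 request payload.

    Hypothetical helper: assumes every sample has the same number
    of features; defaults mirror the iris examples above.
    """
    n_features = len(samples[0])
    # First shape dimension is the batch size, second the feature count.
    return {"inputs": [{
        "name": name,
        "shape": [len(samples), n_features],
        "datatype": datatype,
        "data": samples,
    }]}

payload = batch_payload([
    [5.1, 3.5, 1.4, 0.2],
    [6.2, 2.8, 4.8, 1.8],
    [7.0, 3.2, 4.7, 1.4],
])
```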
Repeated Inference (Load Testing)
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}' \
-i 100
Inference via curl
curl -X POST http://localhost:9000/v2/models/iris/infer \
-H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[5.1, 3.5, 1.4, 0.2]]}]}'
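The same REST call can be issued from Python with only the standard library. A sketch assuming, as in the curl example, that the iris model is reachable at localhost:9000; the urlopen call is omitted so the snippet runs without a live server:

```python
import json
import urllib.request

# Host and model name mirror the curl example; adjust for your deployment.
payload = {"inputs": [{"name": "predict", "shape": [1, 4],
                       "datatype": "FP32",
                       "data": [[5.1, 3.5, 1.4, 0.2]]}]}
req = urllib.request.Request(
    "http://localhost:9000/v2/models/iris/infer",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request and return the raw
# V2 JSON response; it is left out here so this snippet runs offline.
```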
Knowledge Sources
- Repository: https://github.com/SeldonIO/seldon-core
- Documentation: https://docs.seldon.io/projects/seldon-core/en/v2/
Related Pages
- SeldonIO_Seldon_core_Seldon_Model_Infer implements SeldonIO_Seldon_core_V2_Inference_Protocol
- SeldonIO_Seldon_core_Seldon_Model_Status precedes SeldonIO_Seldon_core_Seldon_Model_Infer
- SeldonIO_Seldon_core_Seldon_Model_Load is required by SeldonIO_Seldon_core_Seldon_Model_Infer
- Environment:SeldonIO_Seldon_core_Kubernetes_Cluster_Environment
- Environment:SeldonIO_Seldon_core_Docker_Compose_Local_Environment