Implementation:SeldonIO Seldon core Seldon Pipeline Infer
| Field | Value |
|---|---|
| Implementation Name | Seldon Pipeline Infer |
| Type | External Tool Doc |
| Overview | Concrete CLI tool for sending V2-protocol inference requests to Seldon Core 2 pipelines. |
| Related Principle | SeldonIO_Seldon_core_Pipeline_Inference_Execution |
| Source | docs-gb/cli/seldon_pipeline_infer.md:L1-35 |
| Domains | MLOps, Inference |
| External Dependencies | seldon CLI, curl, grpcurl, V2 protocol, Kafka |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/) |
| Last Updated | 2026-02-13 00:00 GMT |
Description
The seldon pipeline infer command sends a V2 Inference Protocol request to a deployed Seldon Core 2 pipeline and returns the prediction output. It supports both REST and gRPC inference modes, repeated iterations for benchmarking, and custom headers. The data flows through the pipeline's DAG steps via Kafka, and the response contains the output tensors from the designated output steps.
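For REST mode, the request is a plain HTTP POST against the pipeline's V2 endpoint. The sketch below shows how that endpoint path is composed from the inference host and pipeline name; `pipeline_infer_url` is a hypothetical helper, and the host/pipeline values are taken from the examples on this page.

```python
# Sketch: composing the V2 REST inference URL for a Seldon Core 2 pipeline.
# `pipeline_infer_url` is illustrative, not part of the seldon CLI.

def pipeline_infer_url(host: str, pipeline: str) -> str:
    """Return the V2 inference endpoint for a named pipeline."""
    return f"http://{host}/v2/pipelines/{pipeline}/infer"

print(pipeline_infer_url("0.0.0.0:9000", "tfsimples"))
# → http://0.0.0.0:9000/v2/pipelines/tfsimples/infer
```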
Code Reference
CLI Signature
```shell
seldon pipeline infer <pipelineName> (data) [flags]
```
CLI Options
| Flag | Description | Default |
|---|---|---|
| `pipelineName` | Name of the pipeline to call (positional) | (required) |
| `data` | V2 JSON payload (positional) | (required unless `-f` used) |
| `-f, --file-path` | Inference payload file | (none) |
| `--inference-mode` | Inference mode: `rest` or `grpc` | `rest` |
| `--inference-host` | Seldon inference host | `0.0.0.0:9000` |
| `-i, --iterations` | How many times to run inference | `1` |
| `-t, --seconds` | Number of seconds to run inference | (none) |
| `--header` | Add a header (key=value); repeatable | (none) |
| `--show-headers` | Show request and response headers | `false` |
| `-r, --show-request` | Show the request payload | `false` |
| `-o, --show-response` | Show the response payload | `true` |
| `-s, --sticky-session` | Use sticky session from last inference | `false` |
Source: docs-gb/cli/seldon_pipeline_infer.md:L1-35
I/O Contract
Inputs
- V2 JSON payload: A JSON object following the V2 Inference Protocol schema:
```json
{
  "inputs": [
    {
      "name": "INPUT0",
      "data": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
      "datatype": "INT32",
      "shape": [1, 16]
    },
    {
      "name": "INPUT1",
      "data": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
      "datatype": "INT32",
      "shape": [1, 16]
    }
  ]
}
```
Each input tensor requires:
- `name` (string): Tensor name matching the first step's expected input.
- `data` (list): Flattened tensor data values.
- `datatype` (string): Data type (e.g., `INT32`, `FP32`, `BYTES`).
- `shape` (list of int): Tensor dimensions.
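A payload can be assembled and sanity-checked programmatically before it is passed as the `(data)` positional argument. This is a minimal sketch: `make_tensor` is a hypothetical helper (not part of the seldon CLI) that enforces the V2 invariant that the product of `shape` must equal the number of elements in `data`.

```python
import json
from math import prod

# Sketch: building a V2 payload and checking shape/data consistency.
# `make_tensor` is illustrative, not part of the seldon CLI or V2 spec.

def make_tensor(name, data, datatype, shape):
    if prod(shape) != len(data):
        raise ValueError(
            f"shape {shape} implies {prod(shape)} elements, got {len(data)}"
        )
    return {"name": name, "data": data, "datatype": datatype, "shape": shape}

payload = {"inputs": [
    make_tensor("INPUT0", list(range(1, 17)), "INT32", [1, 16]),
    make_tensor("INPUT1", list(range(1, 17)), "INT32", [1, 16]),
]}
print(json.dumps(payload))  # ready to pass as the (data) positional argument
```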
Outputs
- V2 JSON response with outputs from the final pipeline step(s):
```json
{
  "model_name": "",
  "outputs": [
    {
      "data": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32],
      "name": "OUTPUT0",
      "shape": [1, 16],
      "datatype": "INT32"
    },
    {
      "data": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
      "name": "OUTPUT1",
      "shape": [1, 16],
      "datatype": "INT32"
    }
  ]
}
```
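Consumers typically index the response tensors by name rather than by position. A minimal sketch, using an abbreviated version of the response above (field names follow the V2 Inference Protocol):

```python
import json

# Sketch: extracting output tensors from a V2 response body, keyed by name.
# The body here is an abbreviated form of the example response on this page.

response_body = """{
  "model_name": "",
  "outputs": [
    {"data": [2, 4, 6], "name": "OUTPUT0", "shape": [1, 3], "datatype": "INT32"},
    {"data": [0, 0, 0], "name": "OUTPUT1", "shape": [1, 3], "datatype": "INT32"}
  ]
}"""

outputs = {t["name"]: t["data"] for t in json.loads(response_body)["outputs"]}
print(outputs["OUTPUT0"])  # → [2, 4, 6]
```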
Usage Examples
Basic Pipeline Inference
```shell
seldon pipeline infer tfsimples \
  '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' | jq -M .
```
Inference via REST (curl)
```shell
curl -X POST http://0.0.0.0:9000/v2/pipelines/tfsimples/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' | jq -M .
```
Inference via gRPC
```shell
seldon pipeline infer tfsimples \
  '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' \
  --inference-mode grpc
```
Repeated Inference for Benchmarking
```shell
seldon pipeline infer tfsimples \
  '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' \
  -i 100
```
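If you time each iteration yourself (for example, by wrapping the REST call with a timer), the per-request latencies can be reduced to summary statistics. A sketch under that assumption; the sample values below are made up for illustration:

```python
from statistics import mean, quantiles

# Sketch: summarizing per-iteration latencies collected from a benchmark run.
# The sample values are fabricated for illustration only.

def summarize(latencies_ms):
    """Return mean, median (p50), and p99 latency from a list of samples."""
    p = quantiles(latencies_ms, n=100)  # 99 cut points
    return {"mean": mean(latencies_ms), "p50": p[49], "p99": p[98]}

samples = [12.1, 11.8, 13.0, 12.4, 55.2, 12.0, 11.9, 12.2, 12.7, 12.3]
print(summarize(samples))
```

Percentiles matter more than the mean here: a single slow request (the 55.2 ms outlier above) skews the mean but leaves the median untouched.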
Inspecting Pipeline Data Flow
```shell
# Trace data through all pipeline steps for debugging
seldon pipeline inspect tfsimples
```
Inference from a File
```shell
seldon pipeline infer tfsimples -f ./payload.json
```
Related Pages
- SeldonIO_Seldon_core_Pipeline_Inference_Execution - implements - Principle of dataflow-based inference through multi-step pipelines.
- SeldonIO_Seldon_core_Seldon_Pipeline_Status - prerequisite - Pipeline must be confirmed ready before inference.
- SeldonIO_Seldon_core_Seldon_Pipeline_CRD - defines topology - The pipeline CRD determines how the inference request flows through steps.
- SeldonIO_Seldon_core_Seldon_Pipeline_Advanced_Routing - routing logic - Conditional routing patterns that affect inference flow.
- Environment:SeldonIO_Seldon_core_Kubernetes_Cluster_Environment
- Environment:SeldonIO_Seldon_core_Docker_Compose_Local_Environment
- Environment:SeldonIO_Seldon_core_Kafka_Messaging_Environment
- Heuristic:SeldonIO_Seldon_core_Kafka_Partition_Throughput_Tip