Implementation:SeldonIO Seldon core Seldon Pipeline Infer
| Field | Value |
|---|---|
| Implementation Name | Seldon Pipeline Infer |
| Type | External Tool Doc |
| Overview | Concrete CLI tool for sending V2-protocol inference requests to Seldon Core 2 pipelines. |
| Related Principle | SeldonIO_Seldon_core_Pipeline_Inference_Execution |
| Source | docs-gb/cli/seldon_pipeline_infer.md:L1-35 |
| Domains | MLOps, Inference |
| External Dependencies | seldon CLI, curl, grpcurl, V2 protocol, Kafka |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/) |
| Last Updated | 2026-02-13 00:00 GMT |
Description
The seldon pipeline infer command sends a V2 Inference Protocol request to a deployed Seldon Core 2 pipeline and returns the prediction output. It supports both REST and gRPC inference modes, repeated iterations for benchmarking, and custom headers. The data flows through the pipeline's DAG steps via Kafka, and the response contains the output tensors from the designated output steps.
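For REST mode, the request is a plain HTTP POST against the pipeline's V2 endpoint. The sketch below shows how that endpoint path is composed from the inference host and pipeline name; `pipeline_infer_url` is a hypothetical helper, and the host/pipeline values are taken from the examples on this page.

```python
# Sketch: composing the V2 REST inference URL for a Seldon Core 2 pipeline.
# `pipeline_infer_url` is illustrative, not part of the seldon CLI.

def pipeline_infer_url(host: str, pipeline: str) -> str:
    """Return the V2 inference endpoint for a named pipeline."""
    return f"http://{host}/v2/pipelines/{pipeline}/infer"

print(pipeline_infer_url("0.0.0.0:9000", "tfsimples"))
# → http://0.0.0.0:9000/v2/pipelines/tfsimples/infer
```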
Code Reference
CLI Signature
```shell
seldon pipeline infer <pipelineName> (data) [flags]
```
CLI Options
| Flag | Description | Default |
|---|---|---|
| `pipelineName` | Name of the pipeline to call (positional) | (required) |
| `data` | V2 JSON payload (positional) | (required unless `-f` used) |
| `-f, --file-path` | Inference payload file | (none) |
| `--inference-mode` | Inference mode: `rest` or `grpc` | `rest` |
| `--inference-host` | Seldon inference host | `0.0.0.0:9000` |
| `-i, --iterations` | How many times to run inference | `1` |
| `-t, --seconds` | Number of seconds to run inference | (none) |
| `--header` | Add a header (key=value); repeatable | (none) |
| `--show-headers` | Show request and response headers | `false` |
| `-r, --show-request` | Show the request payload | `false` |
| `-o, --show-response` | Show the response payload | `true` |
| `-s, --sticky-session` | Use sticky session from last inference | `false` |
Source: docs-gb/cli/seldon_pipeline_infer.md:L1-35
I/O Contract
Inputs
- V2 JSON payload: A JSON object following the V2 Inference Protocol schema:
```json
{
  "inputs": [
    {
      "name": "INPUT0",
      "data": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
      "datatype": "INT32",
      "shape": [1, 16]
    },
    {
      "name": "INPUT1",
      "data": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
      "datatype": "INT32",
      "shape": [1, 16]
    }
  ]
}
```
Each input tensor requires:
- `name` (string): Tensor name matching the first step's expected input.
- `data` (list): Flattened tensor data values.
- `datatype` (string): Data type (e.g., `INT32`, `FP32`, `BYTES`).
- `shape` (list of int): Tensor dimensions.
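A payload can be assembled and sanity-checked programmatically before it is passed as the `(data)` positional argument. This is a minimal sketch: `make_tensor` is a hypothetical helper (not part of the seldon CLI) that enforces the V2 invariant that the product of `shape` must equal the number of elements in `data`.

```python
import json
from math import prod

# Sketch: building a V2 payload and checking shape/data consistency.
# `make_tensor` is illustrative, not part of the seldon CLI or V2 spec.

def make_tensor(name, data, datatype, shape):
    if prod(shape) != len(data):
        raise ValueError(
            f"shape {shape} implies {prod(shape)} elements, got {len(data)}"
        )
    return {"name": name, "data": data, "datatype": datatype, "shape": shape}

payload = {"inputs": [
    make_tensor("INPUT0", list(range(1, 17)), "INT32", [1, 16]),
    make_tensor("INPUT1", list(range(1, 17)), "INT32", [1, 16]),
]}
print(json.dumps(payload))  # ready to pass as the (data) positional argument
```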
Outputs
- V2 JSON response with outputs from the final pipeline step(s):
```json
{
  "model_name": "",
  "outputs": [
    {
      "data": [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32],
      "name": "OUTPUT0",
      "shape": [1, 16],
      "datatype": "INT32"
    },
    {
      "data": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
      "name": "OUTPUT1",
      "shape": [1, 16],
      "datatype": "INT32"
    }
  ]
}
```
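Consumers typically index the response tensors by name rather than by position. A minimal sketch, using an abbreviated version of the response above (field names follow the V2 Inference Protocol):

```python
import json

# Sketch: extracting output tensors from a V2 response body, keyed by name.
# The body here is an abbreviated form of the example response on this page.

response_body = """{
  "model_name": "",
  "outputs": [
    {"data": [2, 4, 6], "name": "OUTPUT0", "shape": [1, 3], "datatype": "INT32"},
    {"data": [0, 0, 0], "name": "OUTPUT1", "shape": [1, 3], "datatype": "INT32"}
  ]
}"""

outputs = {t["name"]: t["data"] for t in json.loads(response_body)["outputs"]}
print(outputs["OUTPUT0"])  # → [2, 4, 6]
```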
Usage Examples
Basic Pipeline Inference
```shell
seldon pipeline infer tfsimples \
  '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' | jq -M .
```
Inference via REST (curl)
```shell
curl -X POST http://0.0.0.0:9000/v2/pipelines/tfsimples/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' | jq -M .
```
Inference via gRPC
```shell
seldon pipeline infer tfsimples \
  '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' \
  --inference-mode grpc
```
Repeated Inference for Benchmarking
```shell
seldon pipeline infer tfsimples \
  '{"inputs":[{"name":"INPUT0","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]},{"name":"INPUT1","data":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],"datatype":"INT32","shape":[1,16]}]}' \
  -i 100
```
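If you time each iteration yourself (for example, by wrapping the REST call with a timer), the per-request latencies can be reduced to summary statistics. A sketch under that assumption; the sample values below are made up for illustration:

```python
from statistics import mean, quantiles

# Sketch: summarizing per-iteration latencies collected from a benchmark run.
# The sample values are fabricated for illustration only.

def summarize(latencies_ms):
    """Return mean, median (p50), and p99 latency from a list of samples."""
    p = quantiles(latencies_ms, n=100)  # 99 cut points
    return {"mean": mean(latencies_ms), "p50": p[49], "p99": p[98]}

samples = [12.1, 11.8, 13.0, 12.4, 55.2, 12.0, 11.9, 12.2, 12.7, 12.3]
print(summarize(samples))
```

Percentiles matter more than the mean here: a single slow request (the 55.2 ms outlier above) skews the mean but leaves the median untouched.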
Inspecting Pipeline Data Flow
```shell
# Trace data through all pipeline steps for debugging
seldon pipeline inspect tfsimples
```
Inference from a File
```shell
seldon pipeline infer tfsimples -f ./payload.json
```
Related Pages
- SeldonIO_Seldon_core_Pipeline_Inference_Execution - implements - Principle of dataflow-based inference through multi-step pipelines.
- SeldonIO_Seldon_core_Seldon_Pipeline_Status - prerequisite - Pipeline must be confirmed ready before inference.
- SeldonIO_Seldon_core_Seldon_Pipeline_CRD - defines topology - The pipeline CRD determines how the inference request flows through steps.
- SeldonIO_Seldon_core_Seldon_Pipeline_Advanced_Routing - routing logic - Conditional routing patterns that affect inference flow.
- Environment:SeldonIO_Seldon_core_Kubernetes_Cluster_Environment
- Environment:SeldonIO_Seldon_core_Docker_Compose_Local_Environment
- Environment:SeldonIO_Seldon_core_Kafka_Messaging_Environment
- Heuristic:SeldonIO_Seldon_core_Kafka_Partition_Throughput_Tip