Implementation:SeldonIO Seldon core Seldon Model Infer With Headers

Field	Value
Type	External Tool Doc
Overview	Concrete CLI tool for monitoring experiment traffic distribution using inference with response headers in Seldon Core 2.
Domains	MLOps, Experimentation
Related Principle	SeldonIO_Seldon_core_Experiment_Traffic_Analysis
Source	`docs-gb/cli/seldon_model_infer.md:L1-35`, `samples/local-experiments.md:L130-230`
Knowledge Sources	Repo, Doc
Last Updated	2026-02-13 00:00 GMT

Description

This page documents the CLI commands for monitoring experiment traffic distribution in Seldon Core 2. The `seldon model infer` command sends V2 inference requests to a model endpoint. It can display response headers (including `x-seldon-route`), repeat a request many times to measure how traffic is distributed, and use sticky sessions to pin requests to a route.

Code Reference

CLI Signature

seldon model infer <modelName> '<data>' [--show-headers] [-i <iterations>] [-t <seconds>] [-s] [--header <key>=<value>]

Single Inference with Headers

seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
  --show-headers

Example response:

# Headers:
#   x-seldon-route: iris2_1
# Response:
{
  "model_name": "iris2_1",
  "outputs": [{"name": "predict", "shape": [1, 1], "datatype": "FP64", "data": [2]}]
}

Multi-Iteration Distribution Analysis

seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
  -i 100

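Example output (Go map notation: each entry is a route key such as `:iris_1:` followed by the number of requests it served):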

Success: map[:iris_1::50 :iris2_1::50]
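The summary line can be converted into per-candidate percentages with standard Unix tools. The following is a minimal sketch and not part of the seldon CLI: `route_shares` is a hypothetical helper name, and it assumes the summary keeps the `Success: map[...]` shape shown above.

```shell
# Hypothetical helper (not part of the seldon CLI): read a summary line such as
#   Success: map[:iris_1::50 :iris2_1::50]
# from stdin and print one "route count percent" row per candidate.
route_shares() {
  # Keep only the text between "map[" and "]", then put one entry per line.
  sed -n 's/.*map\[\(.*\)\].*/\1/p' | tr ' ' '\n' |
    awk -F: 'NF > 1 {
      count = $NF                  # the request count follows the last colon
      key = $0
      sub(/:[0-9]+$/, "", key)     # strip ":<count>" to recover the route key
      gsub(/^:|:$/, "", key)       # drop the colons wrapping the route name
      counts[key] = count; total += count; order[++n] = key
    }
    END {
      for (i = 1; i <= n; i++)
        printf "%s %d %.1f\n", order[i], counts[order[i]], 100 * counts[order[i]] / total
    }'
}
```

Piping the sample line above through `route_shares` prints `iris_1 50 50.0` and `iris2_1 50 50.0` on separate lines. Whether the CLI writes the summary to stdout or stderr is not stated here, so redirect as needed before piping.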

I/O Contract

Direction	Description
Inputs	V2 inference payload (JSON) and a target endpoint: either the experiment's default model name or `<experiment-name>.experiment`.
Outputs	V2 inference responses whose `model_name` field shows which candidate served the request, plus the response header `x-seldon-route` identifying the routed candidate. Multi-iteration mode prints traffic statistics (e.g., `"Success: map[:iris_1::50 :iris2_1::50]"`).

Key Parameters

Parameter	Description	Default	Required
`modelName` (positional)	Target model name (default model or `<experiment>.experiment`)	—	Yes
`data` (positional)	V2 inference payload as JSON string	—	Yes
`--show-headers`	Display response headers including `x-seldon-route`	`false`	No
`-i / --iterations`	Number of inference requests to send (for distribution analysis)	`1`	No
`-t / --seconds`	Run inferences for a specified duration (seconds)	—	No
`-s / --sticky-session`	Use a sticky session: reuse the route returned by an earlier inference so later requests reach the same candidate (only meaningful for experiments)	`false`	No
`--header`	Pass custom headers (e.g., `x-seldon-route=iris2_1` for route pinning)	—	No
`--inference-host`	Inference server address	`0.0.0.0:9000`	No

Usage Examples

Verify Traffic Split After Starting Experiment

# Run 100 iterations to verify 50/50 split
seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
  -i 100

# Expected output (approximately):
# Success: map[:iris_1::50 :iris2_1::50]
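Because the expected output is only approximate, a pass/fail check needs a tolerance rather than an exact match. The sketch below is one way to do that. `share_within` is a hypothetical helper, not a seldon command, and it assumes the summary line has the `Success: map[...]` form shown above.

```shell
# Hypothetical check (not part of the seldon CLI).
# Usage: <summary line on stdin> | share_within ROUTE WANT_PERCENT TOLERANCE
# Prints the observed share and exits 0 if it is within TOLERANCE points of
# WANT_PERCENT, 1 if it is not, and 2 if no counts could be parsed.
share_within() {
  sed -n 's/.*map\[\(.*\)\].*/\1/p' | tr ' ' '\n' |
    awk -F: -v route="$1" -v want="$2" -v tol="$3" '
      NF > 1 {
        total += $NF                 # the request count follows the last colon
        key = $0
        sub(/:[0-9]+$/, "", key)     # strip ":<count>"
        gsub(/^:|:$/, "", key)       # drop the colons wrapping the route name
        if (key == route) hits += $NF
      }
      END {
        if (total == 0) exit 2
        share = 100 * hits / total
        diff = share - want; if (diff < 0) diff = -diff
        printf "%s %.1f%%\n", route, share
        exit (diff <= tol ? 0 : 1)
      }'
}
```

For instance, a summary of `map[:iris_1::47 :iris2_1::53]` passes `share_within iris_1 50 5` and fails `share_within iris_1 50 2`. The tolerance is a judgment call that depends on the number of iterations.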

Sticky Session: Pin to a Specific Candidate

# First request: discover which candidate was assigned
seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
  --show-headers

# Subsequent requests: pin to a candidate by sending back the exact route value
# reported by --show-headers above (shown here as iris2_1)
seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
  --header x-seldon-route=iris2_1 \
  -s
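To confirm that pinning works, one option is to combine the sticky session with `-i` and check that the summary lists a single route. This is a sketch under assumptions: `single_route` is a hypothetical helper, and it assumes sticky runs print the same `Success: map[...]` summary as the multi-iteration example.

```shell
# Hypothetical check (not part of the seldon CLI): succeed only if the
# summary line read from stdin lists exactly one route.
single_route() {
  # Extract the map body; entries are separated by spaces, so the number of
  # whitespace-separated fields equals the number of routes.
  sed -n 's/.*map\[\(.*\)\].*/\1/p' |
    awk '{ n += NF } END { exit (n == 1 ? 0 : 1) }'
}

# Usage sketch (requires a running experiment; $payload holds the V2 JSON body):
#   seldon model infer iris "$payload" -s -i 50 2>&1 | single_route && echo "pinned"
```

The check fails both when several routes appear and when no summary line is found, so a failed run is not mistaken for a pinned one.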

Timed Distribution Analysis

# Run inferences for 30 seconds and report distribution
seldon model infer iris \
  '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
  -t 30

Using curl for Direct HTTP Inference

# Direct HTTP inference with header inspection
curl -v http://localhost:9000/v2/models/iris/infer \
  -H "Content-Type: application/json" \
  -d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'

# The response headers will include:
# x-seldon-route: iris_1  (or iris2_1)
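The same distribution analysis can be approximated with curl alone by collecting the route header from repeated requests. The helpers below are a sketch, not Seldon tooling: `route_from_headers` and `tally_routes` are hypothetical names, and the commented loop assumes the endpoint and payload from the curl example above.

```shell
# Hypothetical helper: print the value of the first x-seldon-route header found
# in raw HTTP response headers on stdin (header names are case-insensitive).
route_from_headers() {
  tr -d '\r' | awk '{
    name = $0; sub(/:.*/, "", name)            # text before the first colon
    if (tolower(name) == "x-seldon-route" && !seen++) {
      value = $0; sub(/^[^:]*:[ \t]*/, "", value)
      print value
    }
  }'
}

# Hypothetical helper: count how often each route (one per line on stdin) occurs.
tally_routes() {
  LC_ALL=C sort | uniq -c | awk '{ print $2, $1 }'
}

# Usage sketch (requires a running endpoint; $payload holds the V2 JSON body):
#   for i in $(seq 1 20); do
#     curl -s -o /dev/null -D - http://localhost:9000/v2/models/iris/infer \
#       -H "Content-Type: application/json" -d "$payload" | route_from_headers
#   done | tally_routes
```

Here `-D -` writes the response headers to stdout and `-o /dev/null` discards the body, so only the headers reach the parser.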

External Dependencies

seldon CLI — Command-line tool for inference and experiment interaction
curl — Alternative HTTP client for direct V2 inference protocol requests
V2 inference protocol — Open Inference Protocol for request/response format

Related Pages

SeldonIO_Seldon_core_Experiment_Traffic_Analysis — principle for this implementation — Monitoring which candidate model serves each request during an experiment using route headers and traffic distribution analysis.
SeldonIO_Seldon_core_Seldon_Experiment_Start — prerequisite — Starting the experiment that enables traffic routing.
SeldonIO_Seldon_core_Seldon_Experiment_Stop — next step — Stopping the experiment after analysis is complete.
