Implementation:SeldonIO Seldon core Seldon Model Infer With Headers
| Field | Value |
|---|---|
| Type | External Tool Doc |
| Overview | Concrete CLI tool for monitoring experiment traffic distribution using inference with response headers in Seldon Core 2. |
| Domains | MLOps, Experimentation |
| Related Principle | SeldonIO_Seldon_core_Experiment_Traffic_Analysis |
| Source | docs-gb/cli/seldon_model_infer.md:L1-35, samples/local-experiments.md:L130-230 |
| Knowledge Sources | Repo, Doc |
| Last Updated | 2026-02-13 00:00 GMT |
Description
This implementation provides the concrete CLI commands for monitoring experiment traffic distribution in Seldon Core 2. The seldon model infer command sends V2 inference requests to a model endpoint and can display response headers (including x-seldon-route), run multiple iterations for distribution analysis, and use sticky sessions for route pinning.
Code Reference
CLI Signature
seldon model infer <modelName> '<data>' [--show-headers] [-i <iterations>] [-t <seconds>] [-s] [--header <key>=<value>]
Single Inference with Headers
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
--show-headers
Example response:
# Headers:
# x-seldon-route: iris2_1
# Response:
{
"model_name": "iris2_1",
"outputs": [{"name": "predict", "shape": [1, 1], "datatype": "FP64", "data": [2]}]
}
Multi-Iteration Distribution Analysis
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
-i 100
Example output:
Success: map[:iris_1::50 :iris2_1::50]
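The summary line above can be converted into per-candidate counts and shares with a few lines of Python. This is a minimal sketch that assumes the output keeps the `:name::count` layout shown in the example; the function names `parse_route_summary` and `route_shares` are hypothetical helpers, not part of the seldon CLI.

```python
import re

# Assumption: each entry in the summary looks like ":<route>::<count>",
# as in the example "Success: map[:iris_1::50 :iris2_1::50]".
_ENTRY = re.compile(r":([^:\s\]]+)::(\d+)")


def parse_route_summary(line: str) -> dict[str, int]:
    """Return {route_name: request_count} parsed from one summary line."""
    return {name: int(count) for name, count in _ENTRY.findall(line)}


def route_shares(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw counts into fractions of the total number of requests."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    counts = parse_route_summary("Success: map[:iris_1::50 :iris2_1::50]")
    print(route_shares(counts))  # {'iris_1': 0.5, 'iris2_1': 0.5}
```

Parsing the counts rather than comparing the raw string keeps the check independent of the order in which the candidates are printed.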
I/O Contract
| Direction | Description |
|---|---|
| Inputs | V2 inference payload (JSON); the experiment's default model name or <experiment-name>.experiment as the target endpoint. |
| Outputs | V2 inference responses whose model_name field shows which candidate served the request. Response header x-seldon-route identifies the routed candidate. Multi-iteration mode produces traffic statistics (e.g., "Success: map[:iris_1::50 :iris2_1::50]"). |
Key Parameters
| Parameter | Description | Default | Required |
|---|---|---|---|
| modelName (positional) | Target model name (default model or <experiment>.experiment) | — | Yes |
| data (positional) | V2 inference payload as JSON string | — | Yes |
| --show-headers | Display response headers including x-seldon-route | false | No |
| -i / --iterations | Number of inference requests to send (for distribution analysis) | 1 | No |
| -t / --seconds | Run inferences for a specified duration (seconds) | — | No |
| -s / --sticky-session | Enable sticky session; reuse the route from the first response | false | No |
| --header | Pass custom headers (e.g., x-seldon-route=iris2_1 for route pinning) | — | No |
| --inference-host | Inference server address | 0.0.0.0:9000 | No |
Usage Examples
Verify Traffic Split After Starting Experiment
# Run 100 iterations to verify 50/50 split
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
-i 100
# Expected output (approximately):
# Success: map[:iris_1::50 :iris2_1::50]
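Because routing is probabilistic, a 50/50 experiment will rarely report exactly 50 and 50. One way to decide whether an observed count is "approximately" right is a normal approximation to the binomial distribution. The sketch below is an illustration under that assumption, not something the seldon CLI provides; `split_is_plausible` and the default threshold `z=3.0` are hypothetical choices.

```python
import math


def split_is_plausible(observed: int, total: int, expected_share: float,
                       z: float = 3.0) -> bool:
    """Check whether `observed` hits out of `total` requests is consistent
    with a candidate that should receive `expected_share` of the traffic.

    Uses the normal approximation to the binomial: the count is accepted
    when it lies within `z` standard deviations of the expected count.
    """
    expected = total * expected_share
    std_dev = math.sqrt(total * expected_share * (1.0 - expected_share))
    return abs(observed - expected) <= z * std_dev


if __name__ == "__main__":
    # With 100 requests and a 50% weight, the standard deviation is 5,
    # so counts between 35 and 65 are accepted at z = 3.
    print(split_is_plausible(58, 100, 0.5))  # True
    print(split_is_plausible(80, 100, 0.5))  # False
```

The approximation is rough for small request counts or very uneven weights; running more iterations narrows the relative tolerance.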
Sticky Session: Pin to a Specific Candidate
# First request: discover which candidate was assigned
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
--show-headers
# Subsequent requests: pin to iris2_1 using the route header
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
--header x-seldon-route=iris2_1 \
-s
Timed Distribution Analysis
# Run inferences for 30 seconds and report distribution
seldon model infer iris \
'{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}' \
-t 30
Using curl for Direct HTTP Inference
# Direct HTTP inference with header inspection
curl -v http://localhost:9000/v2/models/iris/infer \
-H "Content-Type: application/json" \
-d '{"inputs": [{"name": "predict", "shape": [1, 4], "datatype": "FP32", "data": [[1, 2, 3, 4]]}]}'
# The response headers will include:
# x-seldon-route: iris_1 (or iris2_1)
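The same header inspection can be scripted to tally routes over many requests. The following is a minimal sketch using only the Python standard library; it assumes the endpoint and payload from the curl example above, and the names `fetch_route` and `count_routes` are hypothetical. Requests that return no x-seldon-route header are counted under a placeholder key.

```python
import json
import urllib.request
from collections import Counter

# Assumptions taken from the curl example: local endpoint and iris payload.
URL = "http://localhost:9000/v2/models/iris/infer"
PAYLOAD = {
    "inputs": [
        {"name": "predict", "shape": [1, 4], "datatype": "FP32",
         "data": [[1, 2, 3, 4]]}
    ]
}


def fetch_route(url: str, payload: dict) -> str:
    """POST one V2 inference request and return its x-seldon-route header."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        # Header lookup is case-insensitive.
        return response.headers.get("x-seldon-route", "<missing>")


def count_routes(n: int, fetch=fetch_route, url: str = URL,
                 payload: dict = PAYLOAD) -> Counter:
    """Send `n` requests and count how often each route served them."""
    return Counter(fetch(url, payload) for _ in range(n))
```

With an experiment running, `count_routes(100)` would return a `Counter` keyed by route name, comparable to the summary printed by `seldon model infer -i 100`. The `fetch` parameter allows a stub to be substituted when no inference server is available.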
External Dependencies
- seldon CLI — Command-line tool for inference and experiment interaction
- curl — Alternative HTTP client for direct V2 inference protocol requests
- V2 inference protocol — Open Inference Protocol for request/response format
Related Pages
- SeldonIO_Seldon_core_Experiment_Traffic_Analysis — principle for this implementation — Monitoring which candidate model serves each request during an experiment using route headers and traffic distribution analysis.
- SeldonIO_Seldon_core_Seldon_Experiment_Start — prerequisite — Starting the experiment that enables traffic routing.
- SeldonIO_Seldon_core_Seldon_Experiment_Stop — next step — Stopping the experiment after analysis is complete.