Principle:SeldonIO Seldon core Experiment Traffic Analysis
| Field | Value |
|---|---|
| Overview | Monitoring which candidate model serves each request during an experiment using route headers and traffic distribution analysis. |
| Domains | MLOps, Experimentation |
| Related Implementation | SeldonIO_Seldon_core_Seldon_Model_Infer_With_Headers |
| Last Updated | 2026-02-13 00:00 GMT |
Description
During an active experiment, each inference response includes an x-seldon-route header indicating which candidate served the request. Running multiple iterations produces traffic distribution statistics showing the actual split. Sticky sessions (via the x-seldon-route header sent as a request header) can pin subsequent requests to the same candidate.
Experiment traffic analysis involves three key activities:
- Per-request routing identification: Each response includes the x-seldon-route header and the model_name field in the V2 response body, both identifying which candidate processed the request.
- Distribution validation: By running multiple inference iterations, the observed traffic split can be compared against the configured weights to verify correct routing behavior.
- Sticky session testing: Passing the x-seldon-route value from a previous response as a request header pins all subsequent requests to the same candidate, enabling stateful experiment interactions.
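As an illustration of per-request identification and distribution validation, the following minimal sketch tallies routes from a batch of response header mappings. It assumes the headers have already been collected by whatever HTTP or gRPC client is in use; the function names `route_of`, `tally_routes`, and `observed_split` are hypothetical helpers rather than part of Seldon, and the sample route values are borrowed from the summary example elsewhere on this page.

```python
from collections import Counter

ROUTE_HEADER = "x-seldon-route"


def route_of(headers):
    """Return the x-seldon-route value from a header mapping, or None.

    Header names are compared case-insensitively, since HTTP clients
    differ in how they normalise header casing.
    """
    for name, value in headers.items():
        if name.lower() == ROUTE_HEADER:
            return value
    return None


def tally_routes(all_response_headers):
    """Count how many responses were served by each route."""
    counts = Counter()
    for headers in all_response_headers:
        # Responses without the header are counted separately so that
        # they show up in the tally instead of being silently dropped.
        counts[route_of(headers) or "<missing>"] += 1
    return counts


def observed_split(counts):
    """Convert raw counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {route: 100.0 * n / total for route, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical header sets, one per inference response.
    sample = [
        {"X-Seldon-Route": ":iris_1:"},
        {"x-seldon-route": ":iris2_1:"},
        {"x-seldon-route": ":iris_1:"},
        {},
    ]
    print(observed_split(tally_routes(sample)))
```

The resulting percentages can then be compared against the weights configured for the experiment.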
Theoretical Basis
Experiment monitoring relies on response metadata to track which candidate handled each request. Statistical analysis of the distribution (observed vs expected weights) validates that the traffic split is correct. Sticky sessions enable stateful experiment interactions where a client needs consistent routing.
Key theoretical considerations:
- Observability through headers: The x-seldon-route header provides a non-invasive mechanism for tracking experiment routing. It does not alter the inference response payload, keeping the observation separate from the data.
- Law of large numbers: The observed traffic distribution converges to the configured weights as the number of requests increases. A small number of requests may show significant deviation from expected percentages; statistical significance requires sufficient sample size.
- Route pinning (sticky sessions): By echoing the x-seldon-route header back in subsequent requests, clients can ensure all their requests go to the same candidate. This is essential for:
  - Multi-step inference workflows that require consistency
  - Debugging a specific candidate's behavior
  - A/B testing scenarios where user experience must be consistent within a session
- Traffic statistics aggregation: The Seldon CLI can aggregate results across multiple iterations, producing summary statistics (e.g., map[:iris_1::50 :iris2_1::50]) that show the actual percentage split across candidates.
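To make the sample-size point concrete, here is a rough sketch that checks whether an observed share is consistent with a configured weight, using a normal approximation to the binomial distribution. The function name `split_within_tolerance` and the default threshold `z=1.96` (roughly a 95% band) are choices made for this illustration, not something Seldon prescribes, and the approximation is only reasonable when each request is routed independently.

```python
import math


def split_within_tolerance(observed_count, total, expected_fraction, z=1.96):
    """Check an observed share against an expected traffic fraction.

    Returns (within_band, deviation, std_err), where std_err is the
    standard error of a proportion under the expected fraction:
    sqrt(p * (1 - p) / n).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0.0 <= expected_fraction <= 1.0:
        raise ValueError("expected_fraction must be between 0 and 1")

    std_err = math.sqrt(expected_fraction * (1.0 - expected_fraction) / total)
    deviation = abs(observed_count / total - expected_fraction)
    return deviation <= z * std_err, deviation, std_err


if __name__ == "__main__":
    # Same observed share (70% against a configured 50%), different sample sizes.
    for count, total in [(7, 10), (700, 1000)]:
        ok, dev, se = split_within_tolerance(count, total, 0.5)
        print(f"n={total}: deviation={dev:.3f}, std_err={se:.3f}, within band={ok}")
```

With 10 requests, a 70/30 outcome still falls inside the band for a configured 50/50 split, while the same share over 1000 requests falls well outside it. This is why a handful of requests cannot confirm or refute the configured weights.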
Usage
This principle applies while an experiment is active, to verify traffic distribution and analyze per-candidate responses. Typical use cases:
- Validation: After starting an experiment, run a batch of inference requests with --show-headers to confirm that traffic is being split as configured.
- Monitoring: Periodically run multi-iteration inferences to track distribution drift or routing anomalies.
- Debugging: Use sticky sessions to pin requests to a specific candidate for targeted debugging.
- Analysis: Compare response payloads across candidates to evaluate model quality differences.
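For validation and monitoring, it can help to turn the CLI summary into numbers that a script can compare. The sketch below assumes the summary has exactly the shape of the example on this page, `map[:iris_1::50 :iris2_1::50]`, that is, space-separated `route:value` entries with integer values. The helper names `parse_route_summary` and `as_fractions` are hypothetical. Values are normalised to fractions of their total so the result is the same whether they represent counts or percentages.

```python
def parse_route_summary(summary):
    """Parse a summary such as 'map[:iris_1::50 :iris2_1::50]' into a dict.

    Each entry is split at its last colon, so a route like ':iris_1:'
    keeps its own leading and trailing colons.
    """
    text = summary.strip()
    if not (text.startswith("map[") and text.endswith("]")):
        raise ValueError(f"unexpected summary format: {summary!r}")

    result = {}
    for entry in text[len("map["):-1].split():
        route, _, value = entry.rpartition(":")
        result[route] = int(value)
    return result


def as_fractions(values):
    """Normalise per-route values to fractions of their total."""
    total = sum(values.values())
    if total == 0:
        return {}
    return {route: v / total for route, v in values.items()}


if __name__ == "__main__":
    parsed = parse_route_summary("map[:iris_1::50 :iris2_1::50]")
    print(parsed)
    print(as_fractions(parsed))
```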
The analysis workflow:
- Send inference requests to the default model endpoint
- Inspect the x-seldon-route header in each response
- Run multi-iteration inference to gather distribution statistics
- Compare observed distribution against configured weights
- Optionally use sticky sessions to test specific candidates in isolation
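The final, optional step can be sketched as a small helper that copies the route from a previous response into the headers of the next request. It makes no network calls and assumes only what this page states: that the x-seldon-route value from a response can be sent back as a request header. The names `extract_route` and `sticky_request_headers` are hypothetical, and the `Content-Type` header in the example is purely illustrative.

```python
ROUTE_HEADER = "x-seldon-route"


def extract_route(response_headers):
    """Return the x-seldon-route value from response headers, or None."""
    for name, value in response_headers.items():
        if name.lower() == ROUTE_HEADER:
            return value
    return None


def sticky_request_headers(response_headers, base_headers=None):
    """Build request headers that echo the route of a previous response.

    base_headers is copied, not mutated. If the previous response carried
    no route header, the base headers are returned unchanged.
    """
    headers = dict(base_headers or {})
    route = extract_route(response_headers)
    if route is not None:
        headers[ROUTE_HEADER] = route
    return headers


if __name__ == "__main__":
    # Hypothetical headers from a first, unpinned response.
    first_response = {"X-Seldon-Route": ":iris2_1:"}
    print(sticky_request_headers(first_response, {"Content-Type": "application/json"}))
```

Sending the returned headers with each follow-up request keeps a client on the same candidate, which allows that candidate to be examined in isolation.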
Related Pages
- SeldonIO_Seldon_core_Seldon_Model_Infer_With_Headers — implements this principle — Concrete CLI tool for monitoring experiment traffic distribution using inference with response headers.
- SeldonIO_Seldon_core_Experiment_Execution — prerequisite principle — Activating an experiment to begin traffic splitting or mirroring between model candidates.
- SeldonIO_Seldon_core_Experiment_Lifecycle_Management — next principle — Operational procedures for updating and concluding experiments based on analysis results.
Implementation:SeldonIO_Seldon_core_Seldon_Model_Infer_With_Headers