Principle:SeldonIO Seldon core Experiment Traffic Analysis

Overview: Monitoring which candidate model serves each request during an experiment using route headers and traffic distribution analysis.
Domains: MLOps, Experimentation
Related Implementation: SeldonIO_Seldon_core_Seldon_Model_Infer_With_Headers
Last Updated: 2026-02-13 00:00 GMT

Description

During an active experiment, each inference response includes an x-seldon-route header indicating which candidate served the request. Sending many requests and tallying the header values yields the observed traffic split, which can be compared with the configured one. Sticky sessions (the x-seldon-route value sent back as a request header) can pin subsequent requests to the same candidate.

Experiment traffic analysis involves three key activities:

  • Per-request routing identification: Each response includes the x-seldon-route header and the model_name field in the V2 response body, both identifying which candidate processed the request.
  • Distribution validation: By running multiple inference iterations, the observed traffic split can be compared against the configured weights to verify correct routing behavior.
  • Sticky session testing: Passing the x-seldon-route value from a previous response as a request header pins every subsequent request that carries it to the same candidate, enabling stateful experiment interactions.
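
The first and third activities can be sketched in a few lines of Python. This is a minimal illustration, not the Seldon client: the helper names (route_of, tally_routes, sticky_request_headers) are hypothetical, and the sample header values are assumptions modelled on the CLI summary quoted later on this page. The route value is treated as an opaque string and echoed back unchanged.

```python
from collections import Counter

ROUTE_HEADER = "x-seldon-route"  # header name as given in the text above


def route_of(headers):
    """Return the route header value from a response, or None if absent.

    HTTP header names are case-insensitive, so compare in lower case.
    """
    for name, value in headers.items():
        if name.lower() == ROUTE_HEADER:
            return value
    return None


def tally_routes(responses_headers):
    """Count how many responses carried each route value."""
    counts = Counter()
    for headers in responses_headers:
        route = route_of(headers)
        counts[route if route is not None else "<no route header>"] += 1
    return counts


def sticky_request_headers(previous_response_headers):
    """Build request headers that echo the previous route back to the server."""
    route = route_of(previous_response_headers)
    return {ROUTE_HEADER: route} if route is not None else {}


# Hypothetical captured response headers; the values are illustrative only.
captured = [
    {"X-Seldon-Route": ":iris_1:"},
    {"x-seldon-route": ":iris2_1:"},
    {"x-seldon-route": ":iris_1:"},
]
counts = tally_routes(captured)
pinned = sticky_request_headers(captured[0])
```

Here counts records how often each candidate answered, and pinned is the header dictionary a client would attach to its next request to stay on the candidate that served the first one.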

Theoretical Basis

Experiment monitoring relies on response metadata to track which candidate handled each request. Statistical analysis of the distribution (observed vs expected weights) validates that the traffic split is correct. Sticky sessions enable stateful experiment interactions where a client needs consistent routing.

Key theoretical considerations:

  • Observability through headers: The x-seldon-route header provides a non-invasive mechanism for tracking experiment routing. It does not alter the inference response payload, keeping the observation separate from the data.
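  • Law of large numbers: The observed traffic distribution converges to the configured weights as the number of requests increases. A small number of requests may deviate substantially from the expected percentages, because the sampling error of an observed share shrinks only in proportion to the inverse square root of the request count. Conclusions about the split should therefore rest on a sufficiently large sample.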
  • Route pinning (sticky sessions): By echoing the x-seldon-route header back in subsequent requests, clients can ensure all their requests go to the same candidate. This is essential for:
    • Multi-step inference workflows that require consistency
    • Debugging a specific candidate's behavior
    • A/B testing scenarios where user experience must be consistent within a session
  • Traffic statistics aggregation: The Seldon CLI can aggregate results across multiple iterations, producing summary statistics (e.g., map[:iris_1::50 :iris2_1::50]) that show the actual percentage split across candidates.
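
One way to make the "observed vs expected" comparison concrete is a Pearson chi-square goodness-of-fit check. The choice of test is an assumption of this sketch, not something the page prescribes, and the candidate names and counts below are hypothetical. The constant 3.841 is the 5% critical value of the chi-square distribution with one degree of freedom, which applies to a two-candidate experiment.

```python
import math

# 5% critical value of the chi-square distribution with 1 degree of freedom
# (two candidates). Other candidate counts need a different threshold.
CHI2_CRITICAL_DF1_5PCT = 3.841


def chi_square_statistic(observed, weights):
    """Pearson chi-square statistic for observed counts vs. configured weights.

    observed: dict candidate -> request count
    weights:  dict candidate -> configured weight (any positive scale)
    """
    total = sum(observed.values())
    weight_sum = sum(weights.values())
    stat = 0.0
    for candidate, weight in weights.items():
        expected = total * weight / weight_sum
        stat += (observed.get(candidate, 0) - expected) ** 2 / expected
    return stat


def standard_error(p, n):
    """Standard error of an observed share when the true share is p."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical counts from 100 requests against a configured 50/50 split.
observed = {"iris": 44, "iris2": 56}
configured = {"iris": 50, "iris2": 50}

stat = chi_square_statistic(observed, configured)  # 36/50 + 36/50 = 1.44
consistent = stat < CHI2_CRITICAL_DF1_5PCT
```

With these illustrative numbers the statistic stays below the threshold, so a 44/56 outcome over 100 requests is compatible with a 50/50 configuration. The standard_error helper shows why: at p = 0.5 the error is 0.05 (five percentage points) for 100 requests and falls to 0.005 for 10,000.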

Usage

This principle applies while an experiment is active, to verify traffic distribution and analyze per-candidate responses. Typical use cases:

  • Validation: After starting an experiment, run a batch of inference requests with --show-headers to confirm that traffic is being split as configured.
  • Monitoring: Periodically run multi-iteration inferences to track distribution drift or routing anomalies.
  • Debugging: Use sticky sessions to pin requests to a specific candidate for targeted debugging.
  • Analysis: Compare response payloads across candidates to evaluate model quality differences.

The analysis workflow:

  1. Send inference requests to the default model endpoint
  2. Inspect the x-seldon-route header in each response
  3. Run multi-iteration inference to gather distribution statistics
  4. Compare observed distribution against configured weights
  5. Optionally use sticky sessions to test specific candidates in isolation
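
The workflow above can be rehearsed offline with a stand-in for the endpoint. Everything in this sketch is an assumption made for illustration: make_fake_experiment simulates weighted routing with a seeded random generator and is not Seldon's router, the route strings are sample values, and the weights are assumed to sum to 100 so they can be compared directly with percentages. In practice send would wrap a real HTTP call and return the response headers.

```python
import random
from collections import Counter


def make_fake_experiment(weights, seed=0):
    """Return a send(request_headers) function that mimics weighted routing.

    This stands in for a real endpoint so the sketch runs offline. It honours
    an echoed x-seldon-route request header by returning the same route.
    """
    rng = random.Random(seed)
    routes = list(weights)
    shares = [weights[r] for r in routes]

    def send(request_headers=None):
        pinned = (request_headers or {}).get("x-seldon-route")
        if pinned in weights:
            route = pinned
        else:
            route = rng.choices(routes, weights=shares)[0]
        return {"x-seldon-route": route}

    return send


def observed_split(send, iterations, request_headers=None):
    """Steps 1-3: send requests, read the route header, tally percentages."""
    counts = Counter(
        send(request_headers)["x-seldon-route"] for _ in range(iterations)
    )
    return {route: 100.0 * n / iterations for route, n in counts.items()}


configured = {":iris_1:": 50, ":iris2_1:": 50}  # illustrative weights, sum to 100
send = make_fake_experiment(configured)

split = observed_split(send, iterations=1000)

# Step 4: largest gap, in percentage points, between observed and configured share.
max_gap = max(abs(split.get(r, 0.0) - w) for r, w in configured.items())

# Step 5: echo one route back and confirm every request lands on it.
pinned_split = observed_split(send, 50, {"x-seldon-route": ":iris_1:"})
```

Passing the transport in as a function keeps the tallying logic independent of how requests are actually sent, so the same observed_split could be reused with a real client.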

Related Pages

Implementation:SeldonIO_Seldon_core_Seldon_Model_Infer_With_Headers
