

Principle:SeldonIO Seldon core Production Traffic Monitoring

From Leeroopedia
Property | Value
Principle Name | Production Traffic Monitoring
Overview | Sending production inference requests through a monitoring pipeline and analyzing drift, outlier, and prediction outputs
Domains | MLOps, Monitoring
Related Implementation | SeldonIO_Seldon_core_Seldon_Pipeline_Infer_Monitoring
Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/)
Last Updated | 2026-02-13 00:00 GMT

Description

Production traffic monitoring sends real inference requests through the monitoring pipeline, which processes each request along three paths in parallel:

  • Classifier produces income predictions (e.g., >50K or <=50K)
  • Outlier detector flags anomalous inputs with per-request binary is_outlier decisions
  • Drift detector aggregates batches of requests to test for distribution shift

Both the Python requests library and the seldon CLI can be used to send inference requests. Prometheus metrics provide aggregate monitoring dashboards for operational visibility.
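As a rough illustration of the Python route, the sketch below builds a V2 inference payload and posts it with the requests library. The host and port, the pipeline name (income-production), the input tensor name (income), the `Seldon-Model` routing header value, and the example feature row are all assumptions made for illustration, not values taken from this page; check them against the actual deployment.

```python
import json


def build_v2_request(rows, input_name="income"):
    """Wrap equal-length feature rows in a V2 inference payload (FP32)."""
    if not rows:
        raise ValueError("at least one row is required")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same number of features")
    return {
        "inputs": [
            {
                "name": input_name,  # assumed tensor name
                "datatype": "FP32",
                "shape": [len(rows), width],
                # Flatten row by row so the data matches the declared shape.
                "data": [float(v) for row in rows for v in row],
            }
        ]
    }


def send_inference(payload, host="localhost:9000",
                   pipeline="income-production", timeout=10.0):
    """POST the payload to an assumed pipeline endpoint and return the JSON body."""
    import requests  # imported here so build_v2_request works without it

    url = f"http://{host}/v2/models/{pipeline}/infer"
    headers = {
        "Content-Type": "application/json",
        "Seldon-Model": f"{pipeline}.pipeline",  # assumed routing header value
    }
    response = requests.post(url, json=payload, headers=headers, timeout=timeout)
    response.raise_for_status()
    return response.json()


# Hypothetical 12-feature row; real feature order depends on the trained model.
payload = build_v2_request([[47, 4, 1, 1, 1, 3, 4, 1, 0, 0, 40, 9]])
print(json.dumps(payload, indent=2))
```

Calling `send_inference(payload)` against a running deployment would return the combined pipeline response. Keeping payload construction separate from the network call makes the request shape easy to check without a cluster.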

Theoretical Basis

Continuous monitoring detects model degradation in production by comparing live data distributions against training data references. The monitoring operates at two temporal scales:

Per-Request Monitoring (Outlier Detection)

Outlier detection operates per-request with low latency. Each incoming data point is scored against the OutlierVAE's learned normal manifold. This provides:

  • Immediate feedback on individual request quality
  • Binary flag (is_outlier) included in the inference response
  • Instance-level scores for ranking anomaly severity
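The per-request decision can be sketched as a threshold on the instance score. This is a minimal illustration that assumes the detector exposes one score per instance and a fixed threshold; the scores and the threshold value below are hypothetical, and the real OutlierVAE computes its scores from the trained model.

```python
def flag_outliers(instance_scores, threshold):
    """Return (index, score, is_outlier) tuples, highest score first."""
    flagged = [
        (index, score, int(score > threshold))
        for index, score in enumerate(instance_scores)
    ]
    # Sorting by score ranks requests by anomaly severity.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


scores = [0.02, 0.31, 0.07]  # hypothetical instance scores for three requests
ranked = flag_outliers(scores, threshold=0.1)
for index, score, is_outlier in ranked:
    print(f"request {index}: score={score:.2f} is_outlier={is_outlier}")
```

The binary flag answers "should this request be treated as anomalous", while the ranked scores help prioritize which flagged requests to inspect first.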

Batch Monitoring (Drift Detection)

Drift detection batches requests (e.g., 20 at a time), which gives higher statistical power at the cost of a delayed signal. Statistical tests require sufficient sample sizes to produce reliable p-values:

  • Chi-squared tests need expected frequencies >= 5 per cell for validity
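  • KS tests need larger samples to detect smaller shifts: the smallest detectable difference between distributions shrinks roughly with the square root of the sample size
  • Bonferroni correction divides the significance level by the number of features tested, so each per-feature threshold becomes stricter as features are added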
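The batching and sample-size considerations above can be sketched in plain Python. This is an illustrative sketch, not the detector's actual implementation: the class and function names are hypothetical, the p-values are made up, and the significance level of 0.05 is an assumption.

```python
class DriftBatcher:
    """Buffer rows and release them as a batch once batch_size is reached."""

    def __init__(self, batch_size=20):
        self.batch_size = batch_size
        self._rows = []

    def add(self, row):
        """Store one row; return the full batch when ready, otherwise None."""
        self._rows.append(row)
        if len(self._rows) < self.batch_size:
            return None
        batch, self._rows = self._rows, []
        return batch


def bonferroni_drift(p_values, alpha=0.05):
    """Flag drift if any per-feature p-value falls below alpha / n_features."""
    threshold = alpha / len(p_values)
    return any(p < threshold for p in p_values), threshold


def min_expected_frequency(ref_counts, batch_counts):
    """Smallest expected cell count in the 2 x k reference-vs-batch table."""
    n_ref, n_batch = sum(ref_counts), sum(batch_counts)
    total = n_ref + n_batch
    expected = []
    for ref, batch in zip(ref_counts, batch_counts):
        column_total = ref + batch
        expected.append(n_ref * column_total / total)
        expected.append(n_batch * column_total / total)
    return min(expected)


# The same smallest p-value is significant with 3 features but not with 12.
print(bonferroni_drift([0.01, 0.5, 0.5]))
print(bonferroni_drift([0.01] + [0.5] * 11))

# A batch of 20 leaves a rare category with an expected count below 5.
print(min_expected_frequency([600, 300, 100], [12, 6, 2]))
```

The last call shows why small batches are fragile for categorical features: with only 20 requests, a category seen in 10% of the reference data has an expected batch count of about 2, below the usual validity rule for the chi-squared test.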

Prometheus Metrics

Prometheus time-series metrics enable trend analysis and alerting:

  • Model prediction distributions over time
  • Outlier detection rates (percentage of flagged requests)
  • Drift detection results per batch
  • Latency and throughput metrics per pipeline step
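To make the outlier-rate and drift signals concrete, the sketch below tracks them in-process over a sliding window. In a real deployment these conditions would more typically be expressed as Prometheus queries and alerting rules; the class name, window size, and thresholds here are hypothetical choices for illustration only.

```python
from collections import deque


class MonitoringWindow:
    """Track a rolling outlier rate and consecutive drifted batches."""

    def __init__(self, window=100, max_outlier_rate=0.05, drift_batches=3):
        self.flags = deque(maxlen=window)  # most recent is_outlier flags
        self.max_outlier_rate = max_outlier_rate
        self.drift_batches = drift_batches
        self.consecutive_drift = 0

    def record_request(self, is_outlier):
        self.flags.append(1 if is_outlier else 0)

    def record_batch(self, is_drift):
        # A clean batch resets the streak, so only sustained drift alerts.
        self.consecutive_drift = self.consecutive_drift + 1 if is_drift else 0

    def outlier_rate(self):
        return sum(self.flags) / len(self.flags) if self.flags else 0.0

    def alerts(self):
        active = []
        if self.outlier_rate() > self.max_outlier_rate:
            active.append("elevated_outlier_rate")
        if self.consecutive_drift >= self.drift_batches:
            active.append("sustained_drift")
        return active


monitor = MonitoringWindow()
for flag in [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]:  # hypothetical per-request flags
    monitor.record_request(flag)
for drifted in [True, True, True]:  # hypothetical per-batch drift results
    monitor.record_batch(drifted)
print(monitor.outlier_rate(), monitor.alerts())
```

Requiring several consecutive drifted batches before alerting is one way to avoid paging on a single noisy batch; the right streak length depends on batch size and traffic volume.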

Usage

Use this principle when monitoring live production traffic for data quality issues, drift, and anomalous inputs. The monitoring flow is:

  1. Send inference requests using the V2 protocol (JSON with FP32 features)
  2. Receive combined responses containing predictions and outlier flags
  3. Monitor drift detection results asynchronously (reported per batch)
  4. Track aggregate metrics via Prometheus dashboards
  5. Set up alerts for sustained drift or elevated outlier rates
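Step 2 of the flow can be sketched as follows. The sketch assumes the combined response follows the V2 response layout (a list of named output tensors with flat data arrays) and that the outputs are named predict and is_outlier; both names and the sample response are hypothetical, so adjust them to what the pipeline actually returns.

```python
def parse_v2_response(response, prediction_output="predict",
                      outlier_output="is_outlier"):
    """Pair each prediction with its outlier flag from a V2-style response."""
    # Index output tensors by name; missing outputs yield empty lists.
    outputs = {out["name"]: out["data"] for out in response.get("outputs", [])}
    predictions = outputs.get(prediction_output, [])
    flags = outputs.get(outlier_output, [])
    return [
        {"prediction": prediction, "is_outlier": bool(flag)}
        for prediction, flag in zip(predictions, flags)
    ]


# Hypothetical response body for a two-row request.
sample_response = {
    "model_name": "",
    "outputs": [
        {"name": "predict", "datatype": "INT64", "shape": [2, 1], "data": [0, 1]},
        {"name": "is_outlier", "datatype": "INT64", "shape": [2, 1], "data": [0, 1]},
    ],
}
results = parse_v2_response(sample_response)
for row in results:
    print(row)
```

Drift results are not part of this per-request response because they are reported per batch, so they need to be collected separately (step 3).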

Related Pages

Implementation:SeldonIO_Seldon_core_Seldon_Pipeline_Infer_Monitoring
