Principle:SeldonIO Seldon core Production Traffic Monitoring
| Property | Value |
|---|---|
| Principle Name | Production Traffic Monitoring |
| Overview | Sending production inference requests through a monitoring pipeline and analyzing drift, outlier, and prediction outputs |
| Domains | MLOps, Monitoring |
| Related Implementation | SeldonIO_Seldon_core_Seldon_Pipeline_Infer_Monitoring |
| Knowledge Sources | Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/) |
| Last Updated | 2026-02-13 00:00 GMT |
Description
Production traffic monitoring sends real inference requests through the monitoring pipeline, which processes each request along three parallel paths:
- Classifier produces income predictions (e.g., >50K or <=50K)
- Outlier detector flags anomalous inputs with per-request binary is_outlier decisions
- Drift detector aggregates batches of requests to test for distribution shift
Inference requests can be sent with either the Python requests library or the seldon CLI. Prometheus metrics feed aggregate dashboards for operational visibility.
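As a minimal sketch of the Python path, the snippet below builds a V2 request body and posts it to a pipeline. The endpoint host and port, the pipeline name `income-production`, the input tensor name `income`, and the feature values are assumptions for illustration, not values taken from the repository. The `Seldon-Model` header with a `.pipeline` suffix is assumed to be how the request is routed to a pipeline rather than a single model.

```python
import json

# Assumptions (not from the source): host/port, the pipeline name
# "income-production", and the input tensor name "income".
INFER_URL = "http://localhost:9000/v2/models/income-production/infer"
HEADERS = {
    "Content-Type": "application/json",
    # Assumed routing header; the ".pipeline" suffix targets a pipeline.
    "Seldon-Model": "income-production.pipeline",
}


def build_v2_request(rows, input_name="income"):
    """Wrap a list of equal-length feature rows in a V2 inference request body."""
    if not rows or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("rows must be a non-empty list of equal-length feature vectors")
    # Tensor data is sent flattened in row-major order alongside its shape.
    flat = [float(x) for row in rows for x in row]
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [len(rows), len(rows[0])],
                "datatype": "FP32",
                "data": flat,
            }
        ]
    }


def send(rows, url=INFER_URL, headers=HEADERS, timeout=10.0):
    """POST the request and return the decoded JSON response."""
    # Third-party dependency, imported here so build_v2_request works without it.
    import requests

    body = json.dumps(build_v2_request(rows))
    resp = requests.post(url, headers=headers, data=body, timeout=timeout)
    resp.raise_for_status()
    return resp.json()
```

Calling `send([[...feature values...]])` against a running deployment would return the combined pipeline response; `build_v2_request` can be checked offline before any traffic is sent.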
Theoretical Basis
Continuous monitoring detects model degradation in production by comparing live data distributions against training data references. The monitoring operates at two temporal scales:
Per-Request Monitoring (Outlier Detection)
Outlier detection operates per-request with low latency. Each incoming data point is scored against the OutlierVAE's learned normal manifold. This provides:
- Immediate feedback on individual request quality
- Binary flag (is_outlier) included in the inference response
- Instance-level scores for ranking anomaly severity
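The per-request outputs above can be read from a V2 response along these lines. This is a sketch only: the output name `is_outlier` comes from the text, while `instance_score`, `predict`, and the flat `data` layout are assumptions, and the example response is hand-written rather than captured from a real pipeline.

```python
def outputs_by_name(response):
    """Index the V2 response's output tensors by their name."""
    return {out["name"]: out for out in response.get("outputs", [])}


def flagged_rows(response, flag_name="is_outlier"):
    """Return the indices of rows whose outlier flag is set."""
    outputs = outputs_by_name(response)
    if flag_name not in outputs:
        raise KeyError(f"response has no output named {flag_name!r}")
    flags = outputs[flag_name]["data"]
    return [i for i, flag in enumerate(flags) if int(flag) == 1]


def rank_by_score(response, score_name="instance_score"):
    """Return row indices ordered from most to least anomalous."""
    scores = outputs_by_name(response)[score_name]["data"]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hand-written illustrative response; names and values are hypothetical.
example_response = {
    "outputs": [
        {"name": "predict", "shape": [3], "datatype": "INT64", "data": [0, 1, 0]},
        {"name": "is_outlier", "shape": [3], "datatype": "INT64", "data": [0, 1, 0]},
        {"name": "instance_score", "shape": [3], "datatype": "FP32", "data": [0.02, 0.91, 0.15]},
    ]
}

print(flagged_rows(example_response))   # rows flagged as outliers
print(rank_by_score(example_response))  # rows ordered by anomaly severity
```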
Batch Monitoring (Drift Detection)
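Drift detection groups requests into batches (e.g., 20 at a time), trading a delayed signal for higher statistical power. The underlying statistical tests need sufficient sample sizes to produce reliable p-values:
- Chi-squared tests (typically applied to categorical features) conventionally need expected frequencies >= 5 per cell for the approximation to be valid
- KS tests need larger samples to detect smaller shifts: the smallest detectable distance between distributions shrinks roughly as 1/sqrt(n)
- Bonferroni correction divides the significance level by the number of features tested, so each per-feature test becomes more conservative as features are added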
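The batching and Bonferroni steps can be sketched in plain Python as follows. The class and function names are hypothetical, the per-feature p-values are assumed to come from whatever tests the drift detector runs, and the default `alpha=0.05` is an illustrative choice rather than a value from the source.

```python
class DriftBatcher:
    """Accumulate rows until a full batch is available for a drift test."""

    def __init__(self, batch_size=20):
        if batch_size < 1:
            raise ValueError("batch_size must be positive")
        self.batch_size = batch_size
        self._rows = []

    def add(self, row):
        """Add one row; return the completed batch, or None if still filling."""
        self._rows.append(row)
        if len(self._rows) < self.batch_size:
            return None
        batch, self._rows = self._rows, []
        return batch


def bonferroni_drift(p_values, alpha=0.05):
    """Flag drift if any per-feature p-value falls below alpha / n_features."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    threshold = alpha / len(p_values)
    drifted = [i for i, p in enumerate(p_values) if p < threshold]
    return {
        "threshold": threshold,
        "drifted_features": drifted,
        "is_drift": bool(drifted),
    }
```

With three features and `alpha=0.05`, the per-feature threshold is about 0.0167, so a p-value of 0.04 that would pass an uncorrected test no longer signals drift on its own.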
Prometheus Metrics
Prometheus time-series metrics enable trend analysis and alerting:
- Model prediction distributions over time
- Outlier detection rates (percentage of flagged requests)
- Drift detection results per batch
- Latency and throughput metrics per pipeline step
Usage
Use this principle when monitoring live production traffic for data quality issues, drift, and anomalous inputs. The monitoring flow is:
- Send inference requests using the V2 protocol (JSON with FP32 features)
- Receive combined responses containing predictions and outlier flags
- Monitor drift detection results asynchronously (reported per batch)
- Track aggregate metrics via Prometheus dashboards
- Set up alerts for sustained drift or elevated outlier rates
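The alerting step could look like the in-process sketch below, which tracks a rolling outlier rate and counts consecutive drifting batches. In a real deployment this logic would more likely live in Prometheus alerting rules; the class name, the window size, and both thresholds are hypothetical values chosen for illustration.

```python
from collections import deque


class MonitoringAlerts:
    """Track rolling outlier rate and consecutive drift batches."""

    def __init__(self, window=200, max_outlier_rate=0.05, drift_batches=3):
        self.max_outlier_rate = max_outlier_rate
        self.drift_batches = drift_batches
        self._flags = deque(maxlen=window)  # most recent is_outlier flags
        self._drift_run = 0                 # consecutive batches with drift

    def record_request(self, is_outlier):
        self._flags.append(bool(is_outlier))

    def record_drift_batch(self, is_drift):
        # A single clean batch resets the run, so only sustained drift alerts.
        self._drift_run = self._drift_run + 1 if is_drift else 0

    def outlier_rate(self):
        return sum(self._flags) / len(self._flags) if self._flags else 0.0

    def active_alerts(self):
        alerts = []
        window_full = len(self._flags) == self._flags.maxlen
        # Wait for a full window so a few early outliers do not trigger an alert.
        if window_full and self.outlier_rate() > self.max_outlier_rate:
            alerts.append("elevated_outlier_rate")
        if self._drift_run >= self.drift_batches:
            alerts.append("sustained_drift")
        return alerts
```

Requiring several consecutive drifting batches is one simple way to avoid alerting on a single noisy batch; the right count depends on batch size and traffic volume.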
Related Pages
- SeldonIO_Seldon_core_Seldon_Pipeline_Infer_Monitoring (implements this principle) - Concrete tools for sending inference requests and monitoring traffic
- SeldonIO_Seldon_core_Monitoring_Pipeline_Validation (prerequisite) - Validating pipeline readiness before sending traffic
- SeldonIO_Seldon_core_Seldon_Pipeline_Load_And_Status (prerequisite) - Deploying and confirming pipeline status
- SeldonIO_Seldon_core_Drift_And_Outlier_Detection_Training (foundation) - How the detectors were trained
- SeldonIO_Seldon_core_Pipeline_Version_Progression (related) - Evolving monitoring capabilities over time
Implementation:SeldonIO_Seldon_core_Seldon_Pipeline_Infer_Monitoring