Principle:SeldonIO Seldon core Monitoring Pipeline Definition

Property	Value
Principle Name	Monitoring Pipeline Definition
Overview	Composing classifier, preprocessor, drift detector, and outlier detector into a unified monitoring pipeline with batch processing
Domains	MLOps, Data_Flow
Related Implementation	SeldonIO_Seldon_core_Seldon_Pipeline_CRD_Monitoring
Knowledge Sources	Repo (https://github.com/SeldonIO/seldon-core), Doc (https://docs.seldon.io/projects/seldon-core/en/v2/)
Last Updated	2026-02-13 00:00 GMT

Description

A monitoring pipeline chains multiple models with different roles:

Classifier (income) - Receives raw input directly and produces predictions
Preprocessor (income-preprocess) - Receives raw input and transforms features for downstream detectors
Outlier Detector (income-outlier) - Consumes preprocessed features and flags anomalous inputs per-request
Drift Detector (income-drift) - Receives raw input and aggregates requests in batches before running statistical tests

The pipeline outputs both predictions from the classifier and outlier flags from the outlier detector. The drift detector runs asynchronously in batches and reports results separately.
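
The four roles above form a small dependency graph, which can be written down as a manifest. The sketch below builds one as a plain Python dictionary. The apiVersion, the pipeline name income-production, the field names (steps, inputs, batch.size, output.steps), and the output reference income-outlier.outputs.is_outlier are assumptions modelled on the Seldon Core v2 Pipeline resource; check them against the related implementation page before use.

```python
import json

# Hypothetical manifest: field names are assumed from the Seldon Core v2 Pipeline CRD.
pipeline = {
    "apiVersion": "mlops.seldon.io/v1alpha1",
    "kind": "Pipeline",
    "metadata": {"name": "income-production"},  # illustrative name
    "spec": {
        "steps": [
            {"name": "income"},             # classifier, reads the pipeline input
            {"name": "income-preprocess"},  # preprocessor, reads the pipeline input
            {"name": "income-outlier", "inputs": ["income-preprocess"]},
            {"name": "income-drift", "batch": {"size": 20}},  # batched drift test
        ],
        "output": {"steps": ["income", "income-outlier.outputs.is_outlier"]},
    },
}


def upstream_of(manifest, step_name):
    """Return a step's declared inputs; an empty list means it reads the pipeline input."""
    for step in manifest["spec"]["steps"]:
        if step["name"] == step_name:
            return step.get("inputs", [])
    raise KeyError(step_name)


if __name__ == "__main__":
    # Serialise the manifest for inspection.
    print(json.dumps(pipeline, indent=2))
```

Only the outlier detector declares an upstream step; the other three read the pipeline input directly, which matches the roles listed above.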

Theoretical Basis

Monitoring pipelines extend inference pipelines with batch aggregation for statistical tests. The key design considerations are:

Batch Aggregation for Drift Detection

Drift detection requires batches of samples (e.g., 20) because per-sample drift testing lacks statistical power. A single data point cannot meaningfully indicate whether the overall distribution has shifted. The Kolmogorov-Smirnov test and chi-squared test require sufficient sample sizes to achieve reliable p-values.
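
To make the statistical-power argument concrete, the sketch below uses the standard large-sample approximation for the two-sample Kolmogorov-Smirnov rejection threshold, D_crit = c(alpha) * sqrt((n + m) / (n * m)) with c(alpha) = sqrt(-ln(alpha / 2) / 2). The reference set of 1000 values, the significance level of 0.05, and the synthetic shifted batch are illustrative assumptions, and the code is a univariate stand-in rather than the drift detector used in the pipeline.

```python
import math
from bisect import bisect_right


def ks_critical_value(n, m, alpha=0.05):
    """Approximate two-sample KS rejection threshold for sample sizes n and m."""
    c_alpha = math.sqrt(-math.log(alpha / 2.0) / 2.0)
    return c_alpha * math.sqrt((n + m) / (n * m))


def ks_statistic(sample, reference):
    """Largest gap between the two empirical CDFs, evaluated at every data point."""
    s, r = sorted(sample), sorted(reference)
    d = 0.0
    for x in s + r:
        gap = abs(bisect_right(s, x) / len(s) - bisect_right(r, x) / len(r))
        d = max(d, gap)
    return d


# Synthetic data: a reference spread over [0, 1) and a batch confined to [0.5, 1).
reference = [i / 1000 for i in range(1000)]
shifted_batch = [0.5 + i / 40 for i in range(20)]

drift_flagged = ks_statistic(shifted_batch, reference) > ks_critical_value(20, len(reference))
```

With a single request (n = 1) the threshold is roughly 1.36, which exceeds the largest value the statistic can take (1), so the test can never reject. With a batch of 20 the threshold falls to roughly 0.31, and the shifted batch above is flagged.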

The batch size represents a trade-off:

Smaller batches (e.g., 5-10): Faster signal but less reliable results (low statistical power, noisier p-values)
Larger batches (e.g., 50-100): More reliable detection but delayed signal
Typical batch (e.g., 20): Reasonable balance for most production scenarios
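
The delay side of this trade-off can be seen in a small accumulator that withholds the drift test until enough requests have arrived. It is an illustrative stand-in for the batching the pipeline performs on behalf of the drift detector; the class BatchAccumulator and the helper requests_until_first_test are hypothetical names.

```python
class BatchAccumulator:
    """Collects rows and releases them in fixed-size batches."""

    def __init__(self, batch_size):
        if batch_size < 1:
            raise ValueError("batch_size must be positive")
        self.batch_size = batch_size
        self._rows = []

    def add(self, row):
        """Add one request; return a full batch when ready, otherwise None."""
        self._rows.append(row)
        if len(self._rows) < self.batch_size:
            return None
        batch, self._rows = self._rows, []
        return batch


def requests_until_first_test(batch_size):
    """Count how many requests must arrive before the first drift test can run."""
    acc = BatchAccumulator(batch_size)
    count = 0
    while True:
        count += 1
        if acc.add({"request": count}) is not None:
            return count
```

At a steady rate of r requests per second, the first drift signal therefore arrives no sooner than batch_size / r seconds after traffic starts, so a batch of 100 waits five times longer than a batch of 20.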

Dependency Chains

The outlier detector is chained after the preprocessor so that it operates in the same feature space it was trained on. Raw inputs can contain categorical values and unnormalized numeric values in a form the OutlierVAE never saw during training. The preprocessor applies the same transformations used at training time (imputation, scaling, one-hot encoding), so the outlier detector sees consistent feature representations.
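
The dependency can be sketched as function composition: the outlier check runs on the preprocessor's output, never on the raw row. The feature names, the training statistics, and the threshold below are hypothetical, and the z-score check is a deliberately simple stand-in, not the OutlierVAE.

```python
# Hypothetical training-time statistics; in practice they come from the fitted preprocessor.
AGE_MEAN, AGE_STD = 40.0, 12.0
WORKCLASSES = ["private", "self-employed", "government"]


def preprocess(raw):
    """Impute, scale, and one-hot encode a raw row into a numeric feature vector."""
    age = raw.get("age")
    if age is None:
        age = AGE_MEAN  # imputation with the training mean
    scaled_age = (age - AGE_MEAN) / AGE_STD
    one_hot = [1.0 if raw.get("workclass") == w else 0.0 for w in WORKCLASSES]
    return [scaled_age] + one_hot


def is_outlier(features, threshold=3.0):
    """Stand-in detector: flag rows whose first feature lies beyond the threshold."""
    return abs(features[0]) > threshold


typical_row = {"age": 45, "workclass": "private"}
extreme_row = {"age": 95, "workclass": "private"}

chained_typical = is_outlier(preprocess(typical_row))  # scaled age is about 0.42
chained_extreme = is_outlier(preprocess(extreme_row))  # scaled age is about 4.58
unchained_typical = is_outlier([typical_row["age"]])   # raw age 45 bypasses scaling
```

Fed the raw age, the stand-in detector flags an ordinary 45-year-old as anomalous, because the threshold only has meaning in the scaled feature space. Chaining through the preprocessor avoids this mismatch.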

Multi-Output Design

The pipeline produces multiple outputs from different steps:

Classifier predictions - The primary inference result
Outlier flags - Per-request binary outlier indicator (is_outlier field)

This allows downstream consumers to receive both the prediction and its trustworthiness assessment in a single response.
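
A consumer might unpack such a combined response as sketched below. The response layout follows the general shape of a V2 (Open Inference Protocol) response, but the output tensor names predict and is_outlier, the datatypes, and the sample values are assumptions for illustration.

```python
def split_outputs(response):
    """Map each output tensor name to its flat data list."""
    return {out["name"]: out["data"] for out in response["outputs"]}


def trusted_predictions(response, prediction_name="predict", flag_name="is_outlier"):
    """Pair each prediction with a flag saying whether its input was not an outlier."""
    outputs = split_outputs(response)
    preds, flags = outputs[prediction_name], outputs[flag_name]
    if len(preds) != len(flags):
        raise ValueError("prediction and outlier outputs differ in length")
    return [{"prediction": p, "trusted": not bool(f)} for p, f in zip(preds, flags)]


# Hypothetical response carrying one output from each contributing step.
example_response = {
    "outputs": [
        {"name": "predict", "shape": [2, 1], "datatype": "INT64", "data": [0, 1]},
        {"name": "is_outlier", "shape": [2, 1], "datatype": "INT64", "data": [0, 1]},
    ],
}

results = trusted_predictions(example_response)
```

Here the second row's prediction is marked as untrusted because its input was flagged as an outlier; what a consumer does with untrusted predictions (reject, log, route for review) is left to the application.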

Usage

Use this principle when defining a pipeline that combines inference with per-request outlier detection and batched drift monitoring. A complete monitoring pipeline definition requires that:

All four component models are deployed and available
The dependency graph is defined (outlier depends on preprocessor output)
Batch sizes are configured for drift detection
Output steps specify which results to return to the caller
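
These requirements lend themselves to a pre-flight check. The sketch below validates a pipeline specification held as a plain dictionary. The dictionary layout (steps, inputs, batch, output) mirrors an assumed Pipeline spec, and check_pipeline, the deployed set, and the batched_steps argument are hypothetical helpers, not part of any Seldon tooling.

```python
def check_pipeline(spec, deployed, batched_steps=()):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    names = {step["name"] for step in spec["steps"]}
    for step in spec["steps"]:
        # Requirement 1: every component model is deployed.
        if step["name"] not in deployed:
            problems.append(f"model not deployed: {step['name']}")
        # Requirement 2: every declared input refers to a step in the graph.
        for ref in step.get("inputs", []):
            if ref.split(".")[0] not in names:
                problems.append(f"{step['name']} reads unknown step: {ref}")
    # Requirement 3: steps that need batching declare a positive batch size.
    for name in batched_steps:
        step = next((s for s in spec["steps"] if s["name"] == name), None)
        if step is None or step.get("batch", {}).get("size", 0) < 1:
            problems.append(f"missing batch size: {name}")
    # Requirement 4: output steps are declared and refer to known steps.
    outputs = spec.get("output", {}).get("steps", [])
    if not outputs:
        problems.append("no output steps declared")
    for ref in outputs:
        if ref.split(".")[0] not in names:
            problems.append(f"output references unknown step: {ref}")
    return problems


# Assumed spec for the four-model income pipeline described on this page.
spec = {
    "steps": [
        {"name": "income"},
        {"name": "income-preprocess"},
        {"name": "income-outlier", "inputs": ["income-preprocess"]},
        {"name": "income-drift", "batch": {"size": 20}},
    ],
    "output": {"steps": ["income", "income-outlier.outputs.is_outlier"]},
}
deployed = {"income", "income-preprocess", "income-outlier", "income-drift"}
problems = check_pipeline(spec, deployed, batched_steps=["income-drift"])
```

Running the same check with a model missing from the deployed set, or with the batch entry removed from income-drift, returns a non-empty list naming the failed requirement.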

Related Pages

SeldonIO_Seldon_core_Seldon_Pipeline_CRD_Monitoring (implements this principle) - Concrete pattern for declaring monitoring pipelines
SeldonIO_Seldon_core_Monitoring_Component_Deployment (prerequisite) - Deploying the four component models
SeldonIO_Seldon_core_Seldon_Model_Load_For_Monitoring (prerequisite) - Loading component models via CLI
SeldonIO_Seldon_core_Monitoring_Pipeline_Validation (next step) - Validating the deployed monitoring pipeline
SeldonIO_Seldon_core_Production_Traffic_Monitoring (uses pipeline) - Sending live traffic through the monitoring pipeline
