Principle:SeldonIO Seldon core Candidate Model Deployment
| Field | Value |
|---|---|
| Overview | Deploying multiple model variants as candidates for A/B testing or traffic mirroring experiments. |
| Domains | MLOps, Experimentation |
| Related Implementation | SeldonIO_Seldon_core_Seldon_Model_Load_For_Experiment |
| Last Updated | 2026-02-13 00:00 GMT |
Description
Before starting an experiment, all candidate model variants must be deployed and available. Each variant is a separate Model resource with its own artifact and configuration. Candidates can be different model versions, algorithms, or configurations that will receive traffic according to experiment weights.
In Seldon Core 2, a candidate model is defined as a standard Model custom resource (CRD) with:
- A unique `metadata.name` identifying the candidate
- A `spec.storageUri` pointing to the model artifact (e.g., on GCS, S3, or local storage)
- A `spec.requirements` list specifying runtime dependencies (e.g., `sklearn`, `xgboost`)
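As a minimal sketch of the three fields above, the snippet below writes one candidate manifest to disk. The candidate name `candidate-a`, the file name, and the bucket path are hypothetical placeholders, and the `apiVersion` shown is the one used in Seldon Core 2 examples; check it against the CRDs installed in your cluster.

```shell
# Write a Model manifest for one hypothetical candidate.
# The name and the storage path are placeholders, not real artifacts.
cat > candidate-a.yaml <<'EOF'
apiVersion: mlops.seldon.io/v1alpha1
kind: Model
metadata:
  name: candidate-a          # unique name identifying the candidate
spec:
  storageUri: "gs://example-bucket/models/candidate-a"   # placeholder artifact location
  requirements:
  - sklearn                  # runtime dependency of this candidate
EOF
```

A second candidate would be a second file of the same shape, differing only in `metadata.name`, `spec.storageUri`, and, if the algorithm differs, `spec.requirements`.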
Each candidate model is loaded independently through the Seldon scheduler, which places it on an inference server that satisfies its requirements and makes the model available for serving. All candidates must reach a ready state before an experiment can begin routing traffic to them.
Theoretical Basis
A/B testing requires independent, isolated model deployments so that each candidate's predictions are produced under identical runtime conditions. Statistical validity requires that the only variable is the model itself, not the serving infrastructure.
Key principles underlying candidate model deployment for experimentation:
- Isolation: Each candidate is served as its own model instance, ideally on inference server capacity it does not share with other candidates. This limits resource contention, so that latency or throughput differences can be attributed to the model rather than to shared infrastructure.
- Reproducibility: Candidates are defined declaratively via YAML manifests, making deployments repeatable and version-controllable.
- Independence: Candidates are loaded and unloaded independently. Failure of one candidate does not affect other candidates or the default model.
- Homogeneous Runtime: All candidates use the same inference protocol (V2/Open Inference Protocol), ensuring that the request format and response schema are consistent across variants.
Usage
This principle applies when preparing model variants for comparison in an A/B test or canary deployment. Typical scenarios include:
- Comparing a new model version against the current production model
- Evaluating different algorithms trained on the same dataset
- Testing different hyperparameter configurations
- Running a canary deployment where a small percentage of traffic goes to a new model before full rollout
The workflow is:
- Define each candidate as a separate `Model` CRD YAML manifest
- Load each candidate using `seldon model load` or `kubectl apply`
- Verify each candidate reaches a ready state using `seldon model status`
- Proceed to create and start an Experiment that references the candidates
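The load and verify steps of this workflow can be sketched as a small shell helper. This is an illustration under stated assumptions, not a prescribed procedure: the function names `run` and `deploy_candidates`, the `DRY_RUN` switch, and the convention that each candidate `NAME` has a manifest `NAME.yaml` in the current directory are all hypothetical. The `-w ModelAvailable` wait condition follows the Seldon Core 2 CLI examples.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Load every candidate, then wait until each one reports availability.
# Assumes a manifest named NAME.yaml exists for each NAME argument.
deploy_candidates() {
  local name
  for name in "$@"; do
    run seldon model load -f "${name}.yaml" || return 1
  done
  for name in "$@"; do
    run seldon model status "${name}" -w ModelAvailable || return 1
  done
}
```

Calling `DRY_RUN=1 deploy_candidates candidate-a candidate-b` prints the four commands without contacting the scheduler, which is a cheap way to review them first. On a Kubernetes install, the load line could be swapped for `kubectl apply -f`. Loading all candidates before waiting on any of them lets the scheduler work on them in parallel.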
Related Pages
- SeldonIO_Seldon_core_Seldon_Model_Load_For_Experiment — implements this principle — Concrete CLI tool for deploying candidate model variants for Seldon Core 2 experiments.
- SeldonIO_Seldon_core_Experiment_Configuration — related principle — Declarative specification of traffic routing rules for A/B tests and traffic mirroring experiments.
- SeldonIO_Seldon_core_Experiment_Execution — related principle — Activating an experiment to begin traffic splitting or mirroring between model candidates.
Implementation:SeldonIO_Seldon_core_Seldon_Model_Load_For_Experiment