Principle:SeldonIO Seldon core Candidate Model Deployment

Field	Value
Overview	Deploying multiple model variants as candidates for A/B testing or traffic mirroring experiments.
Domains	MLOps, Experimentation
Related Implementation	SeldonIO_Seldon_core_Seldon_Model_Load_For_Experiment
Last Updated	2026-02-13 00:00 GMT

Description

Before starting an experiment, all candidate model variants must be deployed and available. Each variant is a separate Model resource with its own artifact and configuration. Candidates can be different model versions, algorithms, or configurations that will receive traffic according to experiment weights.

In Seldon Core 2, a candidate model is defined as a standard Model custom resource (an instance of the Model CRD) with:

A unique metadata.name identifying the candidate
A spec.storageUri pointing to the model artifact (e.g., on GCS, S3, or local storage)
A spec.requirements list naming the capabilities the model needs from an inference server (e.g., sklearn, xgboost)

Each candidate model is loaded independently through the Seldon scheduler, which places it on an inference server that satisfies its requirements and makes it available for serving. All candidates must reach a ready state before an experiment can begin routing traffic to them.
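As a minimal sketch of the three fields listed above, the following Python builds two candidate Model resources as plain dictionaries and prints them as JSON, which kubectl apply -f accepts alongside YAML. The apiVersion mlops.seldon.io/v1alpha1 is the one used by Seldon Core 2 Model resources; the helper name candidate_manifest, the model names iris-a and iris-b, and the gs://example-bucket/... URIs are hypothetical placeholders, not values from this page.

```python
import json


def candidate_manifest(name, storage_uri, requirements):
    """Build a Seldon Core 2 Model resource as a plain dict (illustrative helper)."""
    return {
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Model",
        "metadata": {"name": name},  # must be unique per candidate
        "spec": {
            "storageUri": storage_uri,           # location of the model artifact
            "requirements": list(requirements),  # capabilities needed from the server
        },
    }


# Hypothetical candidates: names and bucket paths are placeholders.
candidates = [
    candidate_manifest("iris-a", "gs://example-bucket/models/iris-a", ["sklearn"]),
    candidate_manifest("iris-b", "gs://example-bucket/models/iris-b", ["xgboost"]),
]

if __name__ == "__main__":
    for manifest in candidates:
        print(json.dumps(manifest, indent=2))
```

Each printed document can be saved to its own file, so that every candidate stays a separate, version-controllable manifest.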

Theoretical Basis

A valid A/B comparison requires that candidates be deployed independently and served under equivalent runtime conditions. For the results to be statistically meaningful, the model itself should be the only variable that differs between candidates, not the serving infrastructure.

Key principles underlying candidate model deployment for experimentation:

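Isolation: Each candidate is a separate Model resource with its own artifact. Because Seldon Core 2 supports multi-model serving, candidates may share an inference server unless they are scheduled onto dedicated ones; keeping candidates from contending for the same resources ensures that latency or throughput differences are attributable to the model, not to shared infrastructure.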
Reproducibility: Candidates are defined declaratively via YAML manifests, making deployments repeatable and version-controllable.
Independence: Candidates are loaded and unloaded independently. Failure of one candidate does not affect other candidates or the default model.
Homogeneous Runtime: All candidates use the same inference protocol (V2/Open Inference Protocol), ensuring that the request format and response schema are consistent across variants.
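To illustrate what a shared protocol means in practice, the sketch below builds one Open Inference Protocol (V2) request body and derives the per-model inference path for each candidate, so the same payload can be sent to every variant. The input tensor name predict, the FP32 datatype, the feature values, and the model names are assumptions for illustration; real values depend on the models being compared.

```python
import json


def infer_path(model_name):
    """Return the V2 / Open Inference Protocol inference path for a model."""
    return f"/v2/models/{model_name}/infer"


def infer_request(rows, input_name="predict", datatype="FP32"):
    """Build a V2 inference request body from a list of equal-length feature rows."""
    flat = [value for row in rows for value in row]  # row-major flattening
    return {
        "inputs": [
            {
                "name": input_name,  # assumed tensor name
                "shape": [len(rows), len(rows[0])],
                "datatype": datatype,
                "data": flat,
            }
        ]
    }


if __name__ == "__main__":
    body = infer_request([[5.1, 3.5, 1.4, 0.2]])
    # Hypothetical candidate names; the body is identical for both.
    for name in ["iris-a", "iris-b"]:
        print(infer_path(name), json.dumps(body))
```

Only the path changes between candidates, which is what lets an experiment route one client request to any variant.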

Usage

This principle applies when preparing model variants for comparison in an A/B test or canary deployment. Typical scenarios include:

Comparing a new model version against the current production model
Evaluating different algorithms trained on the same dataset
Testing different hyperparameter configurations
Running a canary deployment where a small percentage of traffic goes to a new model before full rollout

The workflow is:

Define each candidate as a separate Model custom resource in its own YAML manifest
Load each candidate using seldon model load or kubectl apply
Verify each candidate reaches a ready state using seldon model status
Proceed to create and start an Experiment that references the candidates
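The load and verify steps above can be sketched as a small Python wrapper around the seldon CLI. This is an illustration under assumptions: the -f flag of seldon model load and the -w ModelAvailable wait condition of seldon model status follow the usage shown in the Seldon Core 2 examples as I understand them, while the function names and manifest paths are hypothetical. Nothing is executed on import; the run parameter is injectable so the command sequence can be inspected without a cluster.

```python
import subprocess


def load_command(manifest_path):
    """Command that loads one candidate from its manifest file."""
    return ["seldon", "model", "load", "-f", manifest_path]


def status_command(model_name):
    """Command that waits until the named model reports ModelAvailable."""
    return ["seldon", "model", "status", model_name, "-w", "ModelAvailable"]


def deploy_candidates(manifests, run=subprocess.run):
    """Load every candidate, then wait for each one to become ready.

    manifests maps a model name to the path of its manifest file.
    check=True makes a failing CLI call raise, so an experiment is
    never created against a candidate that did not load.
    """
    for path in manifests.values():
        run(load_command(path), check=True)
    for name in manifests:
        run(status_command(name), check=True)


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    deploy_candidates(
        {"iris-a": "models/iris-a.yaml", "iris-b": "models/iris-b.yaml"},
        run=lambda cmd, check: print(" ".join(cmd)),
    )
```

Loading all candidates before waiting on any of them lets the scheduler place them in parallel; once the function returns without raising, the Experiment can be created.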

Related Pages

SeldonIO_Seldon_core_Seldon_Model_Load_For_Experiment — implements this principle — Concrete CLI tool for deploying candidate model variants for Seldon Core 2 experiments.
SeldonIO_Seldon_core_Experiment_Configuration — related principle — Declarative specification of traffic routing rules for A/B tests and traffic mirroring experiments.
SeldonIO_Seldon_core_Experiment_Execution — related principle — Activating an experiment to begin traffic splitting or mirroring between model candidates.
