Principle:SeldonIO Seldon core Candidate Model Deployment

From Leeroopedia
Revision as of 18:01, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/SeldonIO_Seldon_core_Candidate_Model_Deployment.md)
Field | Value
Overview | Deploying multiple model variants as candidates for A/B testing or traffic mirroring experiments.
Domains | MLOps, Experimentation
Related Implementation | SeldonIO_Seldon_core_Seldon_Model_Load_For_Experiment
Last Updated | 2026-02-13 00:00 GMT

Description

Before starting an experiment, all candidate model variants must be deployed and available. Each variant is a separate Model resource with its own artifact and configuration. Candidates can be different model versions, algorithms, or configurations that will receive traffic according to experiment weights.

In Seldon Core 2, a candidate model is defined as a standard Model custom resource (CRD) with:

  • A unique metadata.name identifying the candidate
  • A spec.storageUri pointing to the model artifact (e.g., on GCS, S3, or local storage)
  • A spec.requirements list naming the capabilities the inference server must provide to run the model (e.g., sklearn, xgboost)

Each candidate model is loaded independently through the Seldon scheduler, which places it on an inference server that satisfies its requirements and makes the model available for serving. All candidates must reach a ready state before an experiment can begin routing traffic to them.
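As a minimal sketch of the three fields listed above, the following Python builds two candidate Model resources as plain dictionaries and prints them as JSON, which kubectl apply accepts alongside YAML. The candidate names (iris-a, iris-b) and the gs://example-bucket paths are hypothetical placeholders, not artifacts from the source; the apiVersion shown is the one used by Seldon Core 2 Model resources.

```python
import json


def model_manifest(name, storage_uri, requirements):
    """Build a Seldon Core 2 Model resource as a plain dict."""
    return {
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Model",
        "metadata": {"name": name},          # unique per candidate
        "spec": {
            "storageUri": storage_uri,       # where the artifact lives
            "requirements": list(requirements),
        },
    }


# Hypothetical candidates: names and bucket paths are placeholders.
candidates = [
    model_manifest("iris-a", "gs://example-bucket/models/iris-a", ["sklearn"]),
    model_manifest("iris-b", "gs://example-bucket/models/iris-b", ["xgboost"]),
]

# Candidate names must not collide, since experiments refer to models by name.
names = [m["metadata"]["name"] for m in candidates]
if len(set(names)) != len(names):
    raise ValueError("candidate names must be unique")

for manifest in candidates:
    print(json.dumps(manifest, indent=2))
```

Keeping one manifest per candidate mirrors the one-resource-per-variant rule described above, so each file can be versioned and applied on its own.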

Theoretical Basis

A/B testing requires independent, isolated model deployments so that each candidate's predictions are produced under equivalent runtime conditions. For the comparison to be statistically valid, the model itself must be the only variable; the serving infrastructure must not differ between candidates.

Key principles underlying candidate model deployment for experimentation:

  • Isolation: Each candidate runs in its own inference server instance. This prevents resource contention and ensures that latency or throughput differences are attributable to the model, not to shared infrastructure.
  • Reproducibility: Candidates are defined declaratively via YAML manifests, making deployments repeatable and version-controllable.
  • Independence: Candidates are loaded and unloaded independently. Failure of one candidate does not affect other candidates or the default model.
  • Homogeneous Runtime: All candidates use the same inference protocol (V2/Open Inference Protocol), ensuring that the request format and response schema are consistent across variants.
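To illustrate the Homogeneous Runtime point, the sketch below builds one V2/Open Inference Protocol request body and pairs it with the per-model infer path for each candidate. The model names, the input tensor name "predict", and the sample feature values are assumptions chosen for illustration; only the payload shape (inputs with name, shape, datatype, data) and the /v2/models/{name}/infer path come from the protocol.

```python
import json


def infer_request(rows, input_name="predict", datatype="FP32"):
    """Build a V2 inference request body from a 2-D list of features."""
    n_rows, n_cols = len(rows), len(rows[0])
    return {
        "inputs": [
            {
                "name": input_name,            # hypothetical tensor name
                "shape": [n_rows, n_cols],
                "datatype": datatype,
                # Tensor data flattened in row-major order.
                "data": [value for row in rows for value in row],
            }
        ]
    }


def infer_path(model_name):
    """Return the V2 REST path for a named model."""
    return f"/v2/models/{model_name}/infer"


# The same payload is valid for every candidate; only the path changes.
payload = infer_request([[5.1, 3.5, 1.4, 0.2]])
targets = {name: infer_path(name) for name in ("iris-a", "iris-b")}

for name, path in targets.items():
    print(name, path, json.dumps(payload))
```

Because the body is identical across candidates, any difference in the responses can be attributed to the models rather than to request formatting.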

Usage

This principle applies when preparing model variants for comparison in an A/B test or canary deployment. Typical scenarios include:

  • Comparing a new model version against the current production model
  • Evaluating different algorithms trained on the same dataset
  • Testing different hyperparameter configurations
  • Running a canary deployment where a small percentage of traffic goes to a new model before full rollout
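The canary scenario can be sketched as an Experiment resource that references already-deployed candidates by name and assigns each a weight. The resource and model names are hypothetical, and the traffic_shares helper assumes weights are interpreted relative to their sum; that interpretation is an assumption of this sketch, not something stated in the source.

```python
def experiment_manifest(name, default, weights):
    """Build a Seldon Core 2 Experiment resource as a plain dict."""
    if default not in weights:
        raise ValueError("the default model must be one of the candidates")
    return {
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Experiment",
        "metadata": {"name": name},
        "spec": {
            "default": default,
            "candidates": [
                {"name": model, "weight": weight}
                for model, weight in weights.items()
            ],
        },
    }


def traffic_shares(manifest):
    """Expected share per candidate, assuming weights are relative to their sum."""
    candidates = manifest["spec"]["candidates"]
    total = sum(c["weight"] for c in candidates)
    return {c["name"]: c["weight"] / total for c in candidates}


# Hypothetical canary: 90% to the current model, 10% to the new one.
canary = experiment_manifest("iris-canary", "iris-a", {"iris-a": 90, "iris-b": 10})
print(traffic_shares(canary))
```

An even A/B split uses the same structure with equal weights; only the numbers change, which is why candidates are deployed first and weighted afterwards.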

The workflow is:

  1. Define each candidate as a separate Model CRD YAML manifest
  2. Load each candidate using seldon model load or kubectl apply
  3. Verify each candidate reaches a ready state using seldon model status
  4. Proceed to create and start an Experiment that references the candidates
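Steps 2 and 3 of the workflow above can be sketched as a small command plan. The code only assembles argument lists and prints them; it does not invoke the CLI. The manifest paths and model names are hypothetical, and the -f and -w ModelAvailable flags reflect common seldon CLI usage that should be checked against the installed version.

```python
import shlex


def load_command(manifest_path):
    """argv for loading one candidate from its manifest file."""
    return ["seldon", "model", "load", "-f", manifest_path]


def status_command(model_name):
    """argv for waiting until one candidate reports it is available."""
    return ["seldon", "model", "status", model_name, "-w", "ModelAvailable"]


def rollout_plan(candidates):
    """Load every candidate first, then check every candidate's status."""
    loads = [load_command(path) for path in candidates.values()]
    checks = [status_command(name) for name in candidates]
    return loads + checks


# Hypothetical candidates: model name -> manifest path.
candidates = {
    "iris-a": "./models/iris-a.yaml",
    "iris-b": "./models/iris-b.yaml",
}

plan = rollout_plan(candidates)
for argv in plan:
    print(shlex.join(argv))
```

To execute the plan, each argument list could be passed to subprocess.run(argv, check=True), so that a failed load or status check stops the rollout before the Experiment is created.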

Related Pages

Implementation:SeldonIO_Seldon_core_Seldon_Model_Load_For_Experiment
