Principle:SeldonIO Seldon core Experiment Lifecycle Management

From Leeroopedia
Revision as of 17:21, 16 February 2026 by Admin (talk | contribs) (Auto-imported from principles/SeldonIO_Seldon_core_Experiment_Lifecycle_Management.md)
Overview: Operational procedures for updating experiment weights, adding candidates, and concluding experiments.
Domains: MLOps, Experimentation
Related Implementation: SeldonIO_Seldon_core_Seldon_Experiment_Stop
Last Updated: 2026-02-13 00:00 GMT

Description

Experiments can be updated in place by resubmitting an Experiment CRD with modified weights or candidates. Stopping an experiment reverts traffic routing to the default model. Successive updates to an experiment allow a gradual rollout: weights shift progressively from the current production model to the new candidate.

The experiment lifecycle consists of the following phases:

  • Creation: Define the Experiment CRD with initial candidates and weights.
  • Activation: Start the experiment to engage traffic routing.
  • Monitoring: Observe traffic distribution and candidate performance.
  • Update: Modify weights, add/remove candidates, or change mirror configuration by resubmitting the CRD.
  • Conclusion: Stop the experiment to revert routing, then optionally promote the winning candidate.

In-Place Updates

Experiments support in-place updates through the same mechanism used to start them. Resubmitting an Experiment CRD with the same metadata.name but different parameters causes the scheduler to update the routing table without interrupting traffic. This enables:

  • Weight shifting: Gradually increasing a candidate's weight (e.g., 10% to 25% to 50% to 100%) for canary-style rollouts
  • Candidate addition: Adding new candidates to a running experiment
  • Candidate removal: Removing underperforming candidates from the experiment
  • Mode changes: Switching from A/B testing to mirroring or vice versa
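
The update mechanism above can be sketched in Python by building the Experiment manifest as a plain dict and deriving an updated copy that keeps the same metadata.name. This is a minimal sketch, not the project's tooling: the field names (spec.default, spec.candidates with name and weight, spec.mirror with name and percent) and the apiVersion follow the Seldon Core 2 Experiment schema as best understood here and should be checked against the installed version. The helper names build_experiment and with_weights and the model names are hypothetical.

```python
import copy


def build_experiment(name, default, weights, mirror=None):
    """Return an Experiment manifest as a plain dict (hypothetical helper).

    weights: mapping of candidate model name -> weight.
    mirror:  optional (model name, percent) tuple for mirrored traffic.
    """
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    spec = {
        "default": default,
        "candidates": [{"name": n, "weight": w} for n, w in weights.items()],
    }
    if mirror is not None:
        mirror_name, percent = mirror
        spec["mirror"] = {"name": mirror_name, "percent": percent}
    return {
        # Assumed API group/version; verify against your cluster's CRD.
        "apiVersion": "mlops.seldon.io/v1alpha1",
        "kind": "Experiment",
        "metadata": {"name": name},
        "spec": spec,
    }


def with_weights(manifest, weights):
    """Return a copy with new candidate weights and the same metadata.name."""
    updated = copy.deepcopy(manifest)
    updated["spec"]["candidates"] = [
        {"name": n, "weight": w} for n, w in weights.items()
    ]
    return updated


initial = build_experiment("canary-test", "model-a", {"model-a": 90, "model-b": 10})
shifted = with_weights(initial, {"model-a": 75, "model-b": 25})
```

Serializing shifted to YAML and resubmitting it would correspond to a weight-shifting update; passing a weights mapping with an extra or missing key corresponds to candidate addition or removal.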

Experiment Conclusion

Stopping an experiment reverts all traffic routing to the default model. After stopping:

  • The default model endpoint returns to normal (non-experiment) routing
  • All candidates remain deployed but no longer receive experiment-routed traffic
  • Candidates can be individually unloaded if no longer needed

Theoretical Basis

The experiment lifecycle follows the same declarative pattern as other Seldon resources: update the CRD to change behavior, delete or stop the experiment to revert. This aligns with Kubernetes declarative management principles, where the desired state is expressed as a resource specification and the system converges to that state.

Progressive weight shifting enables canary-style rollouts where traffic is gradually moved to the new candidate. This approach:

  • Reduces risk: Small initial traffic percentages limit exposure to potential issues
  • Enables early detection: Problems with the new candidate are caught before full rollout
  • Supports rollback: Reverting to the previous weight distribution is a simple CRD update
  • Provides continuous validation: Each weight increase can be validated with traffic analysis before proceeding

The lifecycle model treats experiments as mutable resources that can transition through multiple configurations before being concluded. This differs from immutable experiment designs where each configuration change creates a new experiment.

Usage

This principle applies when concluding an A/B test, updating experiment parameters, or rolling out a winning candidate. Key scenarios:

Progressive Canary Rollout

  1. Start experiment with 90/10 split (production/candidate)
  2. Monitor for errors and performance regressions
  3. If candidate performs well, update to 70/30
  4. Continue monitoring; update to 50/50
  5. If still healthy, update to 10/90
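  6. Finally, stop the experiment and promote the candidate as the new default. Because stopping reverts routing to the existing default model, traffic returns to it until the promotion is in place.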
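
The rollout steps above, together with the rollback path described in the next scenario, can be sketched as a small control loop. This is an illustrative sketch only: run_canary and the is_healthy callback are hypothetical names, and the point where the Experiment CRD would be resubmitted is marked with a comment rather than implemented.

```python
# Weight schedule taken from the steps above: (production, candidate).
CANARY_STEPS = [(90, 10), (70, 30), (50, 50), (10, 90)]


def run_canary(steps, is_healthy):
    """Walk the weight schedule and return (action, history).

    is_healthy: hypothetical callback taking (production, candidate) weights
    and returning True when monitoring shows no errors or regressions.
    """
    history = []
    for production, candidate in steps:
        # In a real rollout the Experiment CRD would be resubmitted here
        # with these weights, followed by a monitoring period.
        history.append((production, candidate))
        if not is_healthy(production, candidate):
            # Stopping the experiment reverts traffic to the default model.
            return "stop_and_rollback", history
    return "stop_and_promote", history


action, history = run_canary(CANARY_STEPS, lambda production, candidate: True)
```

Keeping the schedule as data makes it easy to add intermediate steps or to re-run the loop from an earlier split after a fix.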

Immediate Rollback

  1. Detect issues with a candidate during an active experiment
  2. Stop the experiment immediately to revert all traffic to the default model
  3. Investigate the issue with the candidate model
  4. Optionally restart the experiment after fixing the issue

Experiment Conclusion

  1. Analyze traffic distribution and candidate performance
  2. Determine the winning candidate based on metrics
  3. Stop the experiment
  4. Promote the winning candidate by making it the new default model
  5. Unload losing candidates to free resources
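
As a rough illustration of steps 2 to 5, the sketch below picks a winner by lowest error rate and lists the follow-up actions in order. Everything here is an assumption made for illustration: the metrics shape, the error-rate criterion, the min_requests threshold, and the function names are hypothetical, and a real decision would likely weigh more than one metric.

```python
def pick_winner(metrics, default, min_requests=100):
    """Pick the model with the lowest error rate (hypothetical criterion).

    metrics: {model name: {"requests": int, "errors": int}}.
    Models with fewer than min_requests are ignored; if none qualify,
    the current default is kept.
    """
    eligible = {
        name: m for name, m in metrics.items() if m["requests"] >= min_requests
    }
    if not eligible:
        return default
    return min(
        eligible,
        key=lambda name: eligible[name]["errors"] / eligible[name]["requests"],
    )


def conclusion_plan(metrics, default):
    """Return the ordered actions: stop, promote the winner, unload the rest."""
    winner = pick_winner(metrics, default)
    plan = [("stop_experiment", None), ("promote", winner)]
    plan += [("unload", name) for name in sorted(metrics) if name != winner]
    return plan


# Made-up example figures, not measured results.
example_metrics = {
    "model-a": {"requests": 1000, "errors": 20},
    "model-b": {"requests": 1000, "errors": 5},
}
plan = conclusion_plan(example_metrics, default="model-a")
```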

Related Pages

Implementation:SeldonIO_Seldon_core_Seldon_Experiment_Stop
