Principle:SeldonIO Seldon core Model Readiness Verification
| Property | Value |
|---|---|
| Principle Name | Model_Readiness_Verification |
| Overview | Polling mechanism that confirms a deployed model has been loaded and is available to serve inference requests. |
| Workflow | Model_Deployment |
| Domains | MLOps, Kubernetes |
| Related Implementation | SeldonIO_Seldon_core_Seldon_Model_Status |
| Last Updated | 2026-02-13 00:00 GMT |
Description
After a Model resource is submitted, the model passes through several intermediate states (downloading, loading, etc.) before it can serve traffic. Readiness verification waits on the ModelAvailable condition, blocking until the model is fully loaded on an inference server and ready to receive requests. This step is critical in deployment pipelines and scripts because it prevents inference requests from being sent to a model that is still initializing.
The model readiness lifecycle progresses through the following states and ends in either ModelAvailable or ModelFailed:
- ScheduleRequested: Model has been submitted to the scheduler
- Downloading: Artifact is being fetched from the storage backend
- Loading: The inference runtime is initializing the model in memory
- ModelAvailable: Model is fully loaded and ready for inference
- ModelFailed: Model failed to load (download error, incompatible artifact, etc.)
The readiness verification step monitors these transitions and provides a synchronization point between deployment and inference operations.
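As a rough illustration of how a script might interpret these transitions, the sketch below sorts a reported state into pending, ready, or failed. The state names are copied from the list above; the `Outcome` enum and `classify` function are hypothetical helpers, not part of Seldon, and the names a real scheduler reports may differ.

```python
from enum import Enum


class Outcome(Enum):
    PENDING = "pending"  # keep waiting
    READY = "ready"      # safe to send inference requests
    FAILED = "failed"    # terminal, stop waiting


# Intermediate states, taken from the lifecycle list in this page.
IN_PROGRESS_STATES = {"ScheduleRequested", "Downloading", "Loading"}


def classify(state: str) -> Outcome:
    """Map a reported model state to the action a waiter should take."""
    if state == "ModelAvailable":
        return Outcome.READY
    if state == "ModelFailed":
        return Outcome.FAILED
    if state in IN_PROGRESS_STATES:
        return Outcome.PENDING
    # Fail loudly on unknown names rather than waiting forever.
    raise ValueError(f"unknown model state: {state!r}")
```

Raising on an unrecognized state is a deliberate choice in this sketch: a waiter that silently treats unknown states as pending can hang until its timeout.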
Theoretical Basis
Kubernetes readiness checks follow the condition-based status pattern. Each Kubernetes resource can expose a list of conditions in its status field, where each condition has a type, status (True/False/Unknown), and optional reason and message fields.
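The condition pattern can be sketched in a few lines. The example below assumes the resource's status has already been parsed into a dictionary (for instance from JSON output); the helper names and the sample payload are illustrative only.

```python
from typing import Any, Dict, Optional


def get_condition(status: Dict[str, Any], cond_type: str) -> Optional[Dict[str, Any]]:
    """Return the condition entry with the given type, or None if absent."""
    for cond in status.get("conditions", []):
        if cond.get("type") == cond_type:
            return cond
    return None


def is_condition_true(status: Dict[str, Any], cond_type: str) -> bool:
    """A condition counts as met only when its status is the string "True".

    A missing condition is treated like "Unknown", i.e. not met.
    """
    cond = get_condition(status, cond_type)
    return cond is not None and cond.get("status") == "True"


# Made-up status payload for illustration.
example_status = {
    "conditions": [
        {"type": "Ready", "status": "True", "reason": "Loaded", "message": "model is serving"},
        {"type": "Degraded", "status": "Unknown"},
    ]
}
```

Note that condition statuses are strings, so `"True"` is compared as text rather than as a boolean.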
Seldon extends this pattern with model-specific conditions that reflect the inference server's actual state:
- ModelAvailable: Indicates the model is loaded and serving. This is the primary condition checked in deployment workflows.
- ModelFailed: Indicates a terminal failure during model loading. This triggers alerting and diagnostic workflows.
The -w (wait) flag on the Seldon CLI implements a polling loop that periodically queries the scheduler for the model's condition status. This is analogous to kubectl wait --for=condition=Available but tailored for Seldon's model lifecycle.
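A polling loop of this kind might look like the sketch below. It is not the Seldon CLI's implementation; `fetch_state` is a hypothetical callable standing in for whatever query returns the model's current state, and the timeout and interval defaults are arbitrary.

```python
import time
from typing import Callable


def wait_for_condition(
    fetch_state: Callable[[], str],
    target: str = "ModelAvailable",
    failure: str = "ModelFailed",
    timeout_s: float = 300.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the target state is seen (True) or the timeout expires (False).

    Raises RuntimeError as soon as the terminal failure state is reported.
    """
    deadline = clock() + timeout_s
    while True:
        state = fetch_state()
        if state == target:
            return True
        if state == failure:
            raise RuntimeError(f"model entered terminal state {failure}")
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real delays, and using a monotonic clock avoids surprises if the wall clock is adjusted while waiting.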
The condition-based approach provides several advantages over simple status polling:
- Semantic clarity: Conditions describe what is true about the resource, not just a state machine position
- Extensibility: New conditions can be added without breaking existing tooling
- Composability: Multiple conditions can be checked independently (e.g., model available AND explainer available)
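The composability point can be illustrated with a small helper that reports which of several required conditions are not yet met. The condition name `ExplainerAvailable` below is a hypothetical example, not a documented Seldon condition.

```python
from typing import Any, Dict, Iterable, List


def unmet_conditions(
    conditions: Iterable[Dict[str, Any]], required: Iterable[str]
) -> List[str]:
    """Return the required condition types whose status is not "True"."""
    true_types = {c.get("type") for c in conditions if c.get("status") == "True"}
    return [cond_type for cond_type in required if cond_type not in true_types]


# Illustrative data: the model is up, the (hypothetical) explainer is not.
sample_conditions = [
    {"type": "ModelAvailable", "status": "True"},
    {"type": "ExplainerAvailable", "status": "False"},
]
```

An empty result means every required condition holds, so a caller can gate on `not unmet_conditions(...)` and log the returned names otherwise.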
Usage
This principle applies after deploying a model (via seldon model load or kubectl apply) and before sending inference requests. It is especially important in:
- CI/CD pipelines: Automated deployment scripts must wait for readiness before running integration tests
- Canary deployments: Verifying the new model version is ready before shifting traffic
- Batch orchestration: Ensuring model availability before launching batch inference jobs
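```bash
# Wait for the model to become available using the Seldon CLI
seldon model status iris -w ModelAvailable
# Alternative: wait on the Model custom resource via kubectl.
# The resource reports readiness through its Ready condition.
kubectl wait --for=condition=ready model/iris --timeout=300s
```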
The wait mechanism will block until the condition is met or a timeout is reached, providing a reliable synchronization primitive for deployment workflows.
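In a CI/CD script, the CLI wait can be wrapped with an explicit deadline. The sketch below reuses the `seldon model status <name> -w ModelAvailable` invocation shown above; the wrapper functions are hypothetical, and it assumes (without confirmation from the source) that the CLI exits with a non-zero code when the wait does not succeed.

```python
import subprocess
from typing import Any, Callable, List


def build_wait_command(model_name: str, condition: str = "ModelAvailable") -> List[str]:
    """Build the argv for the CLI wait shown in this page."""
    return ["seldon", "model", "status", model_name, "-w", condition]


def wait_for_model(
    model_name: str,
    timeout_s: float = 300.0,
    run: Callable[..., Any] = subprocess.run,
) -> bool:
    """Return True if the wait command finishes successfully within timeout_s."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        result = run(build_wait_command(model_name), timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        return False
    # Assumption: exit code 0 means the condition was met.
    return result.returncode == 0
```

A pipeline step could then call `wait_for_model("iris")` and abort before running integration tests when it returns False.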
Related Pages
- SeldonIO_Seldon_core_Seldon_Model_Status implements SeldonIO_Seldon_core_Model_Readiness_Verification
- SeldonIO_Seldon_core_Model_Deployment_Execution precedes SeldonIO_Seldon_core_Model_Readiness_Verification
- SeldonIO_Seldon_core_V2_Inference_Protocol follows SeldonIO_Seldon_core_Model_Readiness_Verification
- Heuristic:SeldonIO_Seldon_core_Model_Load_Timeout_Tip