Principle:Bentoml BentoML Deployment Termination
Overview
Deployment Termination is the principle of gracefully stopping and cleaning up cloud deployments through a two-phase approach that separates stopping a service from removing its record entirely.
Concept
Cleaning up a cloud deployment gracefully means treating two operations separately: terminating a running service (stopping execution while preserving the deployment record) and deleting a deployment (removing it entirely). Keeping them separate gives operators flexibility and guards against accidental loss of a deployment's configuration.
Theory
Terminate stops the running service but keeps the deployment record; delete removes the deployment entirely. Splitting cleanup into these two phases lets a service be stopped to save cost without losing its deployment history or configuration. The key benefits are:
- Cost management - Terminate stops all running instances, eliminating compute costs while preserving the ability to restart quickly
- History preservation - Terminated deployments retain their configuration, logs, and metadata for auditing and future reference
- Quick restart - A terminated deployment can be restarted without recreating the entire configuration
- Clean removal - Delete permanently removes all traces of a deployment when it is no longer needed
- Safety - The two-phase approach prevents accidental permanent deletion of deployment configurations
Terminate vs Delete
| Aspect | Terminate | Delete |
|---|---|---|
| Running instances | Stopped | Stopped |
| Deployment record | Preserved | Removed |
| Configuration | Preserved | Removed |
| Logs and history | Accessible | Removed |
| Can restart | Yes | No |
| Compute cost | None | None |
| Reversible | Yes (can restart) | No |
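The table above can be illustrated with a minimal in-memory sketch. It deliberately avoids calling BentoML or BentoCloud; `Deployment`, `DeploymentRegistry`, and every field name here are hypothetical stand-ins chosen only to show how the two operations differ in what they leave behind.

```python
from dataclasses import dataclass, field


@dataclass
class Deployment:
    # Hypothetical record shape; a real platform stores far more than this.
    name: str
    config: dict
    status: str = "running"
    replicas: int = 1
    history: list = field(default_factory=list)


class DeploymentRegistry:
    """Hypothetical in-memory stand-in for a deployment control plane."""

    def __init__(self):
        self._records = {}

    def create(self, name, config, replicas=1):
        dep = Deployment(name=name, config=dict(config), replicas=replicas)
        dep.history.append("created")
        self._records[name] = dep
        return dep

    def get(self, name):
        # Returns None once a deployment has been deleted.
        return self._records.get(name)

    def terminate(self, name):
        # Stop all instances but keep the record, configuration, and history.
        dep = self._records[name]
        dep.replicas = 0
        dep.status = "terminated"
        dep.history.append("terminated")
        return dep

    def restart(self, name, replicas=1):
        # Only possible because terminate preserved the record.
        dep = self._records[name]
        dep.replicas = replicas
        dep.status = "running"
        dep.history.append("restarted")
        return dep

    def delete(self, name):
        # Remove the record entirely; nothing is left to restart.
        del self._records[name]


registry = DeploymentRegistry()
registry.create("demo", {"instance_type": "example-small"}, replicas=2)

terminated = registry.terminate("demo")
print(terminated.status, terminated.replicas, terminated.config)

registry.restart("demo", replicas=2)
registry.delete("demo")
print(registry.get("demo"))
```

In this model, restarting after `terminate` reuses the stored configuration, while calling `restart` after `delete` raises `KeyError` because no record remains.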
Termination Flow
- Graceful shutdown - Running instances receive a shutdown signal and are given time to complete in-flight requests
- Instance teardown - All replicas are stopped and compute resources are released
- State update - The deployment record is updated to "terminated" status
- Resource cleanup - GPU allocations, network resources, and storage are released
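The four steps above can be sketched as a single function over a dict-based model. This is an illustrative simulation under stated assumptions, not the platform's actual shutdown logic: the function name `run_termination`, the `grace_period_s` parameter, and all dictionary keys are hypothetical, and in-flight requests are represented simply by how many seconds each still needs.

```python
def run_termination(deployment, grace_period_s=30.0):
    """Apply the four termination steps to a dict-based deployment model.

    All field names are hypothetical; nothing here calls a real platform.
    """
    steps = []

    # 1. Graceful shutdown: stop accepting traffic, then let in-flight
    #    requests finish if they fit inside the grace period.
    deployment["accepting_requests"] = False
    remaining = deployment["in_flight_remaining_s"]
    completed = sum(1 for seconds in remaining if seconds <= grace_period_s)
    interrupted = len(remaining) - completed
    deployment["in_flight_remaining_s"] = []
    steps.append(
        ("graceful_shutdown", {"completed": completed, "interrupted": interrupted})
    )

    # 2. Instance teardown: every replica is stopped.
    stopped = deployment["replicas"]
    deployment["replicas"] = 0
    steps.append(("instance_teardown", {"stopped": stopped}))

    # 3. State update: the record is kept and marked as terminated.
    deployment["status"] = "terminated"
    steps.append(("state_update", {"status": "terminated"}))

    # 4. Resource cleanup: allocations are released, configuration is untouched.
    released = dict(deployment["allocated"])
    deployment["allocated"] = {key: 0 for key in released}
    steps.append(("resource_cleanup", {"released": released}))

    return steps


demo = {
    "name": "demo",
    "config": {"instance_type": "example-small"},
    "status": "running",
    "accepting_requests": True,
    "replicas": 2,
    "in_flight_remaining_s": [0.5, 3.0, 45.0],
    "allocated": {"gpus": 2, "network_endpoints": 1},
}

for step, detail in run_termination(demo, grace_period_s=30.0):
    print(step, detail)
```

The record's `config` entry is never modified, which is what makes a later restart possible in this model.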
Use Cases
Terminate
- Off-hours cost savings - Stop services that are not needed outside business hours
- Debugging - Stop a misbehaving service while preserving its configuration for investigation
- Staging environment management - Terminate staging deployments when not actively testing
- Capacity management - Free up cluster resources temporarily
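As a rough illustration of the off-hours case, a scheduler could compute the state a deployment should be in and then terminate or restart it accordingly. The helper below is a hypothetical sketch: `desired_state`, the 09:00 to 18:00 window, and the weekday-only rule are assumptions, and it uses naive local datetimes without any timezone handling.

```python
from datetime import datetime, time


def desired_state(now, open_at=time(9, 0), close_at=time(18, 0)):
    """Return "running" during weekday business hours, else "terminated"."""
    is_weekday = now.weekday() < 5  # Monday is 0, Sunday is 6
    in_hours = open_at <= now.time() < close_at
    return "running" if is_weekday and in_hours else "terminated"


# 2024-01-08 is a Monday; 2024-01-13 is a Saturday.
print(desired_state(datetime(2024, 1, 8, 10, 0)))
print(desired_state(datetime(2024, 1, 8, 20, 0)))
print(desired_state(datetime(2024, 1, 13, 10, 0)))
```

Because the deployment is terminated rather than deleted, the scheduler needs only its name to bring it back the next morning.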
Delete
- Decommissioning - Permanently remove services that are no longer needed
- Cleanup - Remove failed or abandoned deployments
- Resource hygiene - Keep the deployment list clean and manageable
Metadata
| Property | Value |
|---|---|
| Principle | Deployment Termination |
| Domain | ML_Serving, Cloud_Deployment, Operations |
| Workflow | BentoCloud_Deployment |
| Related Concepts | Graceful Shutdown, Resource Management, Cost Optimization |
| Implementation | Implementation:Bentoml_BentoML_Deployment_Terminate_Delete |