Principle:Tensorflow Serving Version Promotion And Rollback
| Knowledge Sources | |
|---|---|
| Domains | Version_Management, Reliability |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
A version transition mechanism that orchestrates loading new model versions and unloading old ones according to a configurable policy that balances availability against resource consumption.
Description
Version promotion and rollback are managed by the AspiredVersionsManager, which uses an AspiredVersionPolicy to determine the ordering of load/unload operations during version transitions. From the manager's point of view a rollback is the same kind of transition as a promotion: the aspired version simply changes back to an earlier one. Two policies are available:
- AvailabilityPreservingPolicy: Loads the new version first, then unloads the old one. At least one version stays available throughout the transition, but resources for both versions are needed until the old one is unloaded.
- ResourcePreservingPolicy: Unloads the old version first, then loads the new one. Minimizes peak resource usage, but no version is available while the new one is loading.
The manager runs a periodic background thread that evaluates aspired versions against currently loaded versions and executes the policy's recommended action.
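The ordering difference between the two policies can be sketched as a pair of pure functions. This is a simplified illustration only: `Snapshot`, `availability_preserving`, and `resource_preserving` are hypothetical names invented for this sketch, and the functions do not reproduce the edge-case handling of the real policy classes.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for the state of one servable version."""
    version: int
    loaded: bool   # currently available for serving
    aspired: bool  # currently requested by the source


Action = Tuple[str, int]  # ("LOAD" or "UNLOAD", version)


def _next_load(snapshots: List[Snapshot]) -> Optional[Action]:
    # Prefer the highest aspired version that is not loaded yet.
    for s in sorted(snapshots, key=lambda s: s.version, reverse=True):
        if s.aspired and not s.loaded:
            return ("LOAD", s.version)
    return None


def _next_unload(snapshots: List[Snapshot]) -> Optional[Action]:
    # Prefer the lowest loaded version that is no longer aspired.
    for s in sorted(snapshots, key=lambda s: s.version):
        if s.loaded and not s.aspired:
            return ("UNLOAD", s.version)
    return None


def availability_preserving(snapshots: List[Snapshot]) -> Optional[Action]:
    # Load first; unload only once nothing is left to load.
    return _next_load(snapshots) or _next_unload(snapshots)


def resource_preserving(snapshots: List[Snapshot]) -> Optional[Action]:
    # Unload first; load only once nothing is left to unload.
    return _next_unload(snapshots) or _next_load(snapshots)


# Version 1 is serving, version 2 has just become the aspired version.
snapshots = [
    Snapshot(version=1, loaded=True, aspired=False),
    Snapshot(version=2, loaded=False, aspired=True),
]
first_ap = availability_preserving(snapshots)
first_rp = resource_preserving(snapshots)
print(first_ap)  # ('LOAD', 2)
print(first_rp)  # ('UNLOAD', 1)
```

Both functions return `None` once the loaded set matches the aspired set, which is how a caller would know the transition has finished.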
Usage
Choose AvailabilityPreservingPolicy (default) for production systems where downtime is unacceptable. Choose ResourcePreservingPolicy when GPU/RAM constraints make it impossible to hold two versions simultaneously.
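The trade-off above comes down to peak resource usage during a transition. A rough sizing check, assuming memory footprints are known in advance and simply add up, might look like the sketch below; `peak_memory` and `choose_policy` are hypothetical helpers, not part of TensorFlow Serving, and the numbers are made up.

```python
def peak_memory(old_mb: int, new_mb: int, load_first: bool) -> int:
    """Peak memory during a transition under a simple additive model."""
    # Load-first holds both versions at once; unload-first holds only one.
    return old_mb + new_mb if load_first else max(old_mb, new_mb)


def choose_policy(old_mb: int, new_mb: int, budget_mb: int) -> str:
    """Pick the policy name that fits the budget, preferring availability."""
    if peak_memory(old_mb, new_mb, load_first=True) <= budget_mb:
        return "AvailabilityPreservingPolicy"
    if peak_memory(old_mb, new_mb, load_first=False) <= budget_mb:
        return "ResourcePreservingPolicy"
    raise ValueError("the new version does not fit in the budget at all")


print(choose_policy(old_mb=6000, new_mb=7000, budget_mb=16000))
# AvailabilityPreservingPolicy
print(choose_policy(old_mb=6000, new_mb=7000, budget_mb=8000))
# ResourcePreservingPolicy
```

Real footprints depend on the model, the device, and runtime allocations, so a check like this is only a first approximation.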
Theoretical Basis
# Abstract version transition logic (NOT the real implementation)
def manage_versions_periodic():
    for servable_name in aspired_versions_map:
        current = get_loaded_versions(servable_name)
        aspired = aspired_versions_map[servable_name]
        snapshots = build_state_snapshots(current, aspired)
        action = policy.get_next_action(snapshots)
        if action is None:
            continue  # nothing to do for this servable on this pass
        if action.type == LOAD:
            basic_manager.load(action.servable_id)
        elif action.type == UNLOAD:
            basic_manager.unload(action.servable_id)

# AvailabilityPreserving: LOAD new → UNLOAD old
# ResourcePreserving: UNLOAD old → LOAD new
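To see the loop above converge, here is a runnable toy version that promotes a model from version 1 to version 2 and then rolls it back. Everything in it is an assumption made for illustration: `ToyManager`, `load_first`, and the in-memory sets stand in for the real manager, policy, and loaders, and each call to `tick()` plays the role of one pass of the background thread.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Action = Tuple[str, int]  # ("LOAD" or "UNLOAD", version)
Policy = Callable[[Set[int], Set[int]], Optional[Action]]


def load_first(loaded: Set[int], aspired: Set[int]) -> Optional[Action]:
    """Toy load-before-unload policy (availability-preserving ordering)."""
    to_load = aspired - loaded
    if to_load:
        return ("LOAD", max(to_load))
    to_unload = loaded - aspired
    if to_unload:
        return ("UNLOAD", min(to_unload))
    return None


class ToyManager:
    def __init__(self, policy: Policy) -> None:
        self.policy = policy
        self.loaded: Dict[str, Set[int]] = {}
        self.aspired: Dict[str, Set[int]] = {}
        self.log: List[Tuple[str, str, int]] = []

    def set_aspired(self, name: str, versions: List[int]) -> None:
        self.aspired[name] = set(versions)
        self.loaded.setdefault(name, set())

    def tick(self) -> bool:
        """One evaluation pass; returns True if any action was executed."""
        acted = False
        for name, aspired in self.aspired.items():
            action = self.policy(self.loaded[name], aspired)
            if action is None:
                continue
            kind, version = action
            if kind == "LOAD":
                self.loaded[name].add(version)
            else:
                self.loaded[name].discard(version)
            self.log.append((name, kind, version))
            acted = True
        return acted

    def run_until_stable(self, max_ticks: int = 10) -> None:
        for _ in range(max_ticks):
            if not self.tick():
                return


manager = ToyManager(load_first)
manager.set_aspired("model", [1])
manager.run_until_stable()          # initial load
manager.set_aspired("model", [2])
manager.run_until_stable()          # promotion: 1 -> 2
manager.set_aspired("model", [1])
manager.run_until_stable()          # rollback: 2 -> 1
for entry in manager.log:
    print(entry)
```

The log shows each transition as a LOAD followed by an UNLOAD, and the rollback uses exactly the same path as the promotion. Swapping in an unload-first policy function would reverse the order within each transition.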