Principle:Bentoml BentoML Model Versioning
| Principle Metadata | |
|---|---|
| Principle Name | Model Versioning |
| Workflow | Model_Store_Management |
| Domain | ML_Serving, Model_Management |
| Related Principle | Principle:Bentoml_BentoML_Model_Persistence |
| Implemented By | Implementation:Bentoml_BentoML_Models_List_Get |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
Model Versioning is the principle of tracking, organizing, and querying ML model artifacts through tag-based identification. It enables teams to discover, compare, and manage model versions across their lifecycle, from initial training through deployment and eventual cleanup.
Core Concept
Tracking and organizing ML model artifacts with a tag-based system (name:version) gives every saved model an immutable identifier. Listing and querying models enables discovery, comparison, and lifecycle management across development and production environments.
Theory
Each model in the store is uniquely identified by its tag, written name:version, where:
- name groups related models (e.g., "text_classifier", "image_encoder")
- version distinguishes individual saves (auto-generated or user-specified)
Each model also carries metadata (labels, creation time, framework info) that supports organization and filtering across teams.
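As a small illustration of the name:version structure, the sketch below splits a tag string into its two parts. The `split_tag` helper is hypothetical and written only for this page; BentoML has its own tag handling, which this sketch does not try to reproduce.

```python
from typing import Optional, Tuple


def split_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Split "name:version" into (name, version); version is None if omitted.

    Hypothetical helper for illustration only, not part of BentoML.
    """
    name, sep, version = tag.partition(":")
    if not name:
        raise ValueError(f"tag {tag!r} has no name")
    if sep and not version:
        raise ValueError(f"tag {tag!r} has an empty version")
    return name, (version if sep else None)


print(split_tag("text_classifier:v1"))  # ('text_classifier', 'v1')
print(split_tag("image_encoder"))       # ('image_encoder', None)
```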
Immutable Identification
Once a model is saved with a specific tag, that tag permanently references the same artifact. There is no mechanism to update a model in place; a new save always produces a new version. This guarantees that deployments referencing a specific tag are deterministic and reproducible.
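The immutability rule can be sketched with a toy in-memory store. `ToyModelStore` and its methods are assumptions made for illustration and are not BentoML's store implementation; the point is only that a save never overwrites an existing tag.

```python
import uuid
from typing import Dict, Optional


class ToyModelStore:
    """In-memory stand-in for a model store (illustration only)."""

    def __init__(self) -> None:
        self._artifacts: Dict[str, bytes] = {}

    def save(self, name: str, artifact: bytes, version: Optional[str] = None) -> str:
        # Generate a fresh version when the caller does not supply one.
        version = version or uuid.uuid4().hex[:12]
        tag = f"{name}:{version}"
        if tag in self._artifacts:
            # Refuse to overwrite: an existing tag keeps pointing at the same bytes.
            raise FileExistsError(f"{tag} already exists")
        self._artifacts[tag] = artifact
        return tag

    def load(self, tag: str) -> bytes:
        return self._artifacts[tag]


store = ToyModelStore()
first = store.save("my_model", b"weights-a")
second = store.save("my_model", b"weights-b")
print(first != second)    # True
print(store.load(first))  # b'weights-a'
```

Because the second save produced a new tag, anything that pinned the first tag still loads the original artifact.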
The "latest" Convention
The store maintains a "latest" symlink for each model name that points to the most recently saved version. When a tag is specified without a version (e.g., "my_model"), it resolves to "my_model:latest". This provides a convenient default while still allowing pinning to specific versions.
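The resolution rule can be sketched as a per-name pointer that moves on every save. `LatestIndex` is a hypothetical class that keeps the pointer in a dictionary; it models the behavior described above, not how BentoML stores it on disk.

```python
from typing import Dict, Set


class LatestIndex:
    """Tracks known versions and a per-name "latest" pointer (illustration only)."""

    def __init__(self) -> None:
        self._versions: Dict[str, Set[str]] = {}
        self._latest: Dict[str, str] = {}

    def record_save(self, name: str, version: str) -> None:
        self._versions.setdefault(name, set()).add(version)
        # Every save moves the pointer to the version just written.
        self._latest[name] = version

    def resolve(self, tag: str) -> str:
        name, _, version = tag.partition(":")
        if name not in self._latest:
            raise KeyError(f"no model named {name!r}")
        if version in ("", "latest"):
            # "name" and "name:latest" both follow the pointer.
            return f"{name}:{self._latest[name]}"
        if version not in self._versions[name]:
            raise KeyError(f"{tag} not found")
        return f"{name}:{version}"


index = LatestIndex()
index.record_save("my_model", "v1")
index.record_save("my_model", "v2")
print(index.resolve("my_model"))         # my_model:v2
print(index.resolve("my_model:latest"))  # my_model:v2
print(index.resolve("my_model:v1"))      # my_model:v1
```

A pinned tag such as "my_model:v1" keeps resolving to the same version no matter how many later saves move the pointer.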
Metadata for Organization
Each model version carries structured metadata that supports organization:
- Labels: Key-value pairs designed for filtering and categorization (e.g., {"team": "nlp", "stage": "production"})
- Creation Time: Timestamp of when the model was saved, enabling chronological ordering
- Framework Info: The ML framework and Python version used, captured in ModelContext
- Custom Metadata: Arbitrary data such as accuracy metrics, training hyperparameters, or dataset identifiers
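To make the shape of this metadata concrete, here is a minimal sketch of one record per model version. The `ModelRecord` class, its field names, and the example values are all made up for illustration; they do not mirror BentoML's internal classes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass(frozen=True)
class ModelRecord:
    """Illustrative shape of per-version metadata; field names are assumptions."""

    tag: str
    framework: str
    python_version: str
    # Labels are flat string pairs meant for filtering.
    labels: Dict[str, str] = field(default_factory=dict)
    # Custom metadata is free-form and can hold any value.
    metadata: Dict[str, Any] = field(default_factory=dict)
    creation_time: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


record = ModelRecord(
    tag="text_classifier:v1",
    framework="sklearn",
    python_version="3.11",
    labels={"team": "nlp", "stage": "production"},
    metadata={"accuracy": 0.93, "dataset": "reviews-2024"},
)

print(record.labels["stage"])       # production
print(record.metadata["accuracy"])  # 0.93
```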
Querying and Discovery
The versioning system supports:
- List all models: Retrieve all models in the store, sorted by creation time
- Filter by name: List all versions of a specific model name
- Get exact version: Retrieve a specific model by its full name:version tag
- Latest resolution: Automatically resolve unversioned tags to the latest version
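The four operations above can be sketched over a plain list of entries. `Entry`, `list_models`, and `get_model` are hypothetical names, the timestamps are invented, and the oldest-first sort direction is an assumption, since the text only says results are sorted by creation time.

```python
from datetime import datetime, timezone
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    name: str
    version: str
    created: datetime


def list_models(entries: List[Entry], name: Optional[str] = None) -> List[Entry]:
    """Return entries oldest first, optionally restricted to one model name."""
    selected = [e for e in entries if name is None or e.name == name]
    return sorted(selected, key=lambda e: e.created)


def get_model(entries: List[Entry], tag: str) -> Entry:
    """Return the entry for "name:version"; a bare name or ":latest" picks the newest."""
    name, _, version = tag.partition(":")
    candidates = list_models(entries, name)
    if not candidates:
        raise KeyError(f"no model named {name!r}")
    if version in ("", "latest"):
        return candidates[-1]
    for entry in candidates:
        if entry.version == version:
            return entry
    raise KeyError(f"{tag} not found")


def _ts(day: int) -> datetime:
    # Invented timestamps, one per day, to make the ordering visible.
    return datetime(2024, 1, day, tzinfo=timezone.utc)


entries = [
    Entry("text_classifier", "v2", _ts(3)),
    Entry("image_encoder", "v1", _ts(2)),
    Entry("text_classifier", "v1", _ts(1)),
]

print([f"{e.name}:{e.version}" for e in list_models(entries)])
# ['text_classifier:v1', 'image_encoder:v1', 'text_classifier:v2']
print(get_model(entries, "text_classifier").version)     # v2
print(get_model(entries, "text_classifier:v1").version)  # v1
```

In BentoML itself, these roles are played by functions such as bentoml.models.list and bentoml.models.get; the sketch only mirrors the behavior described in the list above.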
Design Principles
Chronological Ordering
Models are sorted by creation time, making it straightforward to find the most recent version or track how a model has evolved over time.
Label-Based Filtering
Labels provide a flexible, user-defined categorization system. Unlike metadata (which is free-form), labels are intended to be used for querying and filtering:

    import bentoml

    # Save with labels
    with bentoml.models.create("my_model", labels={"stage": "production"}) as m:
        ...

    # List every version of "my_model", then keep only those labeled as production
    all_models = bentoml.models.list("my_model")
    prod_models = [m for m in all_models if m.info.labels.get("stage") == "production"]
Tag Semantics
| Tag Format | Resolution Behavior |
|---|---|
| "name" | Resolves to "name:latest" |
| "name:latest" | Resolves to the most recently saved version |
| "name:version" | Resolves to the exact version |
Relationship to Other Principles
- Model Persistence: Versioning relies on the persistence layer to create new versioned entries in the store.
- Model Loading From Store: The versioning system provides the tag resolution mechanism used by BentoModel.
- Model Cleanup: Deletion of specific versions interacts with the versioning system, particularly the "latest" symlink.
- Model Cloud Sync: Push and pull operations transfer specific versioned models.