Principle:Bentoml BentoML Model Cloud Sync
| Principle Metadata | |
|---|---|
| Principle Name | Model Cloud Sync |
| Workflow | Model_Store_Management |
| Domain | ML_Serving, Model_Management, Cloud_Deployment |
| Related Principle | Principle:Bentoml_BentoML_Model_Export_Import, Principle:Bentoml_BentoML_Model_Persistence |
| Implemented By | Implementation:Bentoml_BentoML_Models_Push_Pull |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
Model Cloud Sync is the principle of synchronizing model artifacts between a local BentoML model store and BentoCloud's centralized registry. It enables bi-directional transfer of models, supporting collaborative workflows where models trained locally are shared via a centralized registry and pulled into deployment environments.
Core Concept
Sync rests on two operations, push and pull, which together make transfer between a local model store and BentoCloud's registry bi-directional. In a typical team workflow, a model trained locally is pushed to the shared registry, then pulled by deployment environments or other team members.
Theory
The cloud sync mechanism addresses the core challenge of getting ML models to every team and environment that needs them:
- Push: Uploads a model from the local store to BentoCloud, making it available to all authenticated users and deployment environments.
- Pull: Downloads a model from BentoCloud to the local store, making it available for local serving, testing, or further development.
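As a minimal sketch of how the two directions map onto code, the wrapper below dispatches to `bentoml.models.push` or `bentoml.models.pull`. The function name `sync_model` and its `models_api` parameter are hypothetical, introduced only so the dispatch logic can be exercised without a BentoCloud session. The `force` keyword is the one described under Force Overwrite.

```python
from typing import Any, Optional


def sync_model(tag: str, direction: str, *, force: bool = False,
               models_api: Optional[Any] = None) -> None:
    """Push a model to BentoCloud or pull it into the local store.

    `sync_model` and `models_api` are hypothetical names for this sketch.
    """
    if direction not in ("push", "pull"):
        raise ValueError(f"direction must be 'push' or 'pull', got {direction!r}")
    if models_api is None:
        # Imported lazily so the sketch can be read and tested without BentoML.
        import bentoml
        models_api = bentoml.models
    # Resolves to bentoml.models.push or bentoml.models.pull.
    operation = getattr(models_api, direction)
    operation(tag, force=force)
```

With an authenticated BentoCloud session, `sync_model("sentiment_model:latest", "push")` would upload the model and `sync_model("sentiment_model:latest", "pull")` would download it.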
Multipart Transfer
Large ML models (often gigabytes in size) require efficient transfer mechanisms. BentoML uses multipart upload and download with parallel threads (default: 10 threads) to maximize throughput. This breaks the model into chunks that are transferred concurrently, significantly reducing transfer time for large artifacts.
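The chunk-and-parallelize idea can be illustrated with the standard library alone. This is a sketch of the general technique, not BentoML's transfer code: every name here is hypothetical, `upload_part` is a stand-in for a real network request, and the worker count of 10 simply mirrors the default stated above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

DEFAULT_THREADS = 10  # mirrors the default stated above


def split_into_parts(data: bytes, part_size: int) -> List[Tuple[int, bytes]]:
    """Cut the payload into numbered parts of at most `part_size` bytes."""
    return [
        (number, data[offset:offset + part_size])
        for number, offset in enumerate(range(0, len(data), part_size))
    ]


def upload_part(part: Tuple[int, bytes]) -> Tuple[int, bytes]:
    """Stand-in for one network request; a real client would send the bytes."""
    number, payload = part
    return number, payload


def multipart_upload(data: bytes, part_size: int,
                     threads: int = DEFAULT_THREADS) -> Dict[int, bytes]:
    """Transfer all parts concurrently and return them keyed by part number."""
    parts = split_into_parts(data, part_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map runs upload_part on worker threads and re-raises any error.
        return dict(pool.map(upload_part, parts))


def assemble(remote_parts: Dict[int, bytes]) -> bytes:
    """Rebuild the artifact by joining parts in part-number order."""
    return b"".join(remote_parts[n] for n in sorted(remote_parts))
```

Numbering each part is what lets the receiving side reassemble the artifact correctly even though parts may finish in any order.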
Centralized Registry
BentoCloud serves as the single source of truth for model artifacts in team environments. This provides:
- Discovery: Team members can browse and search for available models
- Access Control: BentoCloud manages authentication and authorization
- Deployment Integration: Deployment pipelines can pull models directly from the registry
- Version History: The full version history of each model is maintained centrally
Design Principles
Idempotent Operations
Push and pull operations are designed to be idempotent. Pushing a model that already exists in BentoCloud (with the same tag) is a no-op unless the force flag is set. Similarly, pulling a model that already exists locally skips the download.
Force Overwrite
The force parameter allows explicit overwrite of existing models, both for push (overwrite in BentoCloud) and pull (overwrite in local store). This is useful for correcting mistakes or updating models that were saved with the same tag.
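The interplay between idempotency and the force flag can be sketched with an in-memory stand-in for the registry. The class `InMemoryRegistry` and its methods are hypothetical and model only the skip-or-overwrite decision, not BentoCloud's behavior in full.

```python
from typing import Dict


class InMemoryRegistry:
    """Hypothetical stand-in for a remote registry, keyed by model tag."""

    def __init__(self) -> None:
        self._artifacts: Dict[str, bytes] = {}

    def push(self, tag: str, artifact: bytes, force: bool = False) -> bool:
        """Store the artifact; return True if written, False if skipped."""
        if tag in self._artifacts and not force:
            # Same tag already present: repeating the push changes nothing.
            return False
        self._artifacts[tag] = artifact
        return True

    def get(self, tag: str) -> bytes:
        """Return the artifact currently stored under the tag."""
        return self._artifacts[tag]


registry = InMemoryRegistry()
registry.push("sentiment_model:v1", b"weights-a")              # written
registry.push("sentiment_model:v1", b"weights-b")              # skipped, tag exists
registry.push("sentiment_model:v1", b"weights-b", force=True)  # overwritten
```

A pull into the local store would follow the same pattern with the roles of the two stores reversed.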
Authentication Required
All cloud sync operations require an authenticated BentoCloud session, which restricts model artifacts, including proprietary models, to authorized users.
Complementary to Export/Import
Cloud sync and export/import serve complementary purposes:
| Aspect | Cloud Sync (Push/Pull) | Export/Import |
|---|---|---|
| Requires Registry | Yes (BentoCloud) | No |
| Transfer Mechanism | HTTP multipart | File copy / fsspec |
| Authentication | BentoCloud credentials | Protocol-specific (e.g., S3 keys) |
| Team Discovery | Built-in (BentoCloud UI/API) | Manual sharing |
| Best For | Team collaboration, deployment | Backup, air-gapped transfer, CI artifacts |
Workflow Patterns
Train and Push
    import bentoml

    # Data scientist trains and saves a model
    with bentoml.models.create("sentiment_model", labels={"stage": "candidate"}) as m:
        # save_model is a placeholder for framework-specific serialization code
        save_model(m.path)

    # Push to BentoCloud for team review
    bentoml.models.push("sentiment_model:latest")
Deploy from Registry
    import bentoml
    from bentoml.models import BentoModel

    # Deployment environment pulls the approved model
    bentoml.models.pull("sentiment_model:v1_approved")

    # Or use BentoModel for automatic pull in services
    @bentoml.service
    class SentimentService:
        model = BentoModel("sentiment_model:v1_approved")
Relationship to Other Principles
- Model Persistence: Models must be persisted locally before they can be pushed.
- Model Loading From Store: BentoModel can automatically pull models during resolution, leveraging this principle transparently.
- Model Export/Import: Provides an alternative distribution mechanism for environments without BentoCloud access.
- Model Versioning: Push/pull operations transfer specific versioned models, preserving tag semantics.