Principle:Bentoml BentoML Model Cloud Sync

From Leeroopedia
Principle Metadata
Principle Name: Model Cloud Sync
Workflow: Model_Store_Management
Domain: ML_Serving, Model_Management, Cloud_Deployment
Related Principle: Principle:Bentoml_BentoML_Model_Export_Import, Principle:Bentoml_BentoML_Model_Persistence
Implemented By: Implementation:Bentoml_BentoML_Models_Push_Pull
Last Updated: 2026-02-13 15:00 GMT

Overview

Model Cloud Sync is the principle of synchronizing model artifacts between a local BentoML model store and BentoCloud's centralized registry. It enables bi-directional transfer of models, supporting collaborative workflows where models trained locally are shared via a centralized registry and pulled into deployment environments.

Core Concept

Synchronization is built from two operations, push and pull, that move models between a local model store and BentoCloud. A typical team workflow is to train a model locally, push it to the shared registry, and then pull it into a deployment environment or onto another team member's machine.

Theory

The cloud sync mechanism addresses the challenge of sharing ML models across distributed teams and environments:

  • Push: Uploads a model from the local store to BentoCloud, making it available to all authenticated users and deployment environments.
  • Pull: Downloads a model from BentoCloud to the local store, making it available for local serving, testing, or further development.

Multipart Transfer

Large ML models (often gigabytes in size) require efficient transfer mechanisms. BentoML uses multipart upload and download with parallel threads (default: 10 threads) to maximize throughput. This breaks the model into chunks that are transferred concurrently, significantly reducing transfer time for large artifacts.
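
The chunk-and-transfer-concurrently idea can be illustrated with a minimal, standard-library-only sketch. This is not BentoML's implementation: the helper names (split_into_parts, transfer_part, multipart_transfer) and the part size are hypothetical, and the network request is replaced by an in-memory copy. Only the default of 10 threads comes from the text above.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(payload: bytes, part_size: int) -> list[tuple[int, bytes]]:
    """Return (part_number, chunk) pairs that cover the payload in order."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [
        (number, payload[offset:offset + part_size])
        for number, offset in enumerate(range(0, len(payload), part_size))
    ]


def transfer_part(part: tuple[int, bytes]) -> tuple[int, bytes]:
    """Stand-in for one network request (hypothetical).

    A real client would upload or download the chunk here; this sketch
    just copies the bytes so the example stays self-contained.
    """
    number, chunk = part
    return number, bytes(chunk)


def multipart_transfer(payload: bytes, part_size: int, threads: int = 10) -> bytes:
    """Transfer all parts on a thread pool, then reassemble by part number."""
    parts = split_into_parts(payload, part_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        received = dict(pool.map(transfer_part, parts))
    # Reassembly is keyed on the part number, not on completion order.
    return b"".join(received[number] for number in range(len(parts)))


artifact = bytes(range(256)) * 64  # 16 KiB stand-in for a model file
restored = multipart_transfer(artifact, part_size=1024)
```

Keying the reassembly on the part number is what keeps the result correct even if parts finish out of order, which is the usual situation when requests run in parallel.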

Centralized Registry

BentoCloud serves as the single source of truth for model artifacts in team environments. This provides:

  • Discovery: Team members can browse and search for available models
  • Access Control: BentoCloud manages authentication and authorization
  • Deployment Integration: Deployment pipelines can pull models directly from the registry
  • Version History: The full version history of each model is maintained centrally

Design Principles

Idempotent Operations

Push and pull operations are designed to be idempotent. Pushing a model that already exists in BentoCloud (with the same tag) is a no-op unless the force flag is set. Similarly, pulling a model that already exists locally skips the download.

Force Overwrite

The force parameter allows explicit overwrite of existing models, both for push (overwrite in BentoCloud) and pull (overwrite in local store). This is useful for correcting mistakes or updating models that were saved with the same tag.
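
The interaction between idempotency and force can be sketched with a toy in-memory model. Everything here (ToyStore, sync, push, pull) is hypothetical and stands in for the behaviour described above; it does not call or reproduce the BentoML API.

```python
class ToyStore:
    """Hypothetical stand-in for a model store: maps a tag to model bytes."""

    def __init__(self) -> None:
        self.models: dict[str, bytes] = {}


def sync(tag: str, source: ToyStore, target: ToyStore, force: bool = False) -> bool:
    """Copy a tagged model from source to target.

    Returns True if data was transferred, False if the transfer was skipped
    because the tag already exists in the target and force is not set.
    """
    if tag not in source.models:
        raise KeyError(f"model {tag!r} not found in source store")
    if tag in target.models and not force:
        return False  # idempotent: repeating the call changes nothing
    target.models[tag] = source.models[tag]
    return True


def push(tag: str, local: ToyStore, cloud: ToyStore, force: bool = False) -> bool:
    """Push is a sync from the local store to the registry."""
    return sync(tag, local, cloud, force)


def pull(tag: str, local: ToyStore, cloud: ToyStore, force: bool = False) -> bool:
    """Pull is a sync from the registry to the local store."""
    return sync(tag, cloud, local, force)


local, cloud = ToyStore(), ToyStore()
local.models["sentiment_model:v1"] = b"weights-a"

first_push = push("sentiment_model:v1", local, cloud)    # transfers
second_push = push("sentiment_model:v1", local, cloud)   # skipped, tag exists
local.models["sentiment_model:v1"] = b"weights-b"        # corrected model, same tag
forced_push = push("sentiment_model:v1", local, cloud, force=True)  # overwrites
```

In this sketch a repeated push leaves the registry unchanged, and only the forced push replaces the stored bytes, which mirrors the "correcting mistakes" use case for force.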

Authentication Required

All cloud sync operations require an authenticated BentoCloud session. This restricts model artifacts to authorized users and protects proprietary models from unauthorized access.

Complementary to Export/Import

Cloud sync and export/import serve complementary purposes:

Aspect | Cloud Sync (Push/Pull) | Export/Import
Requires Registry | Yes (BentoCloud) | No
Transfer Mechanism | HTTP multipart | File copy / fsspec
Authentication | BentoCloud credentials | Protocol-specific (e.g., S3 keys)
Team Discovery | Built-in (BentoCloud UI/API) | Manual sharing
Best For | Team collaboration, deployment | Backup, air-gapped transfer, CI artifacts

Workflow Patterns

Train and Share

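import bentoml

# Data scientist trains and saves a model.
# save_model is a placeholder for the framework's own serialization call;
# it should write the model files into the directory m.path.
with bentoml.models.create("sentiment_model", labels={"stage": "candidate"}) as m:
    save_model(m.path)

# Push to BentoCloud for team review
bentoml.models.push("sentiment_model:latest")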

Deploy from Registry

import bentoml
from bentoml.models import BentoModel

# Deployment environment pulls the approved model
bentoml.models.pull("sentiment_model:v1_approved")

# Or use BentoModel for automatic pull in services
@bentoml.service
class SentimentService:
    model = BentoModel("sentiment_model:v1_approved")

Relationship to Other Principles

  • Model Persistence: Models must be persisted locally before they can be pushed.
  • Model Loading From Store: BentoModel can automatically pull models during resolution, leveraging this principle transparently.
  • Model Export/Import: Provides an alternative distribution mechanism for environments without BentoCloud access.
  • Model Versioning: Push/pull operations transfer specific versioned models, preserving tag semantics.
