Implementation:Bentoml BentoML Models Push Pull

From Leeroopedia
Implementation Metadata
Implementation Name: Models Push Pull
API: bentoml.models.push(), bentoml.models.pull()
Source: src/bentoml/models.py:L211-233 (public API), src/bentoml/_internal/cloud/model.py:L56-507 (implementation)
Workflow: Model_Store_Management
Domain: ML_Serving, Model_Management, Cloud_Deployment
Implements: Principle:Bentoml_BentoML_Model_Cloud_Sync
Last Updated: 2026-02-13 15:00 GMT

Overview

The bentoml.models.push() and bentoml.models.pull() functions provide bi-directional synchronization between the local BentoML model store and BentoCloud's centralized registry. They use multipart upload/download with parallel threads for efficient transfer of large model artifacts.

Import

import bentoml

Signatures

def push(tag: Tag | str, *, force: bool = False) -> None

def pull(tag: Tag | str, *, force: bool = False) -> Model | None

Parameters

bentoml.models.push()

Parameter Type Default Description
tag Tag | str required The tag of the model to push from the local store to BentoCloud.
force bool False If True, overwrite the model in BentoCloud even if a model with the same tag already exists there.

bentoml.models.pull()

Parameter Type Default Description
tag Tag | str required The tag of the model to pull from BentoCloud to the local store.
force bool False If True, overwrite the local model even if a model with the same tag already exists locally.
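
The force flag works the same way in both directions: an existing model at the destination is left alone unless force=True. The sketch below illustrates that rule with a plain dictionary standing in for the destination store. The sync_model function and the store dictionary are hypothetical names used for illustration only and are not part of the BentoML API.

```python
from typing import Dict


def sync_model(store: Dict[str, bytes], tag: str, payload: bytes, *, force: bool = False) -> bool:
    """Copy payload into store under tag; return True if a write happened.

    `store` is a hypothetical stand-in for the destination: BentoCloud for
    a push, the local model store for a pull.
    """
    if tag in store and not force:
        # The destination already holds this tag, so the transfer is skipped.
        return False
    store[tag] = payload
    return True


remote = {"text_classifier:v2": b"old weights"}

# Without force, the existing entry is kept and nothing is written.
skipped = sync_model(remote, "text_classifier:v2", b"new weights")

# With force=True, the existing entry is overwritten.
overwritten = sync_model(remote, "text_classifier:v2", b"new weights", force=True)
```

Returning a boolean here is a choice made for the sketch so the skip is visible to the caller; the real push() returns None.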

Inputs and Outputs

push()

Inputs:

  • Model tag (must exist in local store); requires authenticated BentoCloud session

Outputs:

  • None — the model is uploaded to BentoCloud as a side effect

pull()

Inputs:

  • Model tag (must exist in BentoCloud); requires authenticated BentoCloud session

Outputs:

  • Model | None — the pulled model instance, or None if the pull failed

Internal Implementation Details

The public API functions in src/bentoml/models.py delegate to the cloud implementation in src/bentoml/_internal/cloud/model.py:

  • Multipart Upload/Download: Model artifacts are split into chunks and transferred by parallel threads (threads=10 by default) to improve throughput.
  • Progress Tracking: Transfer progress is reported during push/pull operations.
  • Manifest Management: The cloud implementation manages model manifests that track the list of files and their checksums for integrity verification.
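
The chunked, multi-threaded transfer described above can be sketched with the standard library alone. This is a minimal illustration of the general technique and not the code in src/bentoml/_internal/cloud/model.py: the chunk size, the send_part callback, and all function names are assumptions made for the example, and only the thread count of 10 comes from this page.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow
THREADS = 10    # mirrors the default thread count mentioned above


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[Tuple[int, bytes]]:
    """Return (part_number, chunk) pairs that cover data in order."""
    return [
        (i // chunk_size, data[i : i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def transfer_parallel(
    data: bytes,
    send_part: Callable[[int, bytes], None],
    threads: int = THREADS,
) -> int:
    """Send every chunk through send_part on a thread pool; return the part count."""
    parts = split_into_chunks(data)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() waits for all parts and re-raises any exception from a worker.
        list(pool.map(lambda part: send_part(*part), parts))
    return len(parts)


# Hypothetical receiver: store parts by number, then reassemble them in order.
received = {}
count = transfer_parallel(
    b"model-weights-bytes",
    lambda number, chunk: received.__setitem__(number, chunk),
)
reassembled = b"".join(received[number] for number in sorted(received))
```

Numbering the parts lets the receiver reassemble them correctly even though the threads may finish in any order.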

Usage Examples

import bentoml

# Push a model to BentoCloud
bentoml.models.push("text_classifier:latest")

# Push with force overwrite
bentoml.models.push("text_classifier:v2", force=True)

# Pull a model from BentoCloud
model = bentoml.models.pull("text_classifier:production")
if model is not None:
    print(f"Pulled: {model.tag} to {model.path}")

# Pull with force overwrite of local copy
model = bentoml.models.pull("text_classifier:production", force=True)

Authentication

Push and pull operations require an authenticated BentoCloud session. Authentication is typically configured via:

bentoml cloud login --api-token <token> --endpoint <endpoint>

Or by setting environment variables:

export BENTO_CLOUD_API_KEY=<token>
export BENTO_CLOUD_API_ENDPOINT=<endpoint>

Behavior Details

  • Push: Reads the model from the local store, uploads artifact files via multipart HTTP upload to BentoCloud, and registers the model metadata. If the model already exists in BentoCloud and force=False, the operation is skipped.
  • Pull: Downloads model artifact files via multipart HTTP download from BentoCloud and reconstructs the model in the local store. If the model already exists locally and force=False, the operation is skipped.
  • Parallel Transfer: Both push and pull use 10 concurrent threads by default to transfer file chunks, which can improve throughput for large models.
  • Integrity: File checksums are verified after transfer to ensure data integrity.
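
The integrity step can be pictured as comparing each received file against a manifest of expected checksums. The sketch below assumes SHA-256 and a manifest shaped as a path-to-digest mapping; this page does not state which hash algorithm or manifest format BentoCloud uses, and build_manifest and find_corrupt_files are hypothetical helper names.

```python
import hashlib
from typing import Dict, List


def build_manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each file path to the SHA-256 hex digest of its content."""
    return {path: hashlib.sha256(content).hexdigest() for path, content in files.items()}


def find_corrupt_files(manifest: Dict[str, str], received: Dict[str, bytes]) -> List[str]:
    """Return the sorted paths that are missing or whose digest does not match."""
    bad = []
    for path, expected in manifest.items():
        content = received.get(path)
        if content is None or hashlib.sha256(content).hexdigest() != expected:
            bad.append(path)
    return sorted(bad)


source_files = {"model.bin": b"weights", "config.json": b"{}"}
manifest = build_manifest(source_files)

# Simulate a transfer in which one file arrived with altered content.
damaged = {"model.bin": b"weightz", "config.json": b"{}"}
corrupt = find_corrupt_files(manifest, damaged)
```

A non-empty result would signal that the transfer should be retried or rejected before the model is registered in the destination store.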

Source Reference

  • Public API: src/bentoml/models.py, lines 211-233
  • Cloud implementation: src/bentoml/_internal/cloud/model.py, lines 56-507
