Implementation:Bentoml BentoML Cloud ModelAPI

From Leeroopedia
Revision as of 12:06, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Bentoml_BentoML_Cloud_ModelAPI.md)
Knowledge Sources
Domains: Cloud, Model Management, Remote Storage
Last Updated: 2026-02-13 15:00 GMT

Overview

The ModelAPI class provides push, pull, list, and get operations for synchronizing BentoML models between local and remote (BentoCloud) model stores.

Description

ModelAPI is a frozen attrs class (declared with @attrs.frozen) that wraps a RestApiClient and exposes methods for uploading (pushing) and downloading (pulling) ML models to and from a remote BentoCloud model store. It supports several transmission strategies: direct proxy upload, presigned URL upload, and multipart upload with chunking and retry logic. The class uses a Spinner for progress reporting and a threading Lock to guard model repository creation against concurrent calls.
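The role of the Lock can be illustrated with a small get-or-create sketch. This is not BentoML's code: `FakeRemoteStore`, `ensure_repository`, and the `create_calls` counter are hypothetical names, and the in-memory dictionary stands in for the remote store. It only shows why a check-then-create sequence needs to be serialized when several threads work on the same repository.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeRemoteStore:
    """Hypothetical in-memory stand-in for a remote model store."""

    def __init__(self):
        self.repositories = {}
        self.create_calls = 0
        self._lock = threading.Lock()

    def ensure_repository(self, name):
        # The lookup and the creation must happen as one atomic step;
        # otherwise two threads could both see the repository as missing
        # and both try to create it.
        with self._lock:
            repo = self.repositories.get(name)
            if repo is None:
                self.create_calls += 1
                repo = {"name": name}
                self.repositories[name] = repo
            return repo


store = FakeRemoteStore()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(store.ensure_repository, ["my_model"] * 32))
```

In this sketch, all 32 calls receive the same repository object and the creation branch runs once.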

Key internal behaviors:

  • Push: Creates a tar archive of the local model, determines the transmission strategy (proxy, presigned URL, or multipart), uploads the archive with progress tracking, and finalizes the upload status.
  • Pull: Resolves the model version (including "latest"), downloads the tar archive via proxy or presigned URL, extracts it into a temporary directory, and saves it to the local model store.
  • List: Retrieves all models from the remote store sorted by creation date.
  • Get: Fetches a specific model by name and version from the remote store.
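The archive step that Push and Pull share can be sketched with the standard library alone. The helpers `pack_directory` and `unpack_archive`, the file names, and the directory layout below are illustrative assumptions rather than BentoML internals; the sketch only demonstrates a tar round trip from a model directory into a temporary staging directory.

```python
import io
import tarfile
import tempfile
from pathlib import Path


def pack_directory(src):
    """Pack the contents of `src` into an in-memory, uncompressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:") as tar:
        for path in sorted(src.rglob("*")):
            # Store paths relative to the model directory root.
            tar.add(path, arcname=path.relative_to(src).as_posix(), recursive=False)
    return buffer.getvalue()


def unpack_archive(data, dest):
    """Extract an in-memory tar archive into `dest`."""
    # Use the safer "data" extraction filter where the interpreter supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
        tar.extractall(dest, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical local model directory.
    src = Path(tmp) / "local_model"
    (src / "weights").mkdir(parents=True)
    (src / "model.yaml").write_text("name: my_model\n")
    (src / "weights" / "w.bin").write_bytes(b"\x00\x01\x02")

    archive = pack_directory(src)  # push side: build the archive

    staging = Path(tmp) / "staging"  # pull side: extract to a temp location
    staging.mkdir()
    unpack_archive(archive, staging)

    restored = sorted(p.relative_to(staging).as_posix() for p in staging.rglob("*"))
    restored_weights = (staging / "weights" / "w.bin").read_bytes()
```

Extracting into a staging directory first means a failed or partial download never leaves a half-written entry in the destination store.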

Usage

ModelAPI is used internally by BentoML's cloud integration layer when users push or pull models to or from BentoCloud. End users do not typically instantiate it directly; it is invoked through the BentoCloud client CLI or SDK commands.

Code Reference

Source Location

Signature

@attrs.frozen
class ModelAPI:
    _client: RestApiClient
    spinner: Spinner
    _lock: Lock

    def push(self, model: Model[t.Any], *, force: bool = False, threads: int = 10) -> None: ...

    def pull(self, tag: str | Tag, *, force: bool = False,
             model_store: ModelStore = ..., query: str | None = None) -> StoredModel | None: ...

    def list(self) -> ModelWithRepositoryListSchema: ...

    def get(self, name: str, version: str | None = None) -> ModelSchema: ...

Import

from bentoml._internal.cloud.model import ModelAPI

I/O Contract

Inputs

push()

Name | Type | Required | Description
model | Model[Any] | Yes | The BentoML model instance to push to the remote store
force | bool | No | Whether to force push even if the model already exists remotely (default: False)
threads | int | No | Number of threads for multipart upload (default: 10)
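To make the `threads` parameter concrete, the following is a minimal sketch of a chunked, multi-threaded upload with per-part retry. It is an illustration under stated assumptions, not the ModelAPI implementation: `split_into_chunks`, `upload_with_retry`, `multipart_upload`, the `flaky_upload_part` callback, the `etag-N` return values, the chunk size, and the choice of `ConnectionError` as the retriable error are all hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, chunk_size):
    """Split `data` into consecutive pieces of at most `chunk_size` bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_with_retry(upload_part, part_number, chunk, max_attempts=3, backoff=0.0):
    """Call `upload_part`, retrying on ConnectionError up to `max_attempts` times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_part(part_number, chunk)
        except ConnectionError:
            if attempt == max_attempts:
                raise
            time.sleep(backoff * attempt)


def multipart_upload(data, upload_part, *, chunk_size, threads=10):
    """Upload chunks concurrently and return the per-part results in part order."""
    chunks = split_into_chunks(data, chunk_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [
            pool.submit(upload_with_retry, upload_part, number, chunk)
            for number, chunk in enumerate(chunks, start=1)
        ]
        return [future.result() for future in futures]


# Hypothetical backend that fails the first attempt for part 2.
received = {}
failed_once = set()
state_lock = threading.Lock()


def flaky_upload_part(part_number, chunk):
    with state_lock:
        if part_number == 2 and part_number not in failed_once:
            failed_once.add(part_number)
            raise ConnectionError("simulated transient failure")
        received[part_number] = chunk
    return f"etag-{part_number}"


payload = bytes(range(256)) * 40  # 10240 bytes, so three 4096-byte parts
etags = multipart_upload(payload, flaky_upload_part, chunk_size=4096, threads=4)
reassembled = b"".join(received[n] for n in sorted(received))
```

Collecting results from the futures in submission order keeps the part results aligned with part numbers even when the uploads finish out of order.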

pull()

Name | Type | Required | Description
tag | str or Tag | Yes | The tag of the model to pull from the remote store
force | bool | No | Whether to force pull even if the model exists locally (default: False)
model_store | ModelStore | No | The local model store to save to (injected via DI)
query | str or None | No | Query string for filtering the latest model version

get()

Name | Type | Required | Description
name | str | Yes | The name of the model repository
version | str or None | No | The specific version to retrieve; if None or "latest", fetches the latest
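The version rule in the table above can be sketched as a small lookup function. The sketch assumes that "latest" means the most recently created version, which the source does not state explicitly, and it resolves against a local dictionary, whereas the real lookup goes to the remote store. `resolve_version`, the local `NotFound` class, and the sample data are hypothetical.

```python
from datetime import datetime


class NotFound(Exception):
    """Local stand-in for a not-found error (hypothetical)."""


def resolve_version(versions, requested=None):
    """Return the concrete version name for `requested`.

    `versions` maps a version name to its creation time. A request of
    None or "latest" resolves to the most recently created version
    (an assumption made for this sketch).
    """
    if not versions:
        raise NotFound("repository has no versions")
    if requested is None or requested == "latest":
        return max(versions, key=versions.get)
    if requested not in versions:
        raise NotFound(f"version {requested!r} not found")
    return requested


# Hypothetical repository contents.
versions = {
    "v1": datetime(2026, 1, 5),
    "v2": datetime(2026, 2, 1),
    "v3": datetime(2026, 1, 20),
}

default_version = resolve_version(versions)
latest_version = resolve_version(versions, "latest")
pinned_version = resolve_version(versions, "v1")

try:
    resolve_version(versions, "v9")
    missing_raised = False
except NotFound:
    missing_raised = True
```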

Outputs

Method | Return Type | Description
push() | None | Uploads the model; raises BentoMLException on failure
pull() | StoredModel or None | The pulled model instance, or None if the model has no downloadable content (e.g., HuggingFace registry)
list() | ModelWithRepositoryListSchema | List of all models in the remote store sorted by creation date
get() | ModelSchema | The model schema from the remote store; raises NotFound if not found

Usage Examples

# Typically used through the BentoML CLI or BentoCloudClient
from bentoml._internal.cloud.model import ModelAPI
from bentoml._internal.cloud.client import RestApiClient
from _bentoml_sdk.models import BentoModel

# Instantiate (normally done internally)
client = RestApiClient(...)
model_api = ModelAPI(client=client)

# Push a model to the remote store
model = BentoModel.get("my_model:latest")
model_api.push(model, force=False, threads=10)

# Pull a model from the remote store; the result is None when the model
# has no downloadable content
pulled_model = model_api.pull("my_model:latest", force=False)

# List all remote models
all_models = model_api.list()

# Get a specific model; raises NotFound if it does not exist remotely
model_schema = model_api.get("my_model", version="v1")
