Implementation:Bentoml BentoML Cloud ModelAPI
| Knowledge Sources | |
|---|---|
| Domains | Cloud, Model Management, Remote Storage |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
The ModelAPI class provides push, pull, list, and get operations for synchronizing BentoML models between local and remote (BentoCloud) model stores.
Description
ModelAPI is a frozen attrs class that wraps a RestApiClient and exposes methods for uploading (pushing) and downloading (pulling) ML models to and from a remote BentoCloud model store. It supports three transmission strategies: direct proxy upload, presigned URL upload, and multipart upload with chunking and retry logic. The class uses a Spinner for progress reporting and a threading Lock to serialize concurrent attempts to create a model repository.
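The multipart strategy mentioned above splits a payload into parts and retries parts that fail. The following is a minimal sketch of that general technique, not BentoML's implementation: the names `split_into_chunks`, `upload_with_retry`, `upload_all`, and the `upload_part` callback are hypothetical, and treating `OSError` as the retryable error is an assumption made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

# (part_number, offset, length); a hypothetical shape chosen for this sketch
Chunk = Tuple[int, int, int]


def split_into_chunks(total_size: int, chunk_size: int) -> List[Chunk]:
    """Split total_size bytes into numbered, contiguous chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: List[Chunk] = []
    offset = 0
    part_number = 1  # part numbers start at 1 in this sketch
    while offset < total_size:
        length = min(chunk_size, total_size - offset)
        chunks.append((part_number, offset, length))
        offset += length
        part_number += 1
    return chunks


def upload_with_retry(
    upload_part: Callable[[Chunk], str], chunk: Chunk, max_attempts: int = 3
) -> str:
    """Call upload_part until it succeeds or max_attempts is exhausted."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return upload_part(chunk)
        except OSError as exc:  # assumed to be the retryable error type
            last_error = exc
    raise RuntimeError(
        f"part {chunk[0]} failed after {max_attempts} attempts"
    ) from last_error


def upload_all(
    upload_part: Callable[[Chunk], str],
    total_size: int,
    chunk_size: int,
    threads: int = 10,
) -> List[str]:
    """Upload every chunk on a thread pool and return results in part order."""
    chunks = split_into_chunks(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map preserves input order, so results line up with parts.
        return list(pool.map(lambda c: upload_with_retry(upload_part, c), chunks))
```

The `threads` parameter here mirrors the role of the `threads` argument to push(): it bounds how many parts are in flight at once.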
Key internal behaviors:
- Push: Creates a tar archive of the local model, determines the transmission strategy (proxy, presigned URL, or multipart), uploads the archive with progress tracking, and finalizes the upload status.
- Pull: Resolves the model version (including "latest"), downloads the tar archive via proxy or presigned URL, extracts it into a temporary directory, and saves it to the local model store.
- List: Retrieves all models from the remote store sorted by creation date.
- Get: Fetches a specific model by name and version from the remote store.
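The push and pull behaviors above both revolve around a tar archive: push packs the local model directory, and pull extracts the downloaded archive into a temporary directory. A minimal sketch of that round trip using only the standard library follows. The helper names `pack_directory`, `unpack_archive`, and `round_trip` are hypothetical, and the sketch leaves out the network transfer and the final save into the model store.

```python
import io
import tarfile
import tempfile
from pathlib import Path
from typing import Dict


def pack_directory(src: Path) -> bytes:
    """Return an uncompressed, in-memory tar archive of everything under src."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:") as tar:
        # arcname="." stores paths relative to src rather than absolute paths.
        tar.add(str(src), arcname=".")
    return buffer.getvalue()


def unpack_archive(data: bytes, dest: Path) -> None:
    """Extract a tar archive held in memory into the dest directory."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:") as tar:
        tar.extractall(str(dest))


def round_trip(files: Dict[str, bytes]) -> Dict[str, bytes]:
    """Write files to a temp dir, pack them, unpack elsewhere, read them back."""
    with tempfile.TemporaryDirectory() as src_dir, \
            tempfile.TemporaryDirectory() as dest_dir:
        src = Path(src_dir)
        for name, content in files.items():
            (src / name).write_bytes(content)
        archive = pack_directory(src)
        unpack_archive(archive, Path(dest_dir))
        return {
            p.name: p.read_bytes()
            for p in Path(dest_dir).iterdir()
            if p.is_file()
        }
```

Extracting into a temporary directory first means a failed or partial download never leaves a half-written entry in the destination store.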
Usage
ModelAPI is used internally by BentoML's cloud integration layer when users push models to or pull models from BentoCloud. End users do not typically instantiate it directly; it is invoked through the BentoCloud client, CLI, or SDK commands.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/cloud/model.py
- Lines: 1-541
Signature
@attrs.frozen
class ModelAPI:
    _client: RestApiClient
    spinner: Spinner
    _lock: Lock

    def push(self, model: Model[t.Any], *, force: bool = False, threads: int = 10) -> None: ...
    def pull(self, tag: str | Tag, *, force: bool = False,
             model_store: ModelStore = ..., query: str | None = None) -> StoredModel | None: ...
    def list(self) -> ModelWithRepositoryListSchema: ...
    def get(self, name: str, version: str | None = None) -> ModelSchema: ...
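The `_lock` field in the signature guards model repository creation when several threads push at once. The sketch below illustrates the general check-then-create pattern such a lock protects. `RepositoryRegistry`, `get_or_create`, and `create_concurrently` are hypothetical names for illustration, and the in-memory dictionary stands in for the remote store.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


class RepositoryRegistry:
    """Hypothetical stand-in for a store that must not create duplicates."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._repositories: Dict[str, dict] = {}
        self.create_calls = 0  # counts how often the "create" branch ran

    def get_or_create(self, name: str) -> dict:
        # Hold the lock across the check and the create so two threads
        # cannot both observe "missing" and create the same repository.
        with self._lock:
            repo = self._repositories.get(name)
            if repo is None:
                self.create_calls += 1
                repo = {"name": name}
                self._repositories[name] = repo
            return repo


def create_concurrently(
    registry: RepositoryRegistry, name: str, callers: int
) -> List[dict]:
    """Ask for the same repository from several threads at once."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(lambda _: registry.get_or_create(name), range(callers)))
```

Without the lock, the check and the create are separate steps, so concurrent callers could each see the repository as missing and issue duplicate create requests.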
Import
from bentoml._internal.cloud.model import ModelAPI
I/O Contract
Inputs
push()
| Name | Type | Required | Description |
|---|---|---|---|
| model | Model[Any] | Yes | The BentoML model instance to push to the remote store |
| force | bool | No | Whether to force push even if the model already exists remotely (default: False) |
| threads | int | No | Number of threads for multipart upload (default: 10) |
pull()
| Name | Type | Required | Description |
|---|---|---|---|
| tag | str or Tag | Yes | The tag of the model to pull from the remote store |
| force | bool | No | Whether to force pull even if the model exists locally (default: False) |
| model_store | ModelStore | No | The local model store to save to (injected via DI) |
| query | str or None | No | Query string for filtering the latest model version |
get()
| Name | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | The name of the model repository |
| version | str or None | No | The specific version to retrieve; if None or "latest", fetches the latest |
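The tables above describe two related conventions: a tag such as "my_model:latest" names a repository and a version, and a version of None or "latest" means the newest one. A small sketch of that convention follows. `split_tag` and `wants_latest` are hypothetical helpers written for illustration; they are not BentoML's `Tag` class or its parsing rules.

```python
from typing import Optional, Tuple


def split_tag(tag: str) -> Tuple[str, Optional[str]]:
    """Split "name:version" into (name, version); version is None if absent."""
    name, sep, version = tag.partition(":")
    if not name:
        raise ValueError(f"tag {tag!r} has no model name")
    # A bare name or a trailing colon both count as "no version given".
    return name, (version if sep and version else None)


def wants_latest(version: Optional[str]) -> bool:
    """Return True when the caller asked for the newest version."""
    return version is None or version == "latest"
```

With these helpers, `split_tag("my_model:v1")` yields a concrete version to fetch, while `split_tag("my_model")` and `split_tag("my_model:latest")` both lead to a latest-version lookup.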
Outputs
| Method | Return Type | Description |
|---|---|---|
| push() | None | Uploads the model; raises BentoMLException on failure |
| pull() | StoredModel or None | The pulled model instance, or None if the model has no downloadable content (e.g., HuggingFace registry) |
| list() | ModelWithRepositoryListSchema | List of all models in the remote store sorted by creation date |
| get() | ModelSchema | The model schema from the remote store; raises NotFound if not found |
Usage Examples
# Typically used through the BentoML CLI or BentoCloudClient
from bentoml._internal.cloud.client import RestApiClient
from bentoml._internal.cloud.model import ModelAPI
from _bentoml_sdk.models import BentoModel

# Instantiate (normally done internally)
client = RestApiClient(...)
model_api = ModelAPI(client=client)

# Push a model to the remote store
model = BentoModel.get("my_model:latest")
model_api.push(model, force=False, threads=10)

# Pull a model from the remote store
pulled_model = model_api.pull("my_model:latest", force=False)

# List all remote models
all_models = model_api.list()

# Get a specific model
model_schema = model_api.get("my_model", version="v1")