Implementation:Bentoml BentoML Cloud RestClient
| Knowledge Sources | |
|---|---|
| Domains | Cloud, REST API, HTTP Client |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
Implements the BentoCloud REST API client classes (RestApiClient, RestApiClientV1, RestApiClientV2) that provide typed methods for all BentoCloud API endpoints including bento management, model management, deployments, clusters, secrets, and API tokens.
Description
This module is the HTTP transport layer for all BentoCloud interactions. It provides a comprehensive REST API client structured as follows:
- BaseRestApiClient: Base class providing shared HTTP utilities:
  - _is_not_found(resp): Checks for a 404 status or the legacy 400 response containing "record not found" text.
  - _check_resp(resp): Validates the response status. Raises CloudRESTApiClientError for 5xx errors (with trace ID support for debugging) and for non-200 responses.
- RestApiClientV1: Implements the v1 BentoCloud API endpoints (approximately 50 methods):
  - User/Org: get_current_user(), get_current_organization()
  - Bento repositories: get_bento_repository(), create_bento_repository(), get_bento_repositories_list()
  - Bento CRUD: get_bento(), list_bentos(), create_bento(), update_bento(), get_bentos_list()
  - Bento upload/download: presign_bento_upload_url(), presign_bento_download_url(), start_upload_bento(), finish_upload_bento(), upload_bento(), download_bento()
  - Bento multipart upload: start_bento_multipart_upload(), presign_bento_multipart_upload_url(), complete_bento_multipart_upload()
  - Model repositories: get_model_repository(), create_model_repository()
  - Model CRUD: get_model(), create_model(), get_latest_model(), get_models_list()
  - Model upload/download: Same pattern as bento (presign, start, finish, upload, download, multipart)
  - Deployments (v1): create_deployment(), get_deployment(), update_deployment(), terminate_deployment(), delete_deployment(), get_cluster_deployment_list(), get_organization_deployment_list()
  - Clusters: get_cluster_list(), get_cluster()
  - Secrets: list_secrets(), create_secret(), get_secret(), delete_secret(), update_secret()
  - API tokens: list_api_tokens(), create_api_token(), get_api_token(), delete_api_token()
- RestApiClientV2: Implements the v2 BentoCloud API endpoints:
  - Deployments (v2): create_deployment(), update_deployment(), void_update_deployment(), get_deployment(), list_deployment(), terminate_deployment(), delete_deployment()
  - Instance types: list_instance_types()
  - Pod management: get_deployment_image_builder_pod(), list_deployment_pods(); both use WebSocket connections to receive pod information.
  - Log streaming: tail_logs(), which opens a WebSocket connection for real-time log streaming, with a heartbeat thread and stop event support.
  - Deployment files: upload_files(), delete_files(), list_files()
- RestApiClient: The top-level client that composes RestApiClientV1 and RestApiClientV2. Its constructor creates an httpx.Client session with:
  - Base URL from the BentoCloud endpoint
  - X-YATAI-API-TOKEN header for authentication
  - X-Bentoml-Version header for version identification
  - Configurable timeout (default: 60 seconds)
All methods use schema_from_object and schema_to_object utility functions for serialization/deserialization of typed schema objects.
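The response checks described for BaseRestApiClient can be illustrated with a minimal sketch. This is not the module's actual code: FakeResponse is a hypothetical stand-in for httpx.Response, the exception class is redefined locally, and the "x-trace-id" header name is an assumption made only for illustration.

```python
from dataclasses import dataclass, field


class CloudRESTApiClientError(Exception):
    """Local stand-in for the BentoML exception of the same name."""


@dataclass
class FakeResponse:
    """Hypothetical substitute for httpx.Response with only the fields used here."""
    status_code: int
    text: str = ""
    headers: dict = field(default_factory=dict)


def is_not_found(resp: FakeResponse) -> bool:
    # A plain 404, or the legacy 400 whose body mentions "record not found".
    return resp.status_code == 404 or (
        resp.status_code == 400 and "record not found" in resp.text
    )


def check_resp(resp: FakeResponse) -> None:
    if resp.status_code >= 500:
        # Assumed header name; the real client may expose the trace ID differently.
        trace_id = resp.headers.get("x-trace-id", "unknown")
        raise CloudRESTApiClientError(
            f"server error {resp.status_code} (trace id: {trace_id})"
        )
    if resp.status_code != 200:
        raise CloudRESTApiClientError(
            f"request failed with status {resp.status_code}: {resp.text}"
        )


def fetch_or_none(resp: FakeResponse):
    """Typical getter pattern: map 'not found' to None, otherwise validate."""
    if is_not_found(resp):
        return None
    check_resp(resp)
    return resp.text
```

Checking for "not found" before validating the status would explain why getters such as get_current_user() and get_bento_repository() are typed as returning a schema or None instead of raising.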
Usage
The RestApiClient is instantiated by BentoCloudClient and used throughout the BentoML cloud operations layer. It is the underlying HTTP transport for BentoAPI, ModelAPI, and deployment management operations.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/cloud/client.py
- Lines: 1-876
Signature
class BaseRestApiClient:
    def __init__(self, session: httpx.Client) -> None: ...

    @staticmethod
    def _is_not_found(resp: httpx.Response) -> bool: ...

    @staticmethod
    def _check_resp(resp: httpx.Response) -> None: ...

class RestApiClientV1(BaseRestApiClient):
    def get_current_user(self) -> UserSchema | None: ...
    def get_bento_repository(self, bento_repository_name: str) -> BentoRepositorySchema | None: ...
    def create_bento(self, bento_repository_name: str, req: CreateBentoSchema) -> BentoSchema: ...
    def upload_bento(self, bento_repository_name: str, version: str, data: t.IO[bytes]) -> None: ...
    def download_bento(self, bento_repository_name: str, version: str) -> t.Generator[httpx.Response, None, None]: ...
    # ... (approximately 50 methods total)

class RestApiClientV2(BaseRestApiClient):
    def create_deployment(self, create_schema: CreateDeploymentSchemaV2, cluster: str | None = None) -> DeploymentFullSchemaV2: ...
    def tail_logs(self, *, cluster_name: str, namespace: str, pod_name: str, container_name: str = "main", stop_event: threading.Event) -> t.Generator[str, None, None]: ...
    def list_deployment_pods(self, name: str, cluster: str | None = None) -> list[KubePodSchema]: ...
    # ... (approximately 15 methods total)

class RestApiClient:
    def __init__(self, endpoint: str, api_token: str, timeout: int = 60) -> None: ...
    v1: RestApiClientV1
    v2: RestApiClientV2
Import
from bentoml._internal.cloud.client import RestApiClient
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| endpoint | str | Yes | BentoCloud API base URL (e.g., "https://cloud.bentoml.com") |
| api_token | str | Yes | BentoCloud API authentication token |
| timeout | int | No (default: 60) | HTTP request timeout in seconds |
| bento_repository_name | str | Yes (for bento ops) | Name of the bento repository |
| version | str | Yes (for versioned ops) | Bento or model version string |
| data | IO[bytes] | Yes (for upload) | Binary data stream for file upload |
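To show how the constructor inputs could map onto the HTTP session, the sketch below builds the keyword arguments that one might pass to httpx.Client. The helper build_session_kwargs and the bentoml_version parameter are hypothetical names used for illustration; only the header names, base URL, and default timeout come from the description above.

```python
def build_session_kwargs(
    endpoint: str,
    api_token: str,
    timeout: int = 60,
    bentoml_version: str = "0.0.0",  # placeholder; the real client reports its own version
) -> dict:
    """Hypothetical helper: keyword arguments for an httpx.Client-style session."""
    return {
        "base_url": endpoint,
        "headers": {
            "X-YATAI-API-TOKEN": api_token,        # authentication
            "X-Bentoml-Version": bentoml_version,  # version identification
        },
        "timeout": timeout,
    }


kwargs = build_session_kwargs(
    "https://cloud.bentoml.com", "your-api-token", timeout=120
)
# With httpx installed, the session could then be created as httpx.Client(**kwargs).
```

Setting the headers once on the session means that every v1 and v2 call sharing it is authenticated without repeating the token per request.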
Outputs
| Name | Type | Description |
|---|---|---|
| Schema objects | Various *Schema types | Typed response objects (BentoSchema, ModelSchema, DeploymentFullSchema, etc.) |
| None | None | For void operations (upload, delete) |
| httpx.Response | streaming response | For download operations (via context manager) |
| Generator[str] | log lines | For tail_logs WebSocket streaming |
Usage Examples
import threading

from bentoml._internal.cloud.client import RestApiClient

# Create client
client = RestApiClient(
    endpoint="https://cloud.bentoml.com",
    api_token="your-api-token",
    timeout=120,
)

# Get current user
user = client.v1.get_current_user()

# List bentos
bentos = client.v1.get_bentos_list()

# Get a specific bento
bento = client.v1.get_bento("my_service", "v1")

# Create a deployment (v2 API)
# create_schema is a CreateDeploymentSchemaV2 instance built elsewhere
deployment = client.v2.create_deployment(create_schema, cluster="my-cluster")

# Tail deployment logs; set the event from another thread to stop streaming
stop = threading.Event()
for line in client.v2.tail_logs(
    cluster_name="default",
    namespace="production",
    pod_name="my-pod-xyz",
    stop_event=stop,
):
    print(line)
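The stop event and heartbeat thread mentioned for tail_logs() can be sketched without a network connection. The example below is an illustration of the general pattern, not the BentoML implementation: tail_lines is a hypothetical generator, and a queue.Queue stands in for the WebSocket.

```python
import queue
import threading
import typing as t


def tail_lines(
    source: "queue.Queue[str]",
    stop_event: threading.Event,
    heartbeat_interval: float = 0.05,
) -> t.Generator[str, None, None]:
    """Yield lines from `source` until `stop_event` is set."""
    beats: list = []

    def heartbeat() -> None:
        # Event.wait returns False on timeout, so this loops until stopped.
        while not stop_event.wait(heartbeat_interval):
            beats.append("ping")  # a real client would send a keep-alive frame

    worker = threading.Thread(target=heartbeat, daemon=True)
    worker.start()
    try:
        while not stop_event.is_set():
            try:
                line = source.get(timeout=0.01)
            except queue.Empty:
                continue  # nothing to read yet; re-check the stop event
            yield line
    finally:
        # Ensure the heartbeat thread exits even if the consumer stops early.
        stop_event.set()
        worker.join()


# Usage: the consumer ends the stream by setting the event.
lines: "queue.Queue[str]" = queue.Queue()
for text in ["starting", "ready", "serving"]:
    lines.put(text)

stop = threading.Event()
received = []
for line in tail_lines(lines, stop):
    received.append(line)
    if len(received) == 3:
        stop.set()
```

Polling with a short timeout keeps the loop responsive to the stop event even when no log lines arrive, which is one plausible reason for passing an event instead of relying on the consumer closing the generator.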