Implementation:Bentoml BentoML Cloud RestClient

Knowledge Sources
Domains: Cloud, REST API, HTTP Client
Last Updated: 2026-02-13 15:00 GMT

Overview

Implements the BentoCloud REST API client classes (RestApiClient, RestApiClientV1, RestApiClientV2) that provide typed methods for all BentoCloud API endpoints including bento management, model management, deployments, clusters, secrets, and API tokens.

Description

This module is the HTTP transport layer for all BentoCloud interactions. It provides a comprehensive REST API client structured as follows:

  1. BaseRestApiClient: Base class providing shared HTTP utilities:
    • _is_not_found(resp): Checks for 404 status or legacy 400-with-"record not found" text.
    • _check_resp(resp): Validates the response status. Raises CloudRESTApiClientError on 5xx errors (with trace ID support for debugging) and on any other non-200 response.
  2. RestApiClientV1: Implements the v1 BentoCloud API endpoints (approximately 50 methods):
    • User/Org: get_current_user(), get_current_organization()
    • Bento repositories: get_bento_repository(), create_bento_repository(), get_bento_repositories_list()
    • Bento CRUD: get_bento(), list_bentos(), create_bento(), update_bento(), get_bentos_list()
    • Bento upload/download: presign_bento_upload_url(), presign_bento_download_url(), start_upload_bento(), finish_upload_bento(), upload_bento(), download_bento()
    • Bento multipart upload: start_bento_multipart_upload(), presign_bento_multipart_upload_url(), complete_bento_multipart_upload()
    • Model repositories: get_model_repository(), create_model_repository()
    • Model CRUD: get_model(), create_model(), get_latest_model(), get_models_list()
    • Model upload/download: Same pattern as bento (presign, start, finish, upload, download, multipart)
    • Deployments (v1): create_deployment(), get_deployment(), update_deployment(), terminate_deployment(), delete_deployment(), get_cluster_deployment_list(), get_organization_deployment_list()
    • Clusters: get_cluster_list(), get_cluster()
    • Secrets: list_secrets(), create_secret(), get_secret(), delete_secret(), update_secret()
    • API tokens: list_api_tokens(), create_api_token(), get_api_token(), delete_api_token()
  3. RestApiClientV2: Implements the v2 BentoCloud API endpoints:
    • Deployments (v2): create_deployment(), update_deployment(), void_update_deployment(), get_deployment(), list_deployment(), terminate_deployment(), delete_deployment()
    • Instance types: list_instance_types()
    • Pod management: get_deployment_image_builder_pod(), list_deployment_pods() -- both use WebSocket connections to receive pod information.
    • Log streaming: tail_logs() -- opens a WebSocket connection for real-time log streaming with heartbeat thread and stop event support.
    • Deployment files: upload_files(), delete_files(), list_files()
  4. RestApiClient: The top-level client that composes RestApiClientV1 and RestApiClientV2. Its constructor creates an httpx.Client session with:
    • Base URL from the BentoCloud endpoint
    • X-YATAI-API-TOKEN header for authentication
    • X-Bentoml-Version header for version identification
    • Configurable timeout (default: 60 seconds)
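
The composition described in the last item can be pictured with a small standard-library sketch. It is not the BentoML source: SessionConfig, V1Sketch, V2Sketch, ClientSketch, and the version argument are hypothetical stand-ins, and the real client passes the equivalent base URL, headers, and timeout to httpx.Client.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SessionConfig:
    """Hypothetical stand-in for the arguments passed to httpx.Client."""

    base_url: str
    headers: dict[str, str] = field(default_factory=dict)
    timeout: float = 60.0


class V1Sketch:
    """Stand-in for RestApiClientV1: holds the shared session."""

    def __init__(self, session: SessionConfig) -> None:
        self.session = session


class V2Sketch:
    """Stand-in for RestApiClientV2: holds the same shared session."""

    def __init__(self, session: SessionConfig) -> None:
        self.session = session


class ClientSketch:
    """Top-level client that builds one session and shares it with v1 and v2."""

    def __init__(
        self,
        endpoint: str,
        api_token: str,
        timeout: int = 60,
        version: str = "0.0.0",  # assumption: the real value is the installed BentoML version
    ) -> None:
        self.session = SessionConfig(
            base_url=endpoint,
            headers={
                "X-YATAI-API-TOKEN": api_token,  # authentication header
                "X-Bentoml-Version": version,    # version identification header
            },
            timeout=timeout,
        )
        self.v1 = V1Sketch(self.session)
        self.v2 = V2Sketch(self.session)
```

Sharing one session between both versioned clients means the authentication headers and the timeout are configured in a single place.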

All methods use schema_from_object and schema_to_object utility functions for serialization/deserialization of typed schema objects.
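
To illustrate the general idea of converting between typed schema objects and plain request/response payloads, here is a minimal round-trip sketch using dataclasses. The names UserSketch, to_plain_object, and from_plain_object are hypothetical; the actual schema_to_object and schema_from_object helpers may work differently.

```python
from __future__ import annotations

import dataclasses
from dataclasses import asdict, dataclass
from typing import Any, TypeVar

T = TypeVar("T")


@dataclass
class UserSketch:
    """Hypothetical schema type used only for this illustration."""

    name: str
    email: str


def to_plain_object(obj: Any) -> dict[str, Any]:
    """Serialize a typed schema object into a plain dict (e.g. a JSON body)."""
    return asdict(obj)


def from_plain_object(data: dict[str, Any], cls: type[T]) -> T:
    """Build a typed schema object from a plain dict, ignoring unknown keys."""
    known = {f.name for f in dataclasses.fields(cls)}
    return cls(**{key: value for key, value in data.items() if key in known})
```

A round trip such as from_plain_object(to_plain_object(user), UserSketch) yields an object equal to the original, which is the property a typed client relies on.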

Usage

The RestApiClient is instantiated by BentoCloudClient and used throughout the BentoML cloud operations layer. It is the underlying HTTP transport for BentoAPI, ModelAPI, and deployment management operations.

Code Reference

Source Location

Signature

class BaseRestApiClient:
    def __init__(self, session: httpx.Client) -> None: ...
    @staticmethod
    def _is_not_found(resp: httpx.Response) -> bool: ...
    @staticmethod
    def _check_resp(resp: httpx.Response) -> None: ...

class RestApiClientV1(BaseRestApiClient):
    def get_current_user(self) -> UserSchema | None: ...
    def get_bento_repository(self, bento_repository_name: str) -> BentoRepositorySchema | None: ...
    def create_bento(self, bento_repository_name: str, req: CreateBentoSchema) -> BentoSchema: ...
    def upload_bento(self, bento_repository_name: str, version: str, data: t.IO[bytes]) -> None: ...
    def download_bento(self, bento_repository_name: str, version: str) -> t.Generator[httpx.Response, None, None]: ...
    # ... (approximately 50 methods total)

class RestApiClientV2(BaseRestApiClient):
    def create_deployment(self, create_schema: CreateDeploymentSchemaV2, cluster: str | None = None) -> DeploymentFullSchemaV2: ...
    def tail_logs(self, *, cluster_name: str, namespace: str, pod_name: str, container_name: str = "main", stop_event: threading.Event) -> t.Generator[str, None, None]: ...
    def list_deployment_pods(self, name: str, cluster: str | None = None) -> list[KubePodSchema]: ...
    # ... (approximately 15 methods total)

class RestApiClient:
    def __init__(self, endpoint: str, api_token: str, timeout: int = 60) -> None: ...
    v1: RestApiClientV1
    v2: RestApiClientV2
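
The two BaseRestApiClient helpers and the "| None" return types can be read together: a lookup returns None when the resource is missing and raises for other failures. The sketch below mirrors that behavior as described on this page. FakeResponse and ClientErrorSketch are hypothetical stand-ins for httpx.Response and CloudRESTApiClientError, and the trace ID handling is omitted because its details are not given here.

```python
from __future__ import annotations

from dataclasses import dataclass


class ClientErrorSketch(Exception):
    """Stand-in for CloudRESTApiClientError."""


@dataclass
class FakeResponse:
    """Stand-in exposing the two httpx.Response attributes used below."""

    status_code: int
    text: str = ""


def is_not_found(resp: FakeResponse) -> bool:
    # 404, or the legacy 400 whose body contains "record not found"
    if resp.status_code == 404:
        return True
    return resp.status_code == 400 and "record not found" in resp.text


def check_resp(resp: FakeResponse) -> None:
    # Server errors and any other non-200 status both raise
    if resp.status_code >= 500:
        raise ClientErrorSketch(f"server error {resp.status_code}: {resp.text}")
    if resp.status_code != 200:
        raise ClientErrorSketch(
            f"request failed with status code {resp.status_code}: {resp.text}"
        )


def get_or_none(resp: FakeResponse) -> str | None:
    """Typical getter shape: None when missing, raise on failure, else the body."""
    if is_not_found(resp):
        return None
    check_resp(resp)
    return resp.text
```

Checking for "not found" before validating the status is what lets getters signal a missing resource with None instead of an exception.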

Import

from bentoml._internal.cloud.client import RestApiClient

I/O Contract

Inputs

Name | Type | Required | Description
--- | --- | --- | ---
endpoint | str | Yes | BentoCloud API base URL (e.g., "https://cloud.bentoml.com")
api_token | str | Yes | BentoCloud API authentication token
timeout | int | No (default: 60) | HTTP request timeout in seconds
bento_repository_name | str | Yes (for bento ops) | Name of the bento repository
version | str | Yes (for versioned ops) | Bento or model version string
data | IO[bytes] | Yes (for upload) | Binary data stream for file upload
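
For the multipart upload methods, the caller has to split the binary stream into parts before each part is sent to its presigned URL. The following is a minimal sketch of that splitting step only. The function name iter_parts, the part size, and the 1-based part numbering are assumptions for illustration, not taken from the BentoML source.

```python
from __future__ import annotations

import io
import typing as t


def iter_parts(data: t.IO[bytes], part_size: int) -> t.Iterator[tuple[int, bytes]]:
    """Yield (part_number, chunk) pairs read sequentially from a binary stream."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    part_number = 1  # assumption: parts are numbered from 1
    while True:
        chunk = data.read(part_size)
        if not chunk:  # an empty read means the stream is exhausted
            break
        yield part_number, chunk
        part_number += 1


# Example: a 7-byte stream split into parts of at most 3 bytes
parts = list(iter_parts(io.BytesIO(b"abcdefg"), part_size=3))
```

Reading lazily keeps memory use bounded by the part size, which matters for large bento or model archives.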

Outputs

Name | Type | Description
--- | --- | ---
Schema objects | Various *Schema types | Typed response objects (BentoSchema, ModelSchema, DeploymentFullSchema, etc.)
None | None | For void operations (upload, delete)
httpx.Response | streaming response | For download operations (via context manager)
Generator[str] | log lines | For tail_logs WebSocket streaming
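
The download operations hand back a streaming response through a context manager, so the caller consumes the body in chunks and the response is released on exit. The sketch below shows that consumption pattern with a stand-in object. FakeStream, download_sketch, and save_stream are hypothetical names; with the real client the yielded object would be an httpx.Response, which also offers iter_bytes().

```python
from __future__ import annotations

import contextlib
import io
import typing as t


class FakeStream:
    """Stand-in for a streaming httpx.Response."""

    def __init__(self, payload: bytes, chunk_size: int = 4) -> None:
        self._payload = payload
        self._chunk_size = chunk_size
        self.closed = False

    def iter_bytes(self) -> t.Iterator[bytes]:
        for start in range(0, len(self._payload), self._chunk_size):
            yield self._payload[start : start + self._chunk_size]

    def close(self) -> None:
        self.closed = True


@contextlib.contextmanager
def download_sketch(payload: bytes) -> t.Iterator[FakeStream]:
    """Yield a streaming response and always close it afterwards."""
    resp = FakeStream(payload)
    try:
        yield resp
    finally:
        resp.close()


def save_stream(resp: FakeStream, out: t.IO[bytes]) -> int:
    """Write the body chunk by chunk and return the number of bytes written."""
    total = 0
    for chunk in resp.iter_bytes():
        out.write(chunk)
        total += len(chunk)
    return total


buffer = io.BytesIO()
with download_sketch(b"bento-archive-bytes") as response:
    written = save_stream(response, buffer)
```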

Usage Examples

import threading

from bentoml._internal.cloud.client import RestApiClient

# Create the client; timeout is in seconds (default: 60)
client = RestApiClient(
    endpoint="https://cloud.bentoml.com",
    api_token="your-api-token",
    timeout=120,
)

# Get the current user (the result may be None)
user = client.v1.get_current_user()

# List bentos
bentos = client.v1.get_bentos_list()

# Get a specific bento by repository name and version
bento = client.v1.get_bento("my_service", "v1")

# Create a deployment (v2 API).
# create_schema must be a CreateDeploymentSchemaV2 instance built beforehand.
deployment = client.v2.create_deployment(create_schema, cluster="my-cluster")

# Tail deployment logs; the stop event lets another thread end the stream
stop = threading.Event()
for line in client.v2.tail_logs(
    cluster_name="default",
    namespace="production",
    pod_name="my-pod-xyz",
    stop_event=stop,
):
    print(line)
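
The interplay between a log generator, a heartbeat thread, and a stop event can be sketched without a network connection. In this illustration a queue.Queue stands in for the WebSocket, and tail_lines, the heartbeat callback, and the interval values are all hypothetical; it shows the general pattern rather than the actual tail_logs implementation.

```python
from __future__ import annotations

import queue
import threading
import typing as t


def tail_lines(
    source: "queue.Queue[str | None]",
    stop_event: threading.Event,
    heartbeat: t.Callable[[], None],
    heartbeat_interval: float = 0.05,
    poll_interval: float = 0.01,
) -> t.Iterator[str]:
    """Yield lines from source until it ends (None) or stop_event is set."""
    done = threading.Event()  # internal flag that ends the heartbeat thread

    def beat() -> None:
        # Send a heartbeat every interval until the stream ends or is stopped
        while not done.wait(heartbeat_interval):
            if stop_event.is_set():
                return
            heartbeat()

    thread = threading.Thread(target=beat, daemon=True)
    thread.start()
    try:
        while not stop_event.is_set():
            try:
                line = source.get(timeout=poll_interval)
            except queue.Empty:
                continue  # nothing yet; re-check the stop event
            if line is None:  # sentinel: the source closed the stream
                break
            yield line
    finally:
        done.set()
        thread.join()
```

Polling with a short timeout, rather than blocking indefinitely on the source, is what allows the generator to notice the stop event promptly.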
