Implementation:Bentoml BentoML Client Base
| Knowledge Sources | |
|---|---|
| Domains | Client, Networking, Abstract Base Classes |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
Defines the abstract base classes for BentoML service clients, providing the Client (deprecated), SyncClient, and AsyncClient ABCs that implement auto-discovery of API endpoints and support both HTTP and gRPC transports.
Description
This module establishes the client abstraction layer for communicating with BentoML services. It provides three client base classes:
- Client (deprecated): The original client class that wraps both sync and async clients. It is deprecated in favor of SyncClient and AsyncClient and will be removed in BentoML 2.0. Key features:
  - Accepts a Service object and server URL in its constructor.
  - Dynamically creates methods matching each API endpoint name via functools.partial.
  - Provides both call() (sync) and async_call() (async) dispatch methods.
  - Implements the context manager protocol (__enter__/__exit__ and __aenter__/__aexit__).
  - Static factory method from_url() with overloads for "http", "grpc", and "auto" transport kinds.
- AsyncClient: The abstract async client base class for making asynchronous calls.
  - Constructor discovers all API endpoints from the Service object and creates bound partial methods.
  - call(bentoml_api_name, inp, **kwargs): Public method to invoke a named API.
  - _call(inp, *, _bentoml_api, **kwargs): Abstract method that subclasses (HTTP, gRPC) must implement.
  - wait_until_server_ready(host, port, timeout): Static method that tries HTTP first and falls back to gRPC on BadStatusLine.
  - from_url(server_url, *, kind): Class factory with auto-detection that tries HTTP first and falls back to gRPC.
- SyncClient: The abstract synchronous client base class, mirroring AsyncClient with blocking methods.
  - Same endpoint discovery and method creation pattern.
  - call() and abstract _call() for synchronous invocation.
  - wait_until_server_ready() with HTTP-first, gRPC-fallback logic.
  - from_url() with auto-detection.
All three classes raise BentoMLException if no APIs are found during construction or if an invalid transport kind is specified.
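The endpoint discovery and method binding described above can be illustrated with a small, self-contained sketch. It does not import BentoML: SketchSyncClient, EchoClient, NoApisError, and the _api_name keyword are hypothetical stand-ins for the real base class, a concrete transport subclass, BentoMLException, and the _bentoml_api argument, and the real constructor takes a Service object rather than a list of names.

```python
from __future__ import annotations

import functools
from abc import ABC, abstractmethod
from typing import Any


class NoApisError(Exception):
    """Hypothetical stand-in for BentoMLException."""


class SketchSyncClient(ABC):
    """Illustrative base class; not the BentoML implementation."""

    def __init__(self, api_names: list[str], server_url: str) -> None:
        # Assumption: API names arrive as a plain list instead of a Service.
        if not api_names:
            raise NoApisError("No APIs were found when constructing the client.")
        self.server_url = server_url
        self.endpoints = list(api_names)
        for name in self.endpoints:
            # Bind the API name so that client.<name>(inp) forwards to _call.
            setattr(self, name, functools.partial(self._call, _api_name=name))

    def call(self, api_name: str, inp: Any = None, **kwargs: Any) -> Any:
        # Public entry point: dispatch by name to the transport-specific hook.
        return self._call(inp, _api_name=api_name, **kwargs)

    @abstractmethod
    def _call(self, inp: Any = None, *, _api_name: str, **kwargs: Any) -> Any:
        """Transport-specific subclasses implement the actual request."""


class EchoClient(SketchSyncClient):
    """Toy 'transport' that returns its arguments instead of sending them."""

    def _call(self, inp: Any = None, *, _api_name: str, **kwargs: Any) -> Any:
        return {"api": _api_name, "input": inp}


client = EchoClient(["predict", "classify"], "http://localhost:3000")
via_attribute = client.predict([1, 2, 3])      # dynamically created method
via_call = client.call("predict", [1, 2, 3])   # explicit dispatch by name
```

Both invocation styles reach the same _call hook, which is why a transport subclass only needs to implement that one method.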
Usage
These base classes are not instantiated directly. Concrete subclasses (SyncHTTPClient, AsyncHTTPClient, SyncGrpcClient, AsyncGrpcClient) implement the abstract _call method for their respective transport. Users typically create clients via SyncClient.from_url() or AsyncClient.from_url(), which auto-detect the transport protocol and return an instance of the matching concrete subclass.
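As a rough sketch of how a from_url-style factory might select a transport, the following self-contained example dispatches on kind and falls back from HTTP to gRPC in "auto" mode. The connect function, the factory callables, and TransportMismatch are hypothetical; ValueError stands in for the BentoMLException that the module is described as raising for an invalid kind.

```python
from __future__ import annotations

from typing import Callable


class TransportMismatch(Exception):
    """Hypothetical signal that the server does not speak the probed protocol."""


def connect(
    server_url: str,
    kind: str | None,
    http_factory: Callable[[str], str],
    grpc_factory: Callable[[str], str],
) -> str:
    """Return a client built by the factory that matches `kind`."""
    if kind is None or kind == "auto":
        try:
            # Auto mode: try HTTP first.
            return http_factory(server_url)
        except TransportMismatch:
            # The HTTP attempt failed at the protocol level: fall back to gRPC.
            return grpc_factory(server_url)
    if kind == "http":
        return http_factory(server_url)
    if kind == "grpc":
        return grpc_factory(server_url)
    # Stand-in for the BentoMLException raised on an invalid kind.
    raise ValueError(f"Invalid client kind: {kind!r}")


def fake_http(url: str) -> str:
    # Simulates a server that only speaks gRPC.
    raise TransportMismatch(url)


def fake_grpc(url: str) -> str:
    return f"grpc client for {url}"


auto_result = connect("localhost:3000", "auto", fake_http, fake_grpc)
```

Passing an explicit kind skips the probe entirely, so a mismatch surfaces as an error instead of a silent fallback.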
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/client/__init__.py
- Lines: 1-371
Signature
```python
class Client(ABC):
    server_url: str
    _svc: Service
    endpoints: list[str]

    def __init__(self, svc: Service, server_url: str): ...
    def call(self, bentoml_api_name: str, inp: t.Any = None, **kwargs: t.Any) -> t.Any: ...
    async def async_call(self, bentoml_api_name: str, inp: t.Any = None, **kwargs: t.Any) -> t.Any: ...
    @staticmethod
    def from_url(server_url: str, *, kind: t.Literal["auto", "http", "grpc"] | None = None, **kwargs: t.Any) -> Client: ...

class AsyncClient(ABC):
    def __init__(self, svc: Service, server_url: str): ...
    async def call(self, bentoml_api_name: str, inp: t.Any = None, **kwargs: t.Any) -> t.Any: ...
    @abstractmethod
    async def _call(self, inp: t.Any = None, *, _bentoml_api: InferenceAPI[t.Any], **kwargs: t.Any) -> t.Any: ...
    @classmethod
    async def from_url(cls, server_url: str, *, kind: t.Literal["auto", "http", "grpc"] | None = None, **kwargs: t.Any) -> AsyncClient: ...

class SyncClient(Client):
    def __init__(self, svc: Service, server_url: str): ...
    def call(self, bentoml_api_name: str, inp: t.Any = None, **kwargs: t.Any) -> t.Any: ...
    @abstractmethod
    def _call(self, inp: t.Any = None, *, _bentoml_api: InferenceAPI[t.Any], **kwargs: t.Any) -> t.Any: ...
    @classmethod
    def from_url(cls, server_url: str, *, kind: t.Literal["auto", "http", "grpc"] | None = None, **kwargs: t.Any) -> SyncClient: ...
```
Import
```python
from bentoml._internal.client import Client, AsyncClient, SyncClient
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| svc | Service | Yes | A BentoML Service object containing API endpoint definitions |
| server_url | str | Yes | URL of the running BentoML service (e.g., "http://localhost:3000") |
| kind | Literal["auto", "http", "grpc"] or None | No | Transport protocol to use; "auto" tries HTTP first, then gRPC |
| bentoml_api_name | str | Yes (for call) | Name of the API endpoint to invoke |
| inp | Any | No | Input data to send to the API endpoint |
Outputs
| Name | Type | Description |
|---|---|---|
| result | Any | The deserialized response from the service API endpoint |
| Client instance | Client / AsyncClient / SyncClient | A connected client ready to make API calls |
Usage Examples
```python
import asyncio

from bentoml._internal.client import SyncClient, AsyncClient

# Synchronous client with auto-detection
client = SyncClient.from_url("http://localhost:3000")
result = client.call("predict", {"data": [1, 2, 3]})
client.close()

# Using context manager
with SyncClient.from_url("http://localhost:3000") as client:
    result = client.predict({"data": [1, 2, 3]})

# Async client: `await` and `async with` are only valid inside a coroutine
async def main() -> None:
    async with await AsyncClient.from_url("http://localhost:3000") as client:
        result = await client.call("predict", {"data": [1, 2, 3]})

asyncio.run(main())

# Wait for server readiness
SyncClient.wait_until_server_ready("localhost", 3000, timeout=60)
```
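The HTTP-first, gRPC-fallback readiness check can be approximated with a polling loop. This is a minimal sketch under stated assumptions, not the BentoML implementation: wait_until_ready, the injected probe callables, and the interval parameter are hypothetical, and only http.client.BadStatusLine is taken from the description of the real method.

```python
from __future__ import annotations

import time
from http.client import BadStatusLine
from typing import Callable


def wait_until_ready(
    http_probe: Callable[[], bool],
    grpc_probe: Callable[[], bool],
    timeout: float,
    interval: float = 0.01,
) -> str:
    """Poll until a probe succeeds; return the transport that answered."""
    deadline = time.monotonic() + timeout
    probe, transport = http_probe, "http"
    while time.monotonic() < deadline:
        try:
            if probe():
                return transport
        except BadStatusLine:
            # The port answered with something that is not HTTP: switch to gRPC.
            probe, transport = grpc_probe, "grpc"
            continue
        except ConnectionError:
            pass  # Server is not accepting connections yet; retry.
        time.sleep(interval)
    raise TimeoutError(f"server not ready after {timeout} seconds")


def not_http() -> bool:
    # Simulates a gRPC server replying to an HTTP request with binary data.
    raise BadStatusLine("\x00\x00")


attempts = {"grpc": 0}


def grpc_after_two_tries() -> bool:
    # Simulates a gRPC health check that succeeds on the second attempt.
    attempts["grpc"] += 1
    return attempts["grpc"] >= 2


ready_via = wait_until_ready(not_http, grpc_after_two_tries, timeout=1.0)
```

Injecting the probes keeps the sketch runnable without a server; in practice each probe would issue a real health-check request over its transport.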