Implementation:Bentoml BentoML GrpcClient

From Leeroopedia
Domains: Client, gRPC, Networking
Last Updated: 2026-02-13 15:00 GMT

Overview

Implements gRPC client classes (GrpcClient, AsyncGrpcClient, SyncGrpcClient) for communicating with BentoML services over the gRPC protocol, supporting SSL/TLS, health checks, service metadata discovery, and automatic API endpoint binding.

Description

This module provides concrete gRPC transport implementations of the abstract client base classes defined in bentoml._internal.client. It includes three client classes:

  1. GrpcClient: A deprecated wrapper that holds both a sync and an async gRPC client and delegates to them. It extends the deprecated Client base class.
  2. AsyncGrpcClient: The full-featured async gRPC client. Key capabilities:
    • Channel management: Creates either secure (aio.secure_channel) or insecure (aio.insecure_channel) gRPC channels. Supports custom channel options, interceptors, and compression settings.
    • SSL/TLS support: Accepts ClientCredentials (root certificates, private key, certificate chain) for secure connections. Credentials can be provided as file paths or raw bytes.
    • Health checking: wait_until_server_ready() performs gRPC health checks against the /grpc.health.v1.Health/Check endpoint, polling with configurable timeout and interval.
    • Service metadata discovery: from_url() connects to the server's ServiceMetadata RPC to retrieve API definitions, then constructs InferenceAPI objects from the returned IO descriptor specs. This enables dynamic client method creation without client-side service definitions.
    • RPC method caching: Uses @cached_property for _rpc_metadata and _rpc_methods, lazily building unary-unary RPC stubs with proper serializers/deserializers.
    • API invocation: _call() serializes input via the API's IO descriptor to_proto() method, sends the request, and deserializes the response via from_proto().
    • Protocol version handling: Supports multiple gRPC protocol versions. Raises an informative error, including example code, if the requested protocol version is older than v1.
  3. SyncGrpcClient: Mirrors AsyncGrpcClient with synchronous, blocking semantics. It uses grpc.Channel instead of aio.Channel and calls asyncio.run() to drive the asynchronous IO descriptor serialization from synchronous code.
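
The polling behaviour described for wait_until_server_ready() can be illustrated with a small, library-free sketch. This is not BentoML's implementation: wait_until_ready and its check argument are hypothetical names, and the real method issues a gRPC health check where this sketch calls check().

```python
import asyncio
import time
from typing import Awaitable, Callable


async def wait_until_ready(
    check: Callable[[], Awaitable[bool]],  # hypothetical probe; stands in for a gRPC health check
    timeout: float = 30,
    check_interval: float = 1,
) -> None:
    """Poll `check` until it returns True or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if await check():
                return
        except ConnectionError:
            # The server is not reachable yet: treat this as "not ready" and retry.
            pass
        # Give up if another sleep would take us past the deadline.
        if time.monotonic() + check_interval > deadline:
            raise TimeoutError(f"server not ready after {timeout} seconds")
        await asyncio.sleep(check_interval)
```

Using time.monotonic() rather than wall-clock time keeps the deadline unaffected by system clock adjustments during the wait.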

Both AsyncGrpcClient and SyncGrpcClient implement:

  • health(service_name, timeout): Performs a gRPC health check on a named service.
  • _split_channel_args(**kwargs): Separates gRPC channel kwargs (timeout, metadata, credentials, wait_for_ready, compression) from other kwargs.
  • close(): Closes the underlying gRPC channel.
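
As a rough illustration of what _split_channel_args does, the sketch below partitions keyword arguments by the names listed above. The function and constant names are assumptions made for this example, and the actual method may differ in details such as return type.

```python
from typing import Any, Dict, Tuple

# Per-call gRPC keyword names, as listed in the description above.
CHANNEL_KWARG_NAMES = ("timeout", "metadata", "credentials", "wait_for_ready", "compression")


def split_channel_args(**kwargs: Any) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (channel_kwargs, other_kwargs) without mutating the input."""
    channel_kwargs = {k: kwargs[k] for k in CHANNEL_KWARG_NAMES if k in kwargs}
    other_kwargs = {k: v for k, v in kwargs.items() if k not in CHANNEL_KWARG_NAMES}
    return channel_kwargs, other_kwargs
```

With this split, a call such as split_channel_args(timeout=5, img=b"x") routes timeout to the RPC invocation and leaves img as API input.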

Usage

gRPC clients are typically created via AsyncGrpcClient.from_url() or SyncGrpcClient.from_url(). They are also selected automatically when SyncClient.from_url() or AsyncClient.from_url() is called with an explicit kind="grpc", or when its initial HTTP attempt hits a gRPC server and fails with BadStatusLine. The bentoml[grpc] extra must be installed.
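
The selection rule can be sketched as follows. This is a simplified illustration, not the code in SyncClient.from_url(): detect_client_kind and probe_http are hypothetical names, and probe_http stands in for whatever HTTP request the real client attempts first.

```python
from http.client import BadStatusLine
from typing import Callable, Optional


def detect_client_kind(probe_http: Callable[[], None], kind: Optional[str] = None) -> str:
    """Return "grpc" or "http" following the rule described above."""
    if kind is not None:
        # An explicit choice always wins; no probing is needed.
        return kind
    try:
        probe_http()
    except BadStatusLine:
        # A gRPC server answers an HTTP/1.1 request with bytes that do not
        # form a valid status line, which http.client reports this way.
        return "grpc"
    return "http"
```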

Code Reference

Signature

class GrpcClient(Client):
    def __init__(self, svc: Service, server_url: str): ...

class AsyncGrpcClient(AsyncClient):
    def __init__(
        self,
        server_url: str,
        svc: Service,
        ssl: bool = False,
        channel_options: aio.ChannelArgumentType | None = None,
        interceptors: t.Sequence[aio.ClientInterceptor] | None = None,
        compression: grpc.Compression | None = None,
        ssl_client_credentials: ClientCredentials | None = None,
        *,
        protocol_version: str = LATEST_PROTOCOL_VERSION,
        **kwargs: t.Any,
    ): ...
    @classmethod
    async def from_url(cls, server_url: str, **kwargs: t.Any) -> AsyncGrpcClient: ...
    async def _call(self, inp: t.Any = None, *, _bentoml_api: InferenceAPI[t.Any], **attrs: t.Any) -> t.Any: ...
    async def health(self, service_name: str, *, timeout: int = 30) -> t.Any: ...
    @staticmethod
    async def wait_until_server_ready(host: str, port: int, timeout: float = 30, check_interval: int = 1, **kwargs: t.Any) -> None: ...

class SyncGrpcClient(SyncClient):
    def __init__(self, server_url: str, svc: Service, ssl: bool = False, ...): ...
    @classmethod
    def from_url(cls, server_url: str, **kwargs: t.Any) -> SyncGrpcClient: ...
    def _call(self, inp: t.Any = None, *, _bentoml_api: InferenceAPI[t.Any], **attrs: t.Any): ...
    def health(self, service_name: str, *, timeout: int = 30) -> t.Any: ...
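
SyncGrpcClient._call is a plain function even though IO descriptor serialization is asynchronous. The sketch below shows the general asyncio.run() bridging pattern under stated assumptions: EchoDescriptor and call_sync are hypothetical stand-ins, and the network round trip is replaced by echoing the request back.

```python
import asyncio
from typing import Any


class EchoDescriptor:
    """Hypothetical stand-in for an IO descriptor with async (de)serialization."""

    async def to_proto(self, obj: Any) -> bytes:
        return repr(obj).encode()

    async def from_proto(self, payload: bytes) -> str:
        return payload.decode()


def call_sync(descriptor: EchoDescriptor, inp: Any) -> str:
    # Drive the async serializer to completion from synchronous code.
    request = asyncio.run(descriptor.to_proto(inp))
    # A real client would send `request` over a blocking grpc.Channel here;
    # this sketch simply echoes it back as the response.
    response = request
    return asyncio.run(descriptor.from_proto(response))
```

Note that asyncio.run() raises RuntimeError when called from a thread that already has a running event loop, so a bridge of this kind is only suitable for purely synchronous callers.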

Import

from bentoml._internal.client.grpc import GrpcClient, AsyncGrpcClient, SyncGrpcClient

I/O Contract

Inputs

Name | Type | Required | Description
server_url | str | Yes | gRPC server address (e.g., "localhost:3000")
svc | Service | Yes | BentoML Service with API definitions (auto-discovered from server metadata)
ssl | bool | No (default: False) | Whether to use SSL/TLS for the connection
ssl_client_credentials | ClientCredentials | No | TLS credentials (root_certificates, private_key, certificate_chain)
channel_options | ChannelArgumentType | No | gRPC channel configuration options
interceptors | Sequence[ClientInterceptor] | No | gRPC client interceptors
compression | grpc.Compression | No | gRPC compression algorithm
protocol_version | str | No (default: LATEST_PROTOCOL_VERSION) | BentoML gRPC protocol version string

Outputs

Name | Type | Description
result | Any | Deserialized output from the gRPC service API call (via IO descriptor from_proto)
health response | HealthCheckResponse | gRPC health check status
client instance | AsyncGrpcClient / SyncGrpcClient | Connected gRPC client with bound API methods

Usage Examples

from bentoml._internal.client.grpc import AsyncGrpcClient, SyncGrpcClient

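# Async gRPC client (the awaits must run inside a coroutine, e.g. via asyncio.run(main()))
# `predict` and `input_data` are placeholders for an API exposed by the served Service and its input.
client = await AsyncGrpcClient.from_url("localhost:3000")
result = await client.predict(input_data)
await client.close()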

# Sync gRPC client
client = SyncGrpcClient.from_url("localhost:3000")
result = client.predict(input_data)
client.close()

# With SSL
client = await AsyncGrpcClient.from_url(
    "localhost:3000",
    ssl=True,
    ssl_client_credentials={
        "root_certificates": "/path/to/ca.pem",
        "private_key": "/path/to/key.pem",
        "certificate_chain": "/path/to/cert.pem",
    },
)

# Block until the server passes its gRPC health check (give up after 60 seconds)
await AsyncGrpcClient.wait_until_server_ready("localhost", 3000, timeout=60)
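
The SSL example passes file paths, while the description notes that credentials may also be raw bytes. A minimal sketch of how such mixed input could be normalized to bytes is shown below; load_credentials is a hypothetical helper written for this page, not a BentoML function.

```python
from typing import Dict, Union


def load_credentials(creds: Dict[str, Union[str, bytes]]) -> Dict[str, bytes]:
    """Read string values as file paths; pass bytes values through unchanged."""
    loaded: Dict[str, bytes] = {}
    for name, value in creds.items():
        if isinstance(value, str):
            # A string is interpreted as a path to a PEM file on disk.
            with open(value, "rb") as f:
                loaded[name] = f.read()
        else:
            loaded[name] = value
    return loaded
```

The resulting dictionary uses the same keys (root_certificates, private_key, certificate_chain) that grpc.ssl_channel_credentials() accepts as keyword arguments.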
