Implementation:Bentoml BentoML SyncHTTPClient For Cloud

From Leeroopedia

Overview

SyncHTTPClient For Cloud implements Principle:Bentoml_BentoML_Cloud_Endpoint_Invocation. It provides a client for invoking the endpoints of a deployed BentoML service. The client is obtained either from Deployment.get_client() or by creating a SyncHTTPClient directly from the deployment's cloud URL.

API

  • Deployment.get_client()
  • SyncHTTPClient.from_url() (cloud context)

Source

  • get_client: src/bentoml/_internal/cloud/deployment.py:L598-631
  • SyncHTTPClient: src/bentoml/_internal/client/http.py:L237-288

Import

from bentoml.client import SyncHTTPClient

Signature

# Via Deployment object
deployment.get_client() -> SyncHTTPClient

# Via direct URL
SyncHTTPClient.from_url(server_url: str) -> SyncHTTPClient

Key Difference from Local

When used in a cloud context, the SyncHTTPClient differs from local usage in these ways:

Aspect | Local | Cloud
URL | http://localhost:3000 | Deployment endpoint URL
Authentication | Not required | May include auth headers from BentoCloud context
Creation | SyncHTTPClient("http://localhost:3000") | deployment.get_client() or SyncHTTPClient.from_url(url)
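
To make the URL and authentication rows concrete, the following is a minimal sketch of the kind of HTTP request a client might build in each context. It uses only the standard library and does not send anything. The helper build_request, the bearer-token header scheme, and the token value are assumptions made for illustration; they are not taken from the BentoML source.

```python
import json
import urllib.request
from typing import Optional


def build_request(
    base_url: str,
    api_name: str,
    payload: dict,
    token: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for one service endpoint.

    Hypothetical helper: assumes each API is exposed at /<api_name>
    and accepts a JSON object whose keys are the parameter names.
    """
    url = f"{base_url.rstrip('/')}/{api_name}"
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumption: a protected cloud endpoint accepts a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Local: plain URL, no auth header.
local_req = build_request("http://localhost:3000", "generate", {"prompt": "Hi"})

# Cloud: deployment endpoint URL plus an auth header (placeholder token).
cloud_req = build_request(
    "https://my-llm-service-abc123.bentoml.com",
    "generate",
    {"prompt": "Hi"},
    token="HYPOTHETICAL_TOKEN",
)
```

In practice deployment.get_client() is described above as handling this configuration automatically, so a helper like this would only matter when calling the endpoint without the client.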

Inputs and Outputs

Inputs:

  • Deployment object - When using get_client(), the deployment's endpoint URL and authentication are handled automatically
  • Deployment endpoint URL - When using SyncHTTPClient.from_url(), the cloud endpoint URL must be provided directly

Outputs:

  • SyncHTTPClient instance with dynamically generated API methods matching the deployed service's endpoints

Usage Examples

Via Deployment Object

import bentoml

# Get an existing deployment
deployment = bentoml.deployment.get("my-llm-service")

# Get a client with auto-configured auth
client = deployment.get_client()

# Call the service endpoint
result = client.generate(prompt="Hello, world!", max_tokens=100)
print(result)

Via Direct URL

from bentoml.client import SyncHTTPClient

# Create client directly from cloud URL
client = SyncHTTPClient.from_url("https://my-llm-service-abc123.bentoml.com")

# Call the service endpoint
result = client.generate(prompt="Hello, world!", max_tokens=100)
print(result)

Within Create Flow

import bentoml

# Create a deployment
deployment = bentoml.deployment.create(
    bento="my_service:latest",
    scaling_min=1,
)

# Wait for the deployment to become ready before requesting a client.
# Assumes the installed BentoML version provides Deployment.wait_until_ready().
deployment.wait_until_ready(timeout=600)

client = deployment.get_client()
response = client.predict(input_data=[1.0, 2.0, 3.0])
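
The "wait for the deployment to be ready" step can be pictured as a polling loop. The sketch below is a generic illustration, not BentoML's implementation: the function wait_until_running, the status strings, and the injected get_status callable are all hypothetical stand-ins for whatever status lookup the deployment object offers.

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    timeout: float = 600.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll get_status() until it reports "running" or the timeout expires.

    The status names used here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "running":
            return
        if status in ("failed", "terminated"):
            raise RuntimeError(f"deployment ended in state {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"deployment still {status!r} after {timeout}s")
        sleep(interval)


# Demonstration with a fake status source instead of a real deployment.
_statuses = iter(["deploying", "deploying", "running"])
polls = []


def fake_status() -> str:
    status = next(_statuses)
    polls.append(status)
    return status


wait_until_running(fake_status, timeout=10, interval=0, sleep=lambda _: None)
```

Injecting the status lookup and the sleep function keeps the loop testable without a live deployment.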

Dynamic Method Generation

The client introspects the service's OpenAPI specification to generate methods. For a service defined as:

@bentoml.service
class MyService:
    @bentoml.api
    def predict(self, input_data: list[float]) -> float:
        ...

    @bentoml.api
    def classify(self, text: str) -> dict:
        ...

The client will have corresponding client.predict() and client.classify() methods with matching signatures.
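
The idea of generating methods from a specification can be sketched with a toy client. This is an illustrative assumption about the general technique, not the actual SyncHTTPClient code: SketchClient, SPEC, and fake_transport are hypothetical names, and the spec is a heavily trimmed OpenAPI-style dictionary for the service above.

```python
from typing import Any, Callable, Dict

# Trimmed, hypothetical OpenAPI-style document for the example service.
SPEC = {
    "paths": {
        "/predict": {"post": {"operationId": "predict"}},
        "/classify": {"post": {"operationId": "classify"}},
    }
}

Transport = Callable[[str, Dict[str, Any]], Any]


class SketchClient:
    """Toy client that attaches one method per POST path found in a spec."""

    def __init__(self, spec: dict, transport: Transport) -> None:
        self._transport = transport
        for path, operations in spec["paths"].items():
            if "post" in operations:
                # "/predict" becomes the attribute name "predict".
                setattr(self, path.strip("/"), self._make_method(path))

    def _make_method(self, path: str) -> Callable[..., Any]:
        def call(**kwargs: Any) -> Any:
            # Keyword arguments become the request payload for this path.
            return self._transport(path, kwargs)

        return call


def fake_transport(path: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for an HTTP call; echoes what would be sent."""
    return {"path": path, "payload": payload}


client = SketchClient(SPEC, fake_transport)
result = client.predict(input_data=[1.0, 2.0, 3.0])
```

A real client would also use the parameter schemas in the spec to validate and serialize arguments; the sketch only shows how method names can be derived from paths.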

Metadata

Property | Value
Implementation | SyncHTTPClient For Cloud
API | Deployment.get_client() / SyncHTTPClient.from_url()
Source | src/bentoml/_internal/cloud/deployment.py:L598-631, src/bentoml/_internal/client/http.py:L237-288
Domain | ML_Serving, Cloud_Deployment, API_Design
Workflow | BentoCloud_Deployment
Principle | Principle:Bentoml_BentoML_Cloud_Endpoint_Invocation

Knowledge Sources

2026-02-13 15:00 GMT
