Implementation:Bentoml BentoML SyncHTTPClient For Cloud
Overview
SyncHTTPClient For Cloud implements Principle:Bentoml_BentoML_Cloud_Endpoint_Invocation. It provides two ways to obtain a client for invoking deployed BentoML service endpoints: calling Deployment.get_client() on a deployment object, or instantiating SyncHTTPClient directly with a cloud URL.
API
- Deployment.get_client()
- SyncHTTPClient.from_url() (cloud context)
Source
- get_client: src/bentoml/_internal/cloud/deployment.py:L598-631
- SyncHTTPClient: src/bentoml/_internal/client/http.py:L237-288
Import
from bentoml.client import SyncHTTPClient
Signature
# Via Deployment object
deployment.get_client() -> SyncHTTPClient
# Via direct URL
SyncHTTPClient.from_url(server_url: str) -> SyncHTTPClient
Key Difference from Local
When used in a cloud context, the SyncHTTPClient differs from local usage in these ways:
| Aspect | Local | Cloud |
|---|---|---|
| URL | http://localhost:3000 | Deployment endpoint URL |
| Authentication | Not required | May include auth headers from BentoCloud context |
| Creation | SyncHTTPClient("http://localhost:3000") | deployment.get_client() or SyncHTTPClient.from_url(url) |
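To make the Authentication row concrete, the sketch below builds (without sending) an HTTP request that carries a token, using only the Python standard library. The `Authorization: Bearer` scheme, the `generate` route, the placeholder token, and the helper name `build_authenticated_request` are assumptions for illustration; the table only states that cloud clients may include auth headers.

```python
import json
import urllib.request


def build_authenticated_request(base_url, route, payload, token=None):
    """Build a JSON POST request, optionally attaching a bearer token.

    Hypothetical helper: it mirrors what a cloud-aware client might do
    internally, but it is not BentoML's actual implementation.
    """
    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumed scheme: protected endpoints expect a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{route.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


# Local: no token needed.
local_req = build_authenticated_request(
    "http://localhost:3000", "generate", {"prompt": "Hello"}
)

# Cloud: same call shape, plus a token (placeholder value).
cloud_req = build_authenticated_request(
    "https://my-llm-service-abc123.bentoml.com",
    "generate",
    {"prompt": "Hello"},
    token="example-token",
)
```

The request body and route are identical in both cases; only the base URL and the presence of the auth header differ, which is the distinction the table draws.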
Inputs and Outputs
Inputs:
- Deployment object: When using get_client(), the deployment's endpoint URL and authentication are handled automatically
- Deployment endpoint URL: When using SyncHTTPClient.from_url(), the cloud endpoint URL must be provided directly
Outputs:
- SyncHTTPClient instance with dynamically generated API methods matching the deployed service's endpoints
Usage Examples
Via Deployment Object
import bentoml
# Get an existing deployment
deployment = bentoml.deployment.get("my-llm-service")
# Get a client with auto-configured auth
client = deployment.get_client()
# Call the service endpoint
result = client.generate(prompt="Hello, world!", max_tokens=100)
print(result)
Via Direct URL
from bentoml.client import SyncHTTPClient
# Create client directly from cloud URL
client = SyncHTTPClient.from_url("https://my-llm-service-abc123.bentoml.com")
# Call the service endpoint
result = client.generate(prompt="Hello, world!", max_tokens=100)
print(result)
Within Create Flow
import bentoml
# Create the deployment
deployment = bentoml.deployment.create(
    bento="my_service:latest",
    scaling_min=1,
)
# Wait for the deployment to be ready, then get a client
# (assumes Deployment.wait_until_ready() is available in the installed BentoML version)
deployment.wait_until_ready(timeout=600)
client = deployment.get_client()
response = client.predict(input_data=[1.0, 2.0, 3.0])
Dynamic Method Generation
The client introspects the service's OpenAPI specification to generate methods. For a service defined as:
import bentoml

@bentoml.service
class MyService:
    @bentoml.api
    def predict(self, input_data: list[float]) -> float:
        ...

    @bentoml.api
    def classify(self, text: str) -> dict:
        ...
The client will have corresponding client.predict() and client.classify() methods with matching signatures.
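The idea of generating methods from a service description can be illustrated with a toy client. This is a sketch under stated assumptions rather than BentoML's implementation: the schema shape, the `SketchClient` class, and the `fake_transport` callable are all hypothetical, and no HTTP requests are made.

```python
from functools import partial


class SketchClient:
    """Toy client that creates one method per endpoint listed in a schema.

    Not BentoML's implementation: the schema shape and the transport
    callable are assumptions made for this illustration.
    """

    def __init__(self, schema, transport):
        self._transport = transport
        for route in schema["routes"]:
            # Bind the route so client.<name>(**kwargs) targets that endpoint.
            setattr(self, route["name"], partial(self._call, route))

    def _call(self, route, **kwargs):
        # Reject arguments the endpoint does not declare.
        unknown = set(kwargs) - set(route["params"])
        if unknown:
            raise TypeError(
                f"{route['name']}() got unexpected arguments: {sorted(unknown)}"
            )
        return self._transport(route["path"], kwargs)


# Hypothetical schema mirroring the predict and classify endpoints.
schema = {
    "routes": [
        {"name": "predict", "path": "/predict", "params": ["input_data"]},
        {"name": "classify", "path": "/classify", "params": ["text"]},
    ]
}

# Stand-in transport that records calls instead of sending HTTP requests.
calls = []


def fake_transport(path, payload):
    calls.append((path, payload))
    return {"path": path, "echo": payload}


client = SketchClient(schema, fake_transport)
result = client.predict(input_data=[1.0, 2.0, 3.0])
```

Attaching methods at construction time keeps call sites readable (`client.predict(...)`) while the set of available methods is driven entirely by the schema the client receives.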
Metadata
| Property | Value |
|---|---|
| Implementation | SyncHTTPClient For Cloud |
| API | Deployment.get_client() / SyncHTTPClient.from_url() |
| Source | src/bentoml/_internal/cloud/deployment.py:L598-631, src/bentoml/_internal/client/http.py:L237-288 |
| Domain | ML_Serving, Cloud_Deployment, API_Design |
| Workflow | BentoCloud_Deployment |
| Principle | Principle:Bentoml_BentoML_Cloud_Endpoint_Invocation |