Implementation:Bentoml BentoML SyncHTTPClient From Url
| Knowledge Sources | |
|---|---|
| Domains | |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
Concrete class method for creating a synchronous HTTP client that connects to a running BentoML service. The SyncHTTPClient.from_url class method fetches the service's OpenAPI schema and dynamically generates typed methods matching each API endpoint, enabling programmatic service invocation without manual HTTP request construction.
Description
The SyncHTTPClient.from_url class method performs the following steps:
- Connect to the service -- establishes an HTTP connection to the specified server_url.
- Fetch OpenAPI schema -- retrieves the service's API specification from the root endpoint.
- Generate client methods -- dynamically creates Python methods on the client instance that correspond to each @bentoml.api endpoint on the service.
- Return configured client -- returns a SyncHTTPClient instance ready for immediate use.
Each generated method handles request serialization, HTTP transport, and response deserialization transparently. The client supports the Python context manager protocol (with statement) for automatic resource cleanup.
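The generation step can be pictured with a minimal, self-contained sketch. It is not BentoML's implementation: SketchClient, FAKE_SCHEMA, and the _send stub are hypothetical names, and the schema is a hand-written OpenAPI-like dictionary rather than one fetched from a server. It shows only the general technique of attaching one method per schema path to a client instance and supporting the with statement.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical, hand-written stand-in for a fetched OpenAPI document.
FAKE_SCHEMA: Dict[str, Any] = {
    "paths": {
        "/predict": {"post": {}},
        "/classify": {"post": {}},
    }
}


class SketchClient:
    """Illustrative client only; not the BentoML SyncHTTPClient."""

    def __init__(self, server_url: str, schema: Dict[str, Any]) -> None:
        self.server_url = server_url.rstrip("/")
        self.closed = False
        # Attach one callable per path found in the schema.
        for path in schema["paths"]:
            setattr(self, path.strip("/"), self._make_method(path))

    def _make_method(self, path: str) -> Callable[..., Any]:
        def call(**kwargs: Any) -> Any:
            body = json.dumps(kwargs)      # request serialization
            raw = self._send(path, body)   # transport (stubbed below)
            return json.loads(raw)         # response deserialization
        return call

    def _send(self, path: str, body: str) -> str:
        # Stub transport: echoes the request instead of performing HTTP.
        return json.dumps({"url": self.server_url + path, "echo": json.loads(body)})

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "SketchClient":
        return self

    def __exit__(self, *exc_info: Any) -> None:
        # Called on leaving the with block, even if an exception was raised.
        self.close()


with SketchClient("http://localhost:3000", FAKE_SCHEMA) as client:
    reply = client.predict(text="Hello, world!")
```

Because the methods are attached at construction time, a caller sees them as ordinary attributes, which is why the usage examples can call client.predict(...) without declaring it anywhere.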
An AsyncHTTPClient counterpart is also available for asyncio-based testing.
Usage
Import and create a client:
from bentoml.client import SyncHTTPClient
client = SyncHTTPClient.from_url("http://localhost:3000")
result = client.predict(text="Hello, world!")
client.close()
Or using a context manager:
with SyncHTTPClient.from_url("http://localhost:3000") as client:
    result = client.predict(text="Hello, world!")
Code Reference
Source Location
- Repository: bentoml/BentoML
- File: src/bentoml/_internal/client/http.py (lines 237--288)
Signature
@classmethod
def from_url(cls, server_url: str, **kwargs: Any) -> SyncHTTPClient
Import
from bentoml.client import SyncHTTPClient
# Also available at top level:
import bentoml
client = bentoml.SyncHTTPClient("http://localhost:3000")
I/O Contract
Inputs
| Name | Type | Description |
|---|---|---|
| server_url | str | Base URL of the running BentoML service (e.g., "http://localhost:3000"). |
| timeout | int (keyword, via **kwargs) | Request timeout in seconds. Defaults to 300. |
| **kwargs | Any | Additional keyword arguments passed to the underlying HTTP client (e.g., custom headers, authentication tokens). |
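
How a keyword such as timeout might be separated from the remaining **kwargs can be sketched as follows. The helper split_client_kwargs is a hypothetical name used only for illustration; the default of 300 seconds comes from the Inputs table, and nothing here reflects how BentoML handles its arguments internally.

```python
from typing import Any, Dict, Tuple

DEFAULT_TIMEOUT = 300  # seconds, the default stated in the Inputs table


def split_client_kwargs(**kwargs: Any) -> Tuple[int, Dict[str, Any]]:
    """Return (timeout, remaining kwargs). Hypothetical helper."""
    remaining = dict(kwargs)  # copy so the caller's dict is not mutated
    timeout = remaining.pop("timeout", DEFAULT_TIMEOUT)
    if timeout <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return timeout, remaining


timeout, extra = split_client_kwargs(timeout=60, headers={"X-Demo": "1"})
default_timeout, no_extra = split_client_kwargs()
```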
Outputs
| Name | Type | Description |
|---|---|---|
| Return value | SyncHTTPClient | A synchronous HTTP client instance with dynamically generated methods matching each API endpoint on the target service. Methods accept the same parameter types and return the same types as the service methods. |
Usage Examples
Example 1: Basic Service Testing
Test a text classification service.
from bentoml.client import SyncHTTPClient
with SyncHTTPClient.from_url("http://localhost:3000") as client:
    # 'classify' method is auto-generated from the service's @bentoml.api
    result = client.classify(text="BentoML is great!")
    print(result)
    # Example output: {'label': 'positive', 'score': 0.98}
- The classify method is dynamically generated from the service's OpenAPI schema.
- Input and output types match the service's @bentoml.api method signature.
Example 2: Integration Test with Assertions
A pytest-style integration test.
import pytest
from bentoml.client import SyncHTTPClient

@pytest.fixture
def client():
    c = SyncHTTPClient.from_url("http://localhost:3000", timeout=60)
    yield c
    c.close()

def test_prediction_returns_valid_label(client):
    result = client.predict(text="test input")
    assert isinstance(result, dict)
    assert "label" in result
    assert result["label"] in ["positive", "negative", "neutral"]

def test_prediction_score_range(client):
    result = client.predict(text="another test")
    assert 0.0 <= result["score"] <= 1.0
- The client is created by a pytest fixture. With the default function scope shown here, each test gets a new client, which is closed after the test finishes.
- Each test calls auto-generated methods and asserts on the response structure.
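
The structural checks in these tests can be factored into a reusable helper so that each test stays short. The sketch below is an illustration under assumptions: check_prediction and ALLOWED_LABELS are hypothetical names, and the response shape (label, score) is taken from the example above rather than from any particular service.

```python
from typing import Any, Dict

# Labels accepted by the example tests; an assumption about the service.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def check_prediction(result: Dict[str, Any]) -> None:
    """Raise AssertionError if the response lacks the expected shape."""
    assert isinstance(result, dict), "response should be a dict"
    assert result.get("label") in ALLOWED_LABELS, "unexpected or missing label"
    assert "score" in result, "missing score"
    assert 0.0 <= result["score"] <= 1.0, "score should lie in [0, 1]"


# Works on any dict, so it can be exercised without a running service.
check_prediction({"label": "positive", "score": 0.98})
```

A test body then reduces to a single call such as check_prediction(client.predict(text="test input")).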
Example 3: Async Client Alternative
Using the async client for concurrent testing.
import asyncio
from bentoml.client import AsyncHTTPClient

async def test_concurrent_requests():
    async with AsyncHTTPClient.from_url("http://localhost:3000") as client:
        tasks = [client.predict(text=f"input_{i}") for i in range(100)]
        results = await asyncio.gather(*tasks)
        assert len(results) == 100
        assert all("label" in r for r in results)

asyncio.run(test_concurrent_requests())
- AsyncHTTPClient enables concurrent request dispatch for load testing.
- The asyncio.gather call sends 100 requests concurrently and returns their results as a single list.
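
The concurrency pattern can be tried without a running service. In the sketch below, fake_predict is a hypothetical stand-in for a client method and the limit of 10 is an arbitrary choice. It illustrates that asyncio.gather returns results in the order the awaitables were passed, and how in-flight calls could be capped if 100 simultaneous requests are more than the service under test should receive.

```python
import asyncio
from typing import Any, Dict, List


async def fake_predict(text: str) -> Dict[str, Any]:
    """Hypothetical stand-in for an async client method."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"label": "neutral", "input": text}


async def run_bounded(n: int, limit: int) -> List[Dict[str, Any]]:
    sem = asyncio.Semaphore(limit)  # caps the number of in-flight calls

    async def one(i: int) -> Dict[str, Any]:
        async with sem:
            return await fake_predict(f"input_{i}")

    # gather returns results in the same order as its arguments.
    return await asyncio.gather(*(one(i) for i in range(n)))


results = asyncio.run(run_bounded(100, limit=10))
```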