Implementation:Bentoml BentoML SyncHTTPClient From Url

**Metadata**
Knowledge Sources	BentoML BentoML Client Reference
Domains	ML_Serving Testing
Last Updated	2026-02-13 15:00 GMT

Overview

Concrete class method for creating a synchronous HTTP client that connects to a running BentoML service. The SyncHTTPClient.from_url class method fetches the service's OpenAPI schema and dynamically generates typed methods matching each API endpoint, enabling programmatic service invocation without manual HTTP request construction.

Description

The SyncHTTPClient.from_url class method performs the following steps:

Connect to the service -- establishes an HTTP connection to the specified server_url.
Fetch OpenAPI schema -- retrieves the service's OpenAPI specification from the docs.json document served under server_url.
Generate client methods -- dynamically creates Python methods on the client instance that correspond to each @bentoml.api endpoint on the service.
Return configured client -- returns a SyncHTTPClient instance ready for immediate use.

Each generated method handles request serialization, HTTP transport, and response deserialization transparently. The client supports the Python context manager protocol (with statement) for automatic resource cleanup.
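
The steps above can be illustrated with a minimal, offline sketch of schema-driven method generation. This is not BentoML's implementation: SketchClient, SCHEMA, and fake_transport are hypothetical names, the schema is a trimmed stand-in for a real OpenAPI document, and the transport is a plain function so that the example runs without a server.

```python
from typing import Any, Callable, Dict

# Hypothetical, trimmed OpenAPI document; a real service's schema is larger.
SCHEMA: Dict[str, Any] = {
    "paths": {
        "/predict": {"post": {"operationId": "predict"}},
        "/classify": {"post": {"operationId": "classify"}},
    }
}


class SketchClient:
    """Toy stand-in for a schema-driven client; not BentoML's implementation."""

    def __init__(
        self,
        server_url: str,
        schema: Dict[str, Any],
        transport: Callable[[str, Dict[str, Any]], Any],
    ) -> None:
        self.server_url = server_url.rstrip("/")
        self._transport = transport
        # Attach one callable per POST route found in the schema.
        for route, operations in schema["paths"].items():
            if "post" in operations:
                name = route.lstrip("/").replace("/", "_")
                setattr(self, name, self._make_endpoint(route))

    def _make_endpoint(self, route: str) -> Callable[..., Any]:
        def call(**kwargs: Any) -> Any:
            # Keyword arguments become the request body; the transport does the I/O.
            return self._transport(self.server_url + route, kwargs)

        return call


def fake_transport(url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    # Stands in for the HTTP round trip so the sketch runs offline.
    return {"url": url, "echo": payload}


client = SketchClient("http://localhost:3000", SCHEMA, fake_transport)
result = client.predict(text="Hello, world!")
```

Injecting the transport as a parameter keeps the method-generation logic separate from the network layer, which is also what makes the sketch testable without a running service.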

An AsyncHTTPClient counterpart is also available for asyncio-based testing.

Usage

Import and create a client:

from bentoml.client import SyncHTTPClient

client = SyncHTTPClient.from_url("http://localhost:3000")
try:
    result = client.predict(text="Hello, world!")
finally:
    # Release the connection even if the call raises
    client.close()

Or using a context manager:

with SyncHTTPClient.from_url("http://localhost:3000") as client:
    result = client.predict(text="Hello, world!")
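
The practical difference between the two forms is exception safety: a manual close() is only guaranteed to run when it sits in a finally block, whereas the with statement runs cleanup automatically. The sketch below demonstrates this with ClosableClient, a hypothetical stand-in that records whether close() was called; it assumes only that the real client's exit hook calls close(), and it does not reproduce BentoML's internals.

```python
class ClosableClient:
    """Hypothetical stand-in that records whether close() ran."""

    def __init__(self) -> None:
        self.closed = False

    def predict(self, text: str) -> dict:
        if not text:
            raise ValueError("text must be non-empty")
        return {"text": text}

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "ClosableClient":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Release resources whether or not the body raised.
        self.close()


# Manual form: close() sits in a finally block so it survives exceptions.
manual = ClosableClient()
try:
    manual_result = manual.predict(text="Hello, world!")
finally:
    manual.close()

# Context-manager form: cleanup still runs when the body raises.
managed = ClosableClient()
try:
    with managed as client:
        client.predict(text="")  # raises ValueError
except ValueError:
    pass
```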

Code Reference

Source Location

Repository: bentoml/BentoML
File: src/bentoml/_internal/client/http.py (lines 237--288)

Signature

@classmethod
def from_url(cls, server_url: str, **kwargs: Any) -> SyncHTTPClient

Import

from bentoml.client import SyncHTTPClient

# The class is also exported at the top level, where the URL is passed to the constructor:
import bentoml
client = bentoml.SyncHTTPClient("http://localhost:3000")

I/O Contract

Inputs

**Input Contract**
Name	Type	Description
`server_url`	str	Base URL of the running BentoML service (e.g., `"http://localhost:3000"`).
`timeout`	int (keyword, via **kwargs)	Request timeout in seconds. Defaults to `300`.
`**kwargs`	Any	Additional keyword arguments passed to the underlying HTTP client (e.g., custom headers, authentication tokens).

Outputs

**Output Contract**
Name	Type	Description
Return value	SyncHTTPClient	A synchronous HTTP client instance with dynamically generated methods matching each API endpoint on the target service. Methods accept the same parameter types and return the same types as the service methods.

Usage Examples

Example 1: Basic Service Testing

Test a text classification service.

from bentoml.client import SyncHTTPClient

with SyncHTTPClient.from_url("http://localhost:3000") as client:
    # 'classify' method is auto-generated from the service's @bentoml.api
    result = client.classify(text="BentoML is great!")
    print(result)
    # Illustrative output: {'label': 'positive', 'score': 0.98}

The classify method is dynamically generated from the service's OpenAPI schema.
Input and output types match the service's @bentoml.api method signature.

Example 2: Integration Test with Assertions

A pytest-style integration test.

import pytest
from bentoml.client import SyncHTTPClient

@pytest.fixture
def client():
    c = SyncHTTPClient.from_url("http://localhost:3000", timeout=60)
    yield c
    c.close()

def test_prediction_returns_valid_label(client):
    result = client.predict(text="test input")
    assert isinstance(result, dict)
    assert "label" in result
    assert result["label"] in ["positive", "negative", "neutral"]

def test_prediction_score_range(client):
    result = client.predict(text="another test")
    assert 0.0 <= result["score"] <= 1.0

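The fixture uses pytest's default function scope, so a client is created for each test and closed afterward; declare the fixture with scope="session" to share one client across the whole session.
Each test calls auto-generated methods and asserts on the response structure.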

Example 3: Async Client Alternative

Using the async client for concurrent testing.

import asyncio
from bentoml.client import AsyncHTTPClient

async def test_concurrent_requests():
    async with AsyncHTTPClient.from_url("http://localhost:3000") as client:
        tasks = [client.predict(text=f"input_{i}") for i in range(100)]
        results = await asyncio.gather(*tasks)
        assert len(results) == 100
        assert all("label" in r for r in results)

asyncio.run(test_concurrent_requests())

AsyncHTTPClient enables concurrent request dispatch for load testing.
The asyncio.gather call schedules all 100 requests concurrently on one event loop and returns their results in input order.
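
When an unbounded burst would overwhelm the service under test, the number of in-flight requests can be capped. The following is a minimal sketch using asyncio.Semaphore; gather_bounded and fake_predict are hypothetical helpers, and fake_predict stands in for an awaited client method so that the example runs without a server.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List


async def fake_predict(text: str) -> Dict[str, Any]:
    # Stand-in for an awaited client call; yields control like real I/O would.
    await asyncio.sleep(0)
    return {"label": "positive", "text": text}


async def gather_bounded(
    call: Callable[[str], Awaitable[Dict[str, Any]]],
    inputs: List[str],
    limit: int,
) -> List[Dict[str, Any]]:
    # At most `limit` calls hold the semaphore at any moment.
    semaphore = asyncio.Semaphore(limit)

    async def guarded(text: str) -> Dict[str, Any]:
        async with semaphore:
            return await call(text)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(t) for t in inputs))


results = asyncio.run(
    gather_bounded(fake_predict, [f"input_{i}" for i in range(100)], limit=10)
)
```

In a real test, fake_predict would be replaced by the client's generated method, and the limit would be tuned to the concurrency the service is expected to sustain.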

Related Pages

Principle:Bentoml_BentoML_HTTP_Client_Testing
