Implementation:Bentoml BentoML InferenceAPI
| Knowledge Sources | |
|---|---|
| Domains | Service, API, Core Framework |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
Defines the InferenceAPI class that represents a single API endpoint within a BentoML service, binding a user-defined callback function to input/output descriptors.
Description
The InferenceAPI class encapsulates an API endpoint with its input/output IO descriptors, user callback function, name, route, and documentation. During initialization it performs thorough validation: it checks that the API name is a valid Python identifier and not a reserved name (index, swagger, docs, metrics, healthz, livez, readyz), validates the route against illegal URL characters, and verifies type compatibility between the callback's type annotations and the IO descriptor types. It supports both single-input and multi-input (dict-based) API signatures, and optionally detects a context parameter (named context or ctx, or annotated with Context). The module also registers a custom YAML representer for InferenceAPI objects via _InferenceAPI_dumper, enabling YAML serialization of API metadata.
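The checks described above can be approximated with the standard library alone. The following is a minimal sketch, not BentoML's actual implementation: the helper names check_api_name, check_route, and find_context_param are hypothetical, ValueError stands in for whatever exception BentoML raises, and the set of rejected route characters (`?` and `#`) is an assumption, since this page does not list them. Detection of a parameter annotated with Context is omitted because it depends on BentoML's own type.

```python
import inspect
from typing import Any, Callable, Optional

RESERVED_API_NAMES = ["index", "swagger", "docs", "metrics", "healthz", "livez", "readyz"]


def check_api_name(api_name: str) -> None:
    """Hypothetical stand-in for the name validation described above."""
    if not api_name.isidentifier():
        raise ValueError(f"{api_name!r} is not a valid Python identifier")
    if api_name in RESERVED_API_NAMES:
        raise ValueError(f"{api_name!r} is a reserved API name")


def check_route(route: str) -> None:
    """Hypothetical route check; the rejected characters are an assumption."""
    if any(ch in route for ch in "?#"):
        raise ValueError(f"{route!r} contains an illegal URL character")


def find_context_param(func: Callable[..., Any]) -> Optional[str]:
    """Return the first parameter named 'context' or 'ctx', or None."""
    for param_name in inspect.signature(func).parameters:
        if param_name in ("context", "ctx"):
            return param_name
    return None
```

With these helpers, check_api_name("predict") and check_route("/predict") return silently, check_api_name("metrics") raises because the name is reserved, and find_context_param reports which parameter, if any, would receive the request context.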
Usage
This class is used internally by BentoML's service construction pipeline to define and validate inference API endpoints. It is typically instantiated by the service builder when users decorate functions with @svc.api(), rather than constructed by hand.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/service/inference_api.py
- Lines: 1-192
Signature
RESERVED_API_NAMES = ["index", "swagger", "docs", "metrics", "healthz", "livez", "readyz"]

class InferenceAPI(t.Generic[IOType]):
    def __init__(
        self,
        user_defined_callback: t.Callable[..., t.Any] | None,
        input_descriptor: IODescriptor[IOType],
        output_descriptor: IODescriptor[IOType],
        name: str | None,
        doc: str | None = None,
        route: str | None = None,
    ): ...

    def __str__(self): ...

    @staticmethod
    def _validate_name(api_name: str): ...

    @staticmethod
    def validate_route(route: str): ...

def _InferenceAPI_dumper(dumper: yaml.Dumper, api: InferenceAPI[t.Any]) -> yaml.Node: ...
Import
from bentoml._internal.service.inference_api import InferenceAPI
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| user_defined_callback | Callable or None | Yes | The user's inference function; None creates a no-op |
| input_descriptor | IODescriptor[IOType] | Yes | Describes the expected input format (e.g., NumpyNdarray, JSON) |
| output_descriptor | IODescriptor[IOType] | Yes | Describes the expected output format |
| name | str or None | Yes | API endpoint name; defaults to callback function name if None |
| doc | str or None | No | Documentation string; defaults to callback docstring |
| route | str or None | No | URL route for the API; defaults to the name |
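The defaulting rules in the table can be illustrated with a small standalone function. This is a sketch under stated assumptions: resolve_defaults and classify are hypothetical names introduced for illustration, and the sketch covers only the case where a callback is supplied (the None case is not modelled).

```python
from typing import Callable, Optional, Tuple


def resolve_defaults(
    callback: Callable[..., object],
    name: Optional[str] = None,
    doc: Optional[str] = None,
    route: Optional[str] = None,
) -> Tuple[str, Optional[str], str]:
    """Hypothetical helper mirroring the defaulting rules in the table."""
    # Fall back to the callback's function name when no name is given.
    name = callback.__name__ if name is None else name
    # Fall back to the callback's docstring when no doc is given.
    doc = callback.__doc__ if doc is None else doc
    # Fall back to the (resolved) name when no route is given.
    route = name if route is None else route
    return name, doc, route


def classify(text: str) -> str:
    """Classify a piece of text."""
    return "label"


print(resolve_defaults(classify))
# ('classify', 'Classify a piece of text.', 'classify')
print(resolve_defaults(classify, name="predict", route="/v1/predict"))
# ('predict', 'Classify a piece of text.', '/v1/predict')
```

Note that the route falls back to the resolved name, so passing only name="predict" would also yield "predict" as the route.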
Outputs
| Name | Type | Description |
|---|---|---|
| InferenceAPI instance | InferenceAPI[IOType] | Configured API object with validated name, route, input, output, and callback |
Usage Examples
from bentoml._internal.service.inference_api import InferenceAPI
from bentoml._internal.io_descriptors import JSON

# Define an InferenceAPI with input and output descriptors
def predict(input_data: dict) -> dict:
    """Predict API endpoint"""
    return {"result": "predicted"}

api = InferenceAPI(
    user_defined_callback=predict,
    input_descriptor=JSON(),
    output_descriptor=JSON(),
    name="predict",
    route="/predict",
)
print(api)  # InferenceAPI(JSON -> JSON)