Implementation:Bentoml BentoML InferenceAPI

Knowledge Sources	Bentoml_BentoML
Domains	Service, API, Core Framework
Last Updated	2026-02-13 15:00 GMT

Overview

Defines the InferenceAPI class that represents a single API endpoint within a BentoML service, binding a user-defined callback function to input/output descriptors.

Description

The InferenceAPI class encapsulates an API endpoint: its input and output IO descriptors, the user callback function, a name, a route, and documentation. During initialization it validates three things. It checks that the API name is a valid Python identifier and not a reserved name (index, swagger, docs, metrics, healthz, livez, readyz). It checks the route for illegal URL characters. It verifies that the callback's type annotations are compatible with the IO descriptor types. The class supports both single-input and multi-input (dict-based) API signatures, and it optionally detects a context parameter (one named context or ctx, or annotated with Context). The module also registers a custom YAML representer for InferenceAPI objects via _InferenceAPI_dumper, which enables YAML serialization of API metadata.
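The name, route, and context-parameter checks described above can be sketched in plain Python. This is a minimal illustration, not BentoML's code: check_api_name, check_route, and find_context_param are hypothetical stand-ins, the regular expression encodes an assumed definition of "illegal URL characters", and ValueError stands in for whatever exception type the library raises. Detection of a parameter annotated with Context is omitted.

```python
import inspect
import re
from typing import Any, Callable, Optional

# Reserved names as listed in the Signature section of this page.
RESERVED_API_NAMES = ["index", "swagger", "docs", "metrics", "healthz", "livez", "readyz"]

# Assumption: a route is illegal if it contains '?' or '#',
# or if it starts with '//' or ':'.
ILLEGAL_ROUTE_PATTERN = re.compile(r"[?#]|^//|^:")


def check_api_name(api_name: str) -> None:
    """Hypothetical stand-in for InferenceAPI._validate_name."""
    if not api_name.isidentifier():
        raise ValueError(f"API name {api_name!r} is not a valid Python identifier")
    if api_name in RESERVED_API_NAMES:
        raise ValueError(f"API name {api_name!r} is reserved")


def check_route(route: str) -> None:
    """Hypothetical stand-in for InferenceAPI.validate_route."""
    if ILLEGAL_ROUTE_PATTERN.search(route):
        raise ValueError(f"Route {route!r} contains illegal URL characters")


def find_context_param(func: Callable[..., Any]) -> Optional[str]:
    """Return the first parameter named 'context' or 'ctx', or None."""
    for param_name in inspect.signature(func).parameters:
        if param_name in ("context", "ctx"):
            return param_name
    return None
```

Raising at construction time means a misnamed endpoint fails when the service is defined rather than when the first request arrives.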

Usage

This class is internal to BentoML's service construction pipeline, which uses it to define and validate inference API endpoints. It is typically instantiated by the service builder when a user decorates a function with @svc.api().

Code Reference

Source Location

Repository: Bentoml_BentoML
File: src/bentoml/_internal/service/inference_api.py
Lines: 1-192

Signature

RESERVED_API_NAMES = ["index", "swagger", "docs", "metrics", "healthz", "livez", "readyz"]

class InferenceAPI(t.Generic[IOType]):
    def __init__(
        self,
        user_defined_callback: t.Callable[..., t.Any] | None,
        input_descriptor: IODescriptor[IOType],
        output_descriptor: IODescriptor[IOType],
        name: str | None,
        doc: str | None = None,
        route: str | None = None,
    ): ...

    def __str__(self): ...

    @staticmethod
    def _validate_name(api_name: str): ...

    @staticmethod
    def validate_route(route: str): ...

def _InferenceAPI_dumper(dumper: yaml.Dumper, api: InferenceAPI[t.Any]) -> yaml.Node: ...

Import

from bentoml._internal.service.inference_api import InferenceAPI

I/O Contract

Inputs

Name	Type	Required	Description
user_defined_callback	Callable or None	Yes	The user's inference function; None creates a no-op
input_descriptor	IODescriptor[IOType]	Yes	Describes the expected input format (e.g., NumpyNdarray, JSON)
output_descriptor	IODescriptor[IOType]	Yes	Describes the expected output format
name	str or None	Yes	API endpoint name; defaults to callback function name if None
doc	str or None	No	Documentation string; defaults to callback docstring
route	str or None	No	URL route for the API; defaults to the name
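The fallbacks in the table (name from the callback, doc from its docstring, route from the name) can be illustrated with a small helper. This is a sketch under assumptions: resolve_api_metadata is a hypothetical function written for this page, not part of BentoML, and it only models the defaulting order, not validation.

```python
from typing import Any, Callable, Optional, Tuple


def resolve_api_metadata(
    callback: Callable[..., Any],
    name: Optional[str] = None,
    doc: Optional[str] = None,
    route: Optional[str] = None,
) -> Tuple[str, Optional[str], str]:
    """Hypothetical helper mirroring the defaults in the Inputs table."""
    # Name falls back to the callback's function name.
    resolved_name = callback.__name__ if name is None else name
    # Doc falls back to the callback's docstring (which may itself be None).
    resolved_doc = callback.__doc__ if doc is None else doc
    # Route falls back to the resolved name, so it must be computed last.
    resolved_route = resolved_name if route is None else route
    return resolved_name, resolved_doc, resolved_route


def predict(input_data: dict) -> dict:
    """Predict API endpoint"""
    return {"result": "predicted"}


# With no overrides, every field is derived from the callback.
default_name, default_doc, default_route = resolve_api_metadata(predict)
```

Because the route defaults to the resolved name, overriding only the name also changes the route unless a route is passed explicitly.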

Outputs

Name	Type	Description
InferenceAPI instance	InferenceAPI[IOType]	Configured API object with validated name, route, input, output, and callback

Usage Examples

from bentoml._internal.service.inference_api import InferenceAPI
from bentoml._internal.io_descriptors import JSON

# Define an InferenceAPI with input and output descriptors
def predict(input_data: dict) -> dict:
    """Predict API endpoint"""
    return {"result": "predicted"}

api = InferenceAPI(
    user_defined_callback=predict,
    input_descriptor=JSON(),
    output_descriptor=JSON(),
    name="predict",
    route="/predict",
)
print(api)  # InferenceAPI(JSON -> JSON)

Related Pages

Implementation:Bentoml_BentoML_Exceptions
