Implementation:Bentoml BentoML InferenceAPI

From Leeroopedia
Knowledge Sources
Domains: Service, API, Core Framework
Last Updated: 2026-02-13 15:00 GMT

Overview

Defines the InferenceAPI class that represents a single API endpoint within a BentoML service, binding a user-defined callback function to input/output descriptors.

Description

The InferenceAPI class encapsulates an API endpoint with its input/output IO descriptors, user callback function, name, route, and documentation. During initialization it performs thorough validation: it checks that the API name is a valid Python identifier and not a reserved name (index, swagger, docs, metrics, healthz, livez, readyz), validates the route against illegal URL characters, and verifies type compatibility between the callback's type annotations and the IO descriptor types. It supports both single-input and multi-input (dict-based) API signatures, and optionally detects a context parameter (named context or ctx, or annotated with Context). The module also registers a custom YAML representer for InferenceAPI objects via _InferenceAPI_dumper, enabling YAML serialization of API metadata.
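The name check and the context-parameter detection described above can be sketched in plain Python. This is a minimal illustration, not BentoML's actual implementation: `check_api_name` and `find_context_param` are hypothetical helper names, the use of `ValueError` is an assumption, and the context lookup here matches only on parameter name (the annotation-based detection is omitted).

```python
import inspect
from typing import Any, Callable, Optional

# Reserved names as listed in this page's Signature section.
RESERVED_API_NAMES = ["index", "swagger", "docs", "metrics", "healthz", "livez", "readyz"]


def check_api_name(api_name: str) -> None:
    """Hypothetical stand-in for the name check; raises ValueError on failure."""
    # The name must be usable as a Python identifier.
    if not api_name.isidentifier():
        raise ValueError(f"Invalid API name {api_name!r}: not a valid Python identifier")
    # The name must not collide with a reserved endpoint name.
    if api_name in RESERVED_API_NAMES:
        raise ValueError(f"Reserved API name: {api_name!r}")


def find_context_param(func: Callable[..., Any]) -> Optional[str]:
    """Return the name of a parameter called 'context' or 'ctx', or None."""
    for param_name in inspect.signature(func).parameters:
        if param_name in ("context", "ctx"):
            return param_name
    return None
```

With these helpers, `check_api_name("predict")` passes silently, while `check_api_name("metrics")` and `check_api_name("my-api")` both raise.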

Usage

Use this class internally within BentoML's service construction pipeline to define and validate inference API endpoints. It is typically instantiated by the service builder when users decorate functions with @svc.api().

Code Reference

Source Location

Signature

RESERVED_API_NAMES = ["index", "swagger", "docs", "metrics", "healthz", "livez", "readyz"]

class InferenceAPI(t.Generic[IOType]):
    def __init__(
        self,
        user_defined_callback: t.Callable[..., t.Any] | None,
        input_descriptor: IODescriptor[IOType],
        output_descriptor: IODescriptor[IOType],
        name: str | None,
        doc: str | None = None,
        route: str | None = None,
    ): ...

    def __str__(self): ...

    @staticmethod
    def _validate_name(api_name: str): ...

    @staticmethod
    def validate_route(route: str): ...

def _InferenceAPI_dumper(dumper: yaml.Dumper, api: InferenceAPI[t.Any]) -> yaml.Node: ...

Import

from bentoml._internal.service.inference_api import InferenceAPI

I/O Contract

Inputs

Name | Type | Required | Description
--- | --- | --- | ---
user_defined_callback | Callable or None | Yes | The user's inference function; None creates a no-op
input_descriptor | IODescriptor[IOType] | Yes | Describes the expected input format (e.g., NumpyNdarray, JSON)
output_descriptor | IODescriptor[IOType] | Yes | Describes the expected output format
name | str or None | Yes | API endpoint name; defaults to callback function name if None
doc | str or None | No | Documentation string; defaults to callback docstring
route | str or None | No | URL route for the API; defaults to the name
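The defaulting rules in the table above can be sketched as a small standalone function. This is an illustration under stated assumptions rather than the library's code: `resolve_api_defaults` is a hypothetical helper, and the empty-string fallback when no callback is supplied is an assumption not stated on this page.

```python
from typing import Any, Callable, Optional, Tuple


def resolve_api_defaults(
    callback: Optional[Callable[..., Any]],
    name: Optional[str],
    doc: Optional[str] = None,
    route: Optional[str] = None,
) -> Tuple[str, str, str]:
    """Hypothetical helper mirroring the defaults listed in the Inputs table."""
    if callback is not None:
        # name falls back to the callback's function name.
        name = callback.__name__ if name is None else name
        # doc falls back to the callback's docstring.
        doc = callback.__doc__ if doc is None else doc
    # Assumption: anything still missing becomes an empty string.
    name = "" if name is None else name
    doc = "" if doc is None else doc
    # route falls back to the (resolved) name.
    route = name if route is None else route
    return name, doc, route
```

For a callback named `predict`, calling `resolve_api_defaults(predict, None)` yields the function name for both the name and the route, and the function's docstring for the doc.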

Outputs

Name | Type | Description
--- | --- | ---
InferenceAPI instance | InferenceAPI[IOType] | Configured API object with validated name, route, input, output, and callback

Usage Examples

from bentoml._internal.service.inference_api import InferenceAPI
from bentoml._internal.io_descriptors import JSON

# Define an InferenceAPI with input and output descriptors
def predict(input_data: dict) -> dict:
    """Predict API endpoint"""
    return {"result": "predicted"}

api = InferenceAPI(
    user_defined_callback=predict,
    input_descriptor=JSON(),
    output_descriptor=JSON(),
    name="predict",
    route="/predict",
)
print(api)  # InferenceAPI(JSON -> JSON)
