Implementation:Bentoml_BentoML_Legacy_Service
| Knowledge Sources | |
|---|---|
| Domains | Service Definition, API Specification, Model Serving |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
The Legacy Service module defines the Service class, the core building block of BentoML's legacy (pre-2.0) service-oriented architecture. It provides API registration, lifecycle hooks, ASGI/WSGI mounting, and gRPC support.
Description
The Service class (decorated with @deprecated, which suggests upgrading to @bentoml.service()) is an attrs-based class that represents a deployable BentoML service. Key features include:
Core attributes:
- `name`: Validated service name (lowercased, must be a valid BentoML Tag).
- `runners`: List of `Runner` or `_TritonRunner` instances, validated for uniqueness.
- `models`: List of `Model` instances.
- `apis`: Dictionary of `InferenceAPI` instances registered via the `@svc.api` decorator.
- `tag` / `bento`: Set when loaded from a Bento artifact.
- `context`: A `ServiceContext` instance for service-level state.
API registration: The api() method is a decorator factory that creates InferenceAPI instances from IO descriptors and user-defined callback functions. Supports custom routes, names, and documentation strings. Prevents duplicate API names.
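The decorator-factory mechanism described above can be sketched in plain Python. `MiniService` and the `route` function attribute are hypothetical simplifications (the real class builds `InferenceAPI` objects from IO descriptors), but the registration and duplicate-name check follow the same shape:

```python
from __future__ import annotations

from typing import Any, Callable


class MiniService:
    """Toy stand-in for the legacy Service class (illustration only)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.apis: dict[str, Callable[..., Any]] = {}

    def api(self, *, name: str | None = None, route: str | None = None):
        # decorator factory: each call returns a fresh decorator
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            api_name = name or func.__name__
            if api_name in self.apis:
                # duplicate API names are rejected, mirroring the real class
                raise ValueError(f"API {api_name!r} is already defined")
            # hypothetical: in BentoML the route lives on the InferenceAPI object
            func.route = route or f"/{api_name}"
            self.apis[api_name] = func
            return func

        return decorator


svc = MiniService("demo")


@svc.api(route="/classify")
def predict(x: int) -> int:
    return x * 2
```

Calling `svc.api(...)` twice with the same name would raise, which is how the real class prevents duplicate endpoints.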
Lifecycle hooks:
- `on_startup(func)`: Registers a function to run when the service starts. Receives a `ServiceContext`.
- `on_shutdown(func)`: Registers a function to run when the service stops. Receives a `ServiceContext`.
- `on_deployment(func)`: Registers a function to run on deployment.
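The hook-registration pattern above is a standard decorator idiom. This is a minimal sketch, not BentoML's implementation: `MiniService` and `run_startup` are invented names, and the real hooks receive a `ServiceContext` rather than a plain dict:

```python
from __future__ import annotations

from typing import Any, Callable

Hook = Callable[[dict], Any]


class MiniService:
    def __init__(self) -> None:
        self.startup_hooks: list[Hook] = []

    def on_startup(self, func: Hook) -> Hook:
        # register the hook and return the function unchanged, decorator-style
        self.startup_hooks.append(func)
        return func

    def run_startup(self, ctx: dict) -> None:
        # each registered hook receives the context when the service starts
        for hook in self.startup_hooks:
            hook(ctx)


svc = MiniService()
events = []


@svc.on_startup
def warm_up(ctx: dict) -> None:
    events.append(f"started:{ctx['name']}")


svc.run_startup({"name": "demo"})
```

Returning the function unchanged is what lets `@svc.on_startup` stack with other decorators.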
ASGI/WSGI support:
- `asgi_app` property: Returns an ASGI application via `HTTPAppFactory`.
- `mount_asgi_app(app, path, name)`: Mounts an ASGI application at a path.
- `mount_wsgi_app(app, path, name)`: Wraps a WSGI app with `a2wsgi.WSGIMiddleware` and mounts it.
- `add_asgi_middleware(middleware_cls, **options)`: Adds ASGI middleware.
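The core idea behind mounting is path-prefix dispatch. The sketch below is an assumption-level illustration (in BentoML the actual dispatch is handled by the ASGI server stack, and WSGI apps are adapted via `a2wsgi.WSGIMiddleware`, both omitted here); a "sub-app" is reduced to a plain callable:

```python
from __future__ import annotations

from typing import Callable

# a "sub-app" here is just a callable from path to response body
SubApp = Callable[[str], str]


class MiniRouter:
    def __init__(self) -> None:
        self.mounts: list[tuple[str, SubApp]] = []

    def mount(self, app: SubApp, path: str = "/") -> None:
        self.mounts.append((path.rstrip("/"), app))

    def dispatch(self, path: str) -> str:
        # longest-prefix match so the most specific mount wins
        for prefix, app in sorted(self.mounts, key=lambda m: -len(m[0])):
            if path.startswith(prefix):
                # the sub-app sees the path with the mount prefix stripped
                return app(path[len(prefix):] or "/")
        return "404 not found"


router = MiniRouter()
router.mount(lambda p: f"custom app saw {p}", path="/custom")
```

A request to `/custom/health` reaches the mounted app as `/health`, which is the same prefix-stripping behavior users see with `mount_asgi_app(app, path="/custom")`.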
gRPC support:
- `grpc_servicer` property / `get_grpc_servicer(protocol_version)`: Returns a gRPC servicer.
- `mount_grpc_servicer`: Mounts additional gRPC servicers.
- `add_grpc_interceptor`: Adds gRPC interceptors with validation.
- `add_grpc_handlers`: Adds generic RPC handlers.
Serialization (pickle): The `__reduce__` method supports three serialization strategies:
- `EXPORT_BENTO`: Exports and re-imports the Bento via a temporary file.
- `LOCAL_BENTO`: Loads from the local Bento store by tag.
- `REMOTE_BENTO`: Pulls from BentoCloud if not found locally.
- Falls back to `import_service` for source-based services.
OpenAPI: The `openapi_spec` property generates an OpenAPI specification via `generate_spec`.
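The spec generation can be pictured as a walk over the registered APIs that emits one path entry per route. This is a hypothetical sketch, not `generate_spec` itself (the real function also derives request/response schemas from the IO descriptors, and the version string here is invented):

```python
def build_openapi(service_name: str, routes: dict[str, str]) -> dict:
    # one POST operation per registered API, keyed by its route
    return {
        "openapi": "3.0.2",
        "info": {"title": service_name, "version": "1.0.0"},
        "paths": {
            route: {
                "post": {
                    "operationId": api_name,
                    "responses": {"200": {"description": "Successful inference"}},
                }
            }
            for api_name, route in routes.items()
        },
    }


spec = build_openapi("iris-classifier", {"predict": "/predict"})
```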
Usage
Used to define BentoML services in the legacy architecture. Users instantiate Service, register runners, and use the @svc.api decorator to define inference endpoints. This class is deprecated in favor of the new @bentoml.service() decorator.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/service/service.py
- Lines: 1-487
Signature
```python
@deprecated("bentoml.legacy.Service", suggestion="Please upgrade to @bentoml.service().")
@attr.define(frozen=False, init=False)
class Service:
    name: str
    runners: t.List[Runner | _TritonRunner]
    models: t.List[Model]
    apis: t.Dict[str, InferenceAPI[t.Any]]
    tag: Tag | None
    bento: Bento | None
    context: Context

    def __init__(
        self,
        name: str,
        *,
        runners: list[AbstractRunner] | None = None,
        models: list[Model] | None = None,
    ): ...

    def api(
        self,
        input: IODescriptor[IOType],
        output: IODescriptor[IOType],
        *,
        name: str | None = None,
        doc: str | None = None,
        route: str | None = None,
    ) -> _inference_api_wrapper[IOType]: ...

    def on_startup(self, func: HookF_ctx) -> HookF_ctx: ...
    def on_shutdown(self, func: HookF_ctx) -> HookF_ctx: ...
    def on_deployment(self, func: HookF) -> HookF: ...
    def mount_asgi_app(self, app, path="/", name=None) -> None: ...
    def mount_wsgi_app(self, app, path="/", name=None) -> None: ...
    def add_asgi_middleware(self, middleware_cls, **options) -> None: ...
    def get_grpc_servicer(self, protocol_version=...) -> services.BentoServiceServicer: ...
    def on_load_bento(self, bento: Bento) -> None: ...
    def get_service_import_origin(self) -> tuple[str, str]: ...
```
Import
```python
import bentoml

svc = bentoml.legacy.Service("my-service", runners=[runner])
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Service name. Will be lowercased and validated as a BentoML Tag. |
| runners | list[AbstractRunner] or None | No | List of Runner instances. Names must be unique. |
| models | list[Model] or None | No | List of BentoML Model instances. |
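The lowercasing and Tag validation applied to `name` can be sketched as follows. The regex is an assumption modeled on Docker-style tag names, not BentoML's exact rule, and `normalize_service_name` is an invented helper:

```python
import re

# assumed pattern (hypothetical, not BentoML's actual Tag grammar):
# lowercase alphanumerics, optionally separated by ".", "_", or "-"
_NAME_RE = re.compile(r"^[a-z0-9]([._-]?[a-z0-9])*$")


def normalize_service_name(name: str) -> str:
    lowered = name.lower()  # names are lowercased before validation
    if not _NAME_RE.match(lowered):
        raise ValueError(f"invalid service name: {name!r}")
    return lowered
```

Under this assumption, `Service("Iris-Classifier", ...)` would end up named `iris-classifier`.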
Outputs
| Name | Type | Description |
|---|---|---|
| Service | Service | A configured Service instance with API endpoints accessible via the apis dictionary. |
Usage Examples
```python
from __future__ import annotations

from typing import TYPE_CHECKING, Any

import bentoml
from bentoml.io import NumpyNdarray

if TYPE_CHECKING:
    from numpy.typing import NDArray

runner = bentoml.sklearn.get("iris_clf:latest").to_runner()
svc = bentoml.legacy.Service("iris-classifier", runners=[runner])


@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_arr: NDArray[Any]) -> NDArray[Any]:
    return runner.run(input_arr)


# Lifecycle hooks receive the service context
@svc.on_startup
async def startup(ctx):
    print("Service starting up")


@svc.on_shutdown
async def shutdown(ctx):
    print("Service shutting down")


# Mount a custom ASGI app under /custom
from starlette.applications import Starlette
from starlette.responses import PlainTextResponse
from starlette.routing import Route


async def health(request):
    return PlainTextResponse("OK")


app = Starlette(routes=[Route("/health", health)])
svc.mount_asgi_app(app, path="/custom")
```
Related Pages
- Heuristic:Bentoml_BentoML_Warning_Deprecated_Legacy_Service
- Implementation:Bentoml_BentoML_Service_Loader
- Implementation:Bentoml_BentoML_Runner_Class
- Implementation:Bentoml_BentoML_IO_Descriptor_JSON
- Implementation:Bentoml_BentoML_IO_Descriptor_NumpyNdarray
- Implementation:Bentoml_BentoML_IO_Descriptor_Image