# Implementation: BentoML Service Decorator
| Knowledge Sources | |
|---|---|
| Domains | |
| Last Updated | 2026-02-13 15:00 GMT |
## Overview
Concrete decorator for transforming a plain Python class into a production-grade BentoML service. The @bentoml.service decorator is the primary entry point for defining ML serving endpoints in BentoML, wrapping user classes in a Service[T] instance that manages HTTP routing, resource allocation, and lifecycle.
## Description

The `@bentoml.service` decorator inspects the target class for methods annotated with `@bentoml.api`, builds a route table, and wraps the class inside a `Service[T]` proxy. The decorator supports two invocation styles:

- Bare decorator -- `@bentoml.service` with no arguments, applied directly to the class.
- Parameterized decorator -- `@bentoml.service(name="...", resources={...}, ...)`, which returns a decorator that is then applied to the class.

Internally, the function checks whether its first positional argument is a class or `None`. If a class is provided, it wraps it immediately; otherwise, it returns a decorator that wraps the class when later applied.
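The dual-mode dispatch described above can be sketched in plain Python. This is an illustrative simplification, not BentoML's actual implementation: the real decorator returns a `Service[T]` proxy, while this sketch merely attaches the configuration to the class.

```python
def service(inner=None, /, **config):
    # Sketch of the bare-vs-parameterized decorator pattern.
    def wrap(cls):
        # Stand-in for building the Service[T] wrapper.
        cls.__service_config__ = config
        return cls

    if inner is not None:
        # Bare usage: @service applied directly to the class.
        return wrap(inner)
    # Parameterized usage: @service(...) returns the decorator.
    return wrap


@service
class Ping: ...

@service(name="echo-svc")
class Echo: ...
```

The `inner=None, /` positional-only parameter is what lets the same function serve both call styles without ambiguity between the class argument and the keyword configuration.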
## Usage

Import and apply the decorator:

```python
import bentoml

@bentoml.service(
    name="text-classifier",
    resources={"gpu": 1},
    traffic={"timeout": 60},
)
class TextClassifier:
    def __init__(self):
        self.model = load_my_model()

    @bentoml.api
    def classify(self, text: str) -> dict:
        return self.model.predict(text)
```
## Code Reference

### Source Location

- Repository: `bentoml/BentoML`
- File: `src/_bentoml_sdk/service/factory.py` (lines 532--606)
### Signature

```python
def service(
    inner: type[T] | None = None,
    /,
    *,
    name: str | None = None,
    image: Image | None = None,
    description: str | None = None,
    path_prefix: str | None = None,
    envs: list[ServiceEnvConfig] | None = None,
    labels: dict[str, str] | None = None,
    cmd: list[str] | None = None,
    service_class: type[Service[T]] = Service,
    **kwargs: Unpack[Config],
) -> Any
```
### Import

```python
import bentoml

# Used as:
@bentoml.service
class MyService: ...

# Or with parameters:
@bentoml.service(name="my-svc", resources={"gpu": 1})
class MyService: ...
```
## I/O Contract

### Inputs

| Name | Type | Description |
|---|---|---|
| `inner` | `type[T] \| None` | The Python class to wrap. Provided implicitly when the decorator is used without parentheses; `None` when used with keyword arguments. |
| `name` | `str \| None` | Custom service name. Defaults to the class name converted to a hyphenated slug. |
| `image` | `Image \| None` | Docker image configuration (base image, Python packages, system packages, etc.). |
| `description` | `str \| None` | Human-readable description of the service. |
| `path_prefix` | `str \| None` | URL path prefix for all routes in this service (e.g., `/v1`). |
| `envs` | `list[ServiceEnvConfig] \| None` | Environment variables to set in the container at runtime. |
| `labels` | `dict[str, str] \| None` | Key-value labels for metadata (used in BentoCloud and container registries). |
| `cmd` | `list[str] \| None` | Custom command for the container entry point. |
| `service_class` | `type[Service[T]]` | The `Service` wrapper class. Defaults to `Service`. |
| `**kwargs` | `Unpack[Config]` | Additional configuration conforming to the `ServiceConfig` TypedDict, including `resources` (gpu, memory, cpu), `traffic` (timeout, concurrency, max_batch_size), and `workers` (count per service). |
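The shape of the `**kwargs` configuration can be approximated with TypedDicts. The field names below follow the descriptions in this document; they are an illustrative subset, not BentoML's full `ServiceConfig` definition.

```python
from __future__ import annotations

from typing import TypedDict


class TrafficConfig(TypedDict, total=False):
    # Illustrative traffic options named in the table above.
    timeout: int
    concurrency: int
    max_batch_size: int


class Config(TypedDict, total=False):
    # Illustrative subset of the documented ServiceConfig fields.
    resources: dict[str, int | str]
    traffic: TrafficConfig
    workers: int


cfg: Config = {
    "resources": {"gpu": 1, "memory": "16Gi"},
    "traffic": {"timeout": 300},
}
```

Because the TypedDicts use `total=False`, every field is optional, matching how the decorator accepts any subset of the configuration keys.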
### Outputs

| Name | Type | Description |
|---|---|---|
| Return value | `Service[T]` | A `Service[T]` instance wrapping the original class. This object holds the route table and resource configuration, and acts as the entry point for both local development and production serving. |
## Usage Examples

### Example 1: Minimal Service

A bare-bones service with no extra configuration.

```python
import bentoml

@bentoml.service
class HealthCheck:
    @bentoml.api
    def ping(self) -> str:
        return "pong"
```

- The decorator is applied without parentheses, so `inner` receives the class directly.
- The service name defaults to `"health-check"`.
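The default-name behavior can be illustrated with a small helper. `to_slug` is hypothetical (BentoML's internal slugging logic may differ), but it reproduces the documented default of converting a CamelCase class name to a hyphenated lowercase service name.

```python
import re


def to_slug(class_name: str) -> str:
    # Hypothetical helper mirroring the documented default:
    # insert a hyphen before each interior uppercase letter, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", class_name).lower()


print(to_slug("HealthCheck"))     # health-check
print(to_slug("TextClassifier"))  # text-classifier
```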
### Example 2: GPU Service with Image Config

A service with explicit resource requirements and a custom Docker image.

```python
import bentoml
from bentoml.models import HuggingFaceModel

@bentoml.service(
    name="llm-service",
    image=bentoml.images.PythonImage(
        python_version="3.11",
        requirements_file="requirements.txt",
    ),
    resources={"gpu": 1, "memory": "16Gi"},
    traffic={"timeout": 300},
)
class LLMService:
    model_ref = HuggingFaceModel("meta-llama/Llama-2-7b-hf")

    def __init__(self):
        import transformers

        self.pipeline = transformers.pipeline(
            "text-generation", model=self.model_ref.resolve()
        )

    @bentoml.api
    def generate(self, prompt: str) -> str:
        return self.pipeline(prompt)[0]["generated_text"]
```

- `resources` and `traffic` are passed through `**kwargs` and parsed as `ServiceConfig`.
- `image` specifies the Docker build configuration used during `bentoml build`.