
Implementation: BentoML Service Decorator

From Leeroopedia
Last Updated 2026-02-13 15:00 GMT

Overview

Concrete decorator for transforming a plain Python class into a production-grade BentoML service. The @bentoml.service decorator is the primary entry point for defining ML serving endpoints in BentoML, wrapping user classes in a Service[T] instance that manages HTTP routing, resource allocation, and lifecycle.

Description

The @bentoml.service decorator inspects the target class for methods annotated with @bentoml.api, builds a route table, and wraps the class inside a Service[T] proxy. The decorator supports two invocation styles:

  • Bare decorator -- @bentoml.service with no arguments (applied directly to the class).
  • Parameterized decorator -- @bentoml.service(name="...", resources={...}, ...) returning a decorator that is then applied to the class.

Internally, the function checks whether its first positional argument is a class or None. If a class is provided, it wraps it immediately; otherwise, it returns a decorator that receives the class when applied.
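The dispatch logic above can be sketched with a generic dual-mode class decorator. This is an illustrative simplification, not BentoML's actual code: the `api` marker stands in for @bentoml.api, and a real Service[T] proxy is replaced by plain class attributes.

```python
def api(fn):
    """Illustrative marker decorator standing in for @bentoml.api."""
    fn.__is_api__ = True
    return fn


def service(inner=None, /, **config):
    """Sketch of a dual-mode class decorator (bare or parameterized)."""

    def wrap(cls):
        # Build a "route table" from methods marked by @api -- BentoML
        # does something analogous when constructing Service[T].
        cls.__routes__ = {
            name: member
            for name, member in vars(cls).items()
            if getattr(member, "__is_api__", False)
        }
        cls.__config__ = config
        return cls

    if inner is not None:
        # Bare form: @service -- the class arrives as the first argument.
        return wrap(inner)
    # Parameterized form: @service(...) -- return the decorator itself.
    return wrap
```

Both invocation styles then work: `@service` passes the class as `inner`, while `@service(name="x")` leaves `inner` as None and returns `wrap` to be applied next.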

Usage

Import and apply the decorator:

import bentoml

@bentoml.service(
    name="text-classifier",
    resources={"gpu": 1},
    traffic={"timeout": 60},
)
class TextClassifier:
    def __init__(self):
        self.model = load_my_model()

    @bentoml.api
    def classify(self, text: str) -> dict:
        return self.model.predict(text)

Code Reference

Source Location

  • Repository: bentoml/BentoML
  • File: src/_bentoml_sdk/service/factory.py (lines 532--606)

Signature

def service(
    inner: type[T] | None = None,
    /,
    *,
    name: str | None = None,
    image: Image | None = None,
    description: str | None = None,
    path_prefix: str | None = None,
    envs: list[ServiceEnvConfig] | None = None,
    labels: dict[str, str] | None = None,
    cmd: list[str] | None = None,
    service_class: type[Service[T]] = Service,
    **kwargs: Unpack[Config],
) -> Any

Import

import bentoml

# Used as:
@bentoml.service
class MyService: ...

# Or with parameters:
@bentoml.service(name="my-svc", resources={"gpu": 1})
class MyService: ...

I/O Contract

Inputs

Input Contract
  • inner (type[T] | None) -- The Python class to wrap. Provided implicitly when the decorator is used without parentheses; None when used with keyword arguments.
  • name (str | None) -- Custom service name. Defaults to the class name converted to a hyphenated slug.
  • image (Image | None) -- Docker image configuration (base image, Python packages, system packages, etc.).
  • description (str | None) -- Human-readable description of the service.
  • path_prefix (str | None) -- URL path prefix for all routes in this service (e.g., /v1).
  • envs (list[ServiceEnvConfig] | None) -- Environment variables to set in the container at runtime.
  • labels (dict[str, str] | None) -- Key-value labels for metadata (used in BentoCloud and container registries).
  • cmd (list[str] | None) -- Custom command for the container entry point.
  • service_class (type[Service[T]]) -- The Service wrapper class. Defaults to Service.
  • **kwargs (Unpack[Config]) -- Additional configuration conforming to the ServiceConfig TypedDict, including resources (gpu, memory, cpu), traffic (timeout, concurrency, max_batch_size), and workers (count per service).
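The shape of the **kwargs configuration can be sketched as a TypedDict. The field names below mirror the table above, but this is an assumed, simplified shape; BentoML's real ServiceConfig defines many more fields.

```python
from typing import TypedDict


class TrafficDict(TypedDict, total=False):
    # Illustrative subset of traffic options mentioned above.
    timeout: int
    concurrency: int
    max_batch_size: int


class ResourcesDict(TypedDict, total=False):
    # Illustrative subset of resource options mentioned above.
    cpu: str
    memory: str
    gpu: int


class Config(TypedDict, total=False):
    resources: ResourcesDict
    traffic: TrafficDict
    workers: int


# A config literal matching the decorator calls shown in this page:
cfg: Config = {"resources": {"gpu": 1}, "traffic": {"timeout": 60}}
```

Because the signature uses `**kwargs: Unpack[Config]`, type checkers can validate these keyword arguments at the decorator call site without requiring a separate config object.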

Outputs

Output Contract
  • Return value (Service[T]) -- A Service[T] instance wrapping the original class. This object holds the route table and resource configuration, and acts as the entry point for both local development and production serving.

Usage Examples

Example 1: Minimal Service

A bare-bones service with no extra configuration.

import bentoml

@bentoml.service
class HealthCheck:
    @bentoml.api
    def ping(self) -> str:
        return "pong"
  • The decorator is applied without parentheses, so inner receives the class directly.
  • The service name defaults to "health-check".
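The default slug derivation can be sketched as a CamelCase-to-hyphen conversion. This is a hypothetical illustration consistent with the "health-check" example above; BentoML's exact rules (e.g., for acronyms) may differ.

```python
import re


def to_slug(class_name: str) -> str:
    # Insert a hyphen before each uppercase letter that is not at the
    # start of the name, then lowercase the result.
    return re.sub(r"(?<!^)(?=[A-Z])", "-", class_name).lower()


print(to_slug("HealthCheck"))      # health-check
print(to_slug("TextClassifier"))   # text-classifier
```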

Example 2: GPU Service with Image Config

A service with explicit resource requirements and a custom Docker image.

import bentoml
from bentoml.models import HuggingFaceModel

@bentoml.service(
    name="llm-service",
    image=bentoml.images.PythonImage(
        python_version="3.11",
        requirements_file="requirements.txt",
    ),
    resources={"gpu": 1, "memory": "16Gi"},
    traffic={"timeout": 300},
)
class LLMService:
    model_ref = HuggingFaceModel("meta-llama/Llama-2-7b-hf")

    def __init__(self):
        import transformers
        self.pipeline = transformers.pipeline(
            "text-generation", model=self.model_ref.resolve()
        )

    @bentoml.api
    def generate(self, prompt: str) -> str:
        return self.pipeline(prompt)[0]["generated_text"]
  • resources and traffic are passed through **kwargs and parsed as ServiceConfig.
  • image specifies the Docker build configuration used during bentoml build.

Related Pages

  • Principle
  • Implementation
  • Heuristic
  • Environment