Implementation: BentoML Runnable Base
| Knowledge Sources | |
|---|---|
| Domains | Runner, ML Inference, Core Framework |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
Defines the Runnable base class and supporting types for declaring computation units (runners) in BentoML, including method configuration for batching and streaming.
Description
The Runnable class is the abstract base that all BentoML runner implementations must extend. It provides a static method, `method`, usable as a decorator, for registering methods on the runnable that can be invoked remotely. Each registered method is wrapped in a RunnableMethod descriptor, which stores the callable together with its RunnableMethodConfig (batchable, batch_dim, input_spec, output_spec, is_stream). The class overrides __setattr__ to prevent SUPPORTED_RESOURCES and SUPPORTS_CPU_MULTI_THREADING from being modified at runtime, and overrides __getattribute__ to block instance-level access to add_method and method. RunnableMethod uses the descriptor protocol (__get__, __set_name__) to bind methods to the runnable class and register them in the bentoml_runnable_methods__ dictionary; it handles both synchronous and asynchronous callables transparently.
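The descriptor-based registration described above can be sketched in plain Python. This is a simplified, hypothetical reimplementation for illustration (the names MiniRunnable, Method, MethodConfig, and registered_methods__ are stand-ins, not the BentoML source): __set_name__ fires at class-creation time and records each decorated method in a per-class registry, while __get__ binds the wrapped function like a normal method.

```python
from __future__ import annotations

import typing as t
from dataclasses import dataclass


@dataclass
class MethodConfig:
    # Simplified stand-in for RunnableMethodConfig.
    batchable: bool = False
    batch_dim: tuple[int, int] | int = 0


@dataclass
class Method:
    # Simplified stand-in for RunnableMethod: a descriptor that
    # registers itself on the owning class via __set_name__.
    func: t.Callable[..., t.Any]
    config: MethodConfig

    def __set_name__(self, owner: type, name: str) -> None:
        # Create the registry lazily on the owner itself, so subclasses
        # do not mutate a dict inherited from the base class.
        if owner.__dict__.get("registered_methods__") is None:
            owner.registered_methods__ = {}
        owner.registered_methods__[name] = self

    def __get__(self, obj: t.Any, objtype: type | None = None):
        if obj is None:
            return self
        # Plain functions are descriptors too: delegating to
        # func.__get__ yields an ordinary bound method.
        return self.func.__get__(obj, objtype)


class MiniRunnable:
    registered_methods__: dict[str, Method] | None = None

    @staticmethod
    def method(*, batchable: bool = False, batch_dim: int = 0):
        def decorator(func):
            return Method(func, MethodConfig(batchable, batch_dim))
        return decorator


class Doubler(MiniRunnable):
    @MiniRunnable.method(batchable=True)
    def double(self, x: int) -> int:
        return x * 2


# Registration happened at class creation, before any instance exists.
assert "double" in Doubler.registered_methods__
assert Doubler().double(21) == 42
```

The lazy `owner.__dict__.get(...)` check mirrors the reason bentoml_runnable_methods__ defaults to None on the base class: each subclass gets its own registry rather than sharing one dict.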
Usage
Use this class to define custom BentoML runners. Subclass Runnable, declare SUPPORTED_RESOURCES and SUPPORTS_CPU_MULTI_THREADING, then use the @Runnable.method decorator to register inference methods.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/runner/runnable.py
- Lines: 1-172
Signature
class Runnable:
    SUPPORTED_RESOURCES: tuple[str, ...]
    SUPPORTS_CPU_MULTI_THREADING: bool
    bentoml_runnable_methods__: dict[str, RunnableMethod[t.Any, t.Any, t.Any]] | None = None

    @classmethod
    def add_method(
        cls: t.Type[T],
        method: t.Callable[t.Concatenate[T, P], t.Any],
        name: str,
        *,
        batchable: bool = False,
        batch_dim: tuple[int, int] | int = 0,
        input_spec: LazyType[t.Any] | t.Tuple[LazyType[t.Any], ...] | None = None,
        output_spec: LazyType[t.Any] | None = None,
    ): ...

    @staticmethod
    def method(
        meth: t.Callable[t.Concatenate[T, P], t.Any] | None = None,
        *,
        batchable: bool = False,
        batch_dim: tuple[int, int] | int = 0,
        input_spec: AnyType | tuple[AnyType, ...] | None = None,
        output_spec: AnyType | None = None,
    ) -> RunnableMethod | t.Callable[..., RunnableMethod]: ...

@attr.define
class RunnableMethod(t.Generic[T, P, R]):
    func: t.Callable[t.Concatenate[T, P], R]
    config: RunnableMethodConfig

@attr.define
class RunnableMethodConfig:
    batchable: bool
    batch_dim: tuple[int, int]
    input_spec: AnyType | tuple[AnyType, ...] | None = None
    output_spec: AnyType | None = None
    is_stream: bool = False
Import
from bentoml._internal.runner.runnable import Runnable, RunnableMethod, RunnableMethodConfig
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| meth | Callable or None | No | The method to register; if None, returns a decorator |
| batchable | bool | No | Whether this method supports adaptive batching; defaults to False |
| batch_dim | tuple[int, int] or int | No | Input and output batch dimensions; defaults to 0 |
| input_spec | AnyType or tuple or None | No | Expected input type specification |
| output_spec | AnyType or None | No | Expected output type specification |
Outputs
| Name | Type | Description |
|---|---|---|
| RunnableMethod | RunnableMethod[T, P, R] | A descriptor wrapping the method with its configuration |
Usage Examples
import bentoml
import numpy as np
from bentoml._internal.runner.runnable import Runnable

class MyModelRunnable(Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        self.model = ...  # load model

    @Runnable.method(batchable=True, batch_dim=0)
    def predict(self, input_data: np.ndarray) -> np.ndarray:
        return self.model.predict(input_data)

    @Runnable.method(batchable=False)
    def explain(self, input_data: np.ndarray) -> dict:
        return self.model.explain(input_data)
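The Description notes that registered callables may be synchronous or asynchronous and are handled transparently. A front end that dispatches both kinds uniformly can be sketched like this (call_any is an illustrative pattern, not BentoML's scheduler): coroutine functions are awaited directly, while blocking functions are pushed onto a worker thread so they do not stall the event loop.

```python
import asyncio
import inspect
import typing as t


async def call_any(func: t.Callable[..., t.Any], *args: t.Any) -> t.Any:
    """Invoke a sync or async callable uniformly (illustrative helper)."""
    if inspect.iscoroutinefunction(func):
        # Async callables are awaited on the current event loop.
        return await func(*args)
    # Sync callables run on a worker thread to avoid blocking the loop.
    return await asyncio.to_thread(func, *args)


def sync_square(x: int) -> int:
    return x * x


async def async_square(x: int) -> int:
    await asyncio.sleep(0)
    return x * x


async def main() -> None:
    assert await call_any(sync_square, 3) == 9
    assert await call_any(async_square, 4) == 16


asyncio.run(main())
```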