
Implementation:Bentoml BentoML Runnable Base

From Leeroopedia
Knowledge Sources
Domains Runner, ML Inference, Core Framework
Last Updated 2026-02-13 15:00 GMT

Overview

Defines the Runnable base class and supporting types for declaring computation units (runners) in BentoML, including method configuration for batching and streaming.

Description

The Runnable class is the abstract base that all BentoML runner implementations must extend. It provides a static method named method (usable as a decorator) for registering methods on the runnable that can be invoked remotely. Each registered method is wrapped in a RunnableMethod descriptor, which stores the callable together with its RunnableMethodConfig (batchable, batch_dim, input_spec, output_spec, is_stream). The class enforces that SUPPORTED_RESOURCES and SUPPORTS_CPU_MULTI_THREADING cannot be modified at runtime (via __setattr__) and prevents access to add_method and method on instances (via __getattribute__). RunnableMethod uses the descriptor protocol (__get__, __set_name__) to bind methods to the runnable class and register them in the bentoml_runnable_methods__ dictionary. Both synchronous and asynchronous callables are supported transparently.
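The descriptor-based registration described above can be illustrated with a simplified, self-contained sketch. This is not BentoML's code; the names (RunnableMethod, runnable_methods__) only mirror the ones documented here:

```python
import typing as t

class RunnableMethod:
    """Wraps a callable plus its config and registers itself on the owning class."""

    def __init__(self, func: t.Callable, batchable: bool = False):
        self.func = func
        self.config = {"batchable": batchable}

    def __set_name__(self, owner: type, name: str) -> None:
        # Called when the owning class body finishes executing: record this
        # method in a per-class registry, mimicking bentoml_runnable_methods__.
        registry = owner.__dict__.get("runnable_methods__")
        if registry is None:
            registry = {}
            setattr(owner, "runnable_methods__", registry)
        registry[name] = self

    def __get__(self, obj, objtype=None):
        # Bind like a normal method so instances can call it directly.
        if obj is None:
            return self
        return self.func.__get__(obj, objtype)

def method(batchable: bool = False):
    """Simplified stand-in for the @Runnable.method decorator."""
    def decorator(func: t.Callable) -> RunnableMethod:
        return RunnableMethod(func, batchable=batchable)
    return decorator

class Echo:
    @method(batchable=True)
    def run(self, x: int) -> int:
        return x + 1
```

With this sketch, Echo().run(1) calls the bound method normally, while Echo.runnable_methods__["run"].config carries the registered configuration, which is the same shape of bookkeeping the real descriptor performs.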

Usage

Use this class to define custom BentoML runners. Subclass Runnable, declare SUPPORTED_RESOURCES and SUPPORTS_CPU_MULTI_THREADING, then use the @Runnable.method decorator to register inference methods.

Code Reference

Source Location

bentoml/_internal/runner/runnable.py

Signature

class Runnable:
    SUPPORTED_RESOURCES: tuple[str, ...]
    SUPPORTS_CPU_MULTI_THREADING: bool
    bentoml_runnable_methods__: dict[str, RunnableMethod[t.Any, t.Any, t.Any]] | None = None

    @classmethod
    def add_method(
        cls: t.Type[T],
        method: t.Callable[t.Concatenate[T, P], t.Any],
        name: str,
        *,
        batchable: bool = False,
        batch_dim: tuple[int, int] | int = 0,
        input_spec: LazyType[t.Any] | t.Tuple[LazyType[t.Any], ...] | None = None,
        output_spec: LazyType[t.Any] | None = None,
    ): ...

    @staticmethod
    def method(
    meth: t.Callable[..., t.Any] | None = None,
        *,
        batchable: bool = False,
        batch_dim: tuple[int, int] | int = 0,
        input_spec: AnyType | tuple[AnyType, ...] | None = None,
        output_spec: AnyType | None = None,
    ) -> RunnableMethod | t.Callable[..., RunnableMethod]: ...

@attr.define
class RunnableMethod(t.Generic[T, P, R]):
    func: t.Callable[t.Concatenate[T, P], R]
    config: RunnableMethodConfig

@attr.define
class RunnableMethodConfig:
    batchable: bool
    batch_dim: tuple[int, int]
    input_spec: AnyType | tuple[AnyType, ...] | None = None
    output_spec: AnyType | None = None
    is_stream: bool = False
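Note that method and add_method accept batch_dim as either an int or a (input, output) tuple, while RunnableMethodConfig stores only a tuple. A plausible normalization (an assumption based on these signatures, not the library's code) is that a single int applies to both input and output:

```python
import typing as t

def normalize_batch_dim(
    batch_dim: t.Union[int, t.Tuple[int, int]]
) -> t.Tuple[int, int]:
    """Expand an int batch_dim shorthand into an (input_dim, output_dim) pair."""
    if isinstance(batch_dim, int):
        return (batch_dim, batch_dim)
    return batch_dim
```

Under this reading, the default batch_dim=0 means batches are stacked along axis 0 of both the input and the output, and (0, 1) would batch inputs on axis 0 but outputs on axis 1.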

Import

from bentoml._internal.runner.runnable import Runnable, RunnableMethod, RunnableMethodConfig

I/O Contract

Inputs

Name Type Required Description
meth Callable or None No The method to register; if None, returns a decorator
batchable bool No Whether this method supports adaptive batching; defaults to False
batch_dim tuple[int, int] or int No Input and output batch dimensions; defaults to 0
input_spec AnyType or tuple or None No Expected input type specification
output_spec AnyType or None No Expected output type specification
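The meth row describes the standard dual-use decorator convention: when the callable is passed directly, it is wrapped immediately; when meth is None, a decorator is returned. A minimal illustrative sketch of that convention (not BentoML's implementation):

```python
import typing as t

def method(meth: t.Optional[t.Callable] = None, *, batchable: bool = False):
    """Usable both bare (@method) and with arguments (@method(batchable=True))."""
    def wrap(func: t.Callable) -> t.Callable:
        # Attach the configuration to the function for later inspection.
        func.__runnable_config__ = {"batchable": batchable}
        return func

    if meth is not None:
        return wrap(meth)  # called as @method with no parentheses
    return wrap            # called as @method(...); returns the decorator

@method
def a(x):
    return x

@method(batchable=True)
def b(x):
    return x
```

Both forms leave the function callable as before; only the attached configuration differs.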

Outputs

Name Type Description
RunnableMethod RunnableMethod[T, P, R] A descriptor wrapping the method with its configuration

Usage Examples

import bentoml
import numpy as np
from bentoml._internal.runner.runnable import Runnable

class MyModelRunnable(Runnable):
    SUPPORTED_RESOURCES = ("nvidia.com/gpu", "cpu")
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        self.model = ...  # load model

    @Runnable.method(batchable=True, batch_dim=0)
    def predict(self, input_data: np.ndarray) -> np.ndarray:
        return self.model.predict(input_data)

    @Runnable.method(batchable=False)
    def explain(self, input_data: np.ndarray) -> dict:
        return self.model.explain(input_data)
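The description notes that registered methods may be synchronous or asynchronous and are handled transparently. One common way to achieve that (a sketch of the general technique, not BentoML's internals) is to call the function and await the result only if it is awaitable:

```python
import asyncio
import inspect
import typing as t

async def invoke(func: t.Callable, *args: t.Any) -> t.Any:
    """Invoke a callable uniformly, whether it is sync or async."""
    result = func(*args)
    if inspect.isawaitable(result):
        result = await result  # async def case: await the coroutine
    return result

def add(x: int, y: int) -> int:
    return x + y

async def add_async(x: int, y: int) -> int:
    return x + y
```

Both asyncio.run(invoke(add, 1, 2)) and asyncio.run(invoke(add_async, 1, 2)) return the same value, so callers never need to know which style the runnable author chose.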
