Heuristic:Bentoml BentoML Warning Deprecated Runner Class

Knowledge Sources	BentoML
Domains	Deprecation, Migration
Last Updated	2026-02-13 00:00 GMT

Overview

Deprecation warning: the Runner class is deprecated in favor of new-style services using @bentoml.service() with direct model loading.

Description

The Runner class in src/bentoml/_internal/runner/runner.py is decorated with @deprecated(suggestion="Please upgrade to new style services."). Runners were the mechanism in BentoML 1.0 and 1.1 for wrapping model inference into remotely callable units with automatic batching and resource management. In the new-style services introduced in BentoML 1.2, these capabilities are built directly into the service architecture.
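
To make the mechanism concrete, here is a minimal sketch of how a class-level deprecation decorator with a suggestion argument could behave. It is a hypothetical stand-in built on the standard warnings module, not BentoML's actual deprecated implementation. The warning category, the message wording, and the LegacyRunner class are assumptions for illustration.

```python
import functools
import warnings


def deprecated(suggestion: str = ""):
    """Hypothetical stand-in for a deprecation decorator (not BentoML's code)."""

    def decorator(cls):
        original_init = cls.__init__

        @functools.wraps(original_init)
        def wrapped_init(self, *args, **kwargs):
            # Warn at instantiation time and attribute the warning to the caller.
            warnings.warn(
                f"`{cls.__name__}` is deprecated. {suggestion}".strip(),
                DeprecationWarning,  # assumed category; the real one may differ
                stacklevel=2,
            )
            original_init(self, *args, **kwargs)

        cls.__init__ = wrapped_init
        return cls

    return decorator


@deprecated(suggestion="Please upgrade to new style services.")
class LegacyRunner:
    """Illustrative class standing in for the deprecated Runner."""

    def __init__(self, name: str) -> None:
        self.name = name


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # ensure the warning is not filtered out
    runner = LegacyRunner("iris_clf")

print(caught[0].message)
```

In a test suite, a filter such as warnings.simplefilter("error", DeprecationWarning) would turn this kind of warning into an exception, which can help surface remaining legacy usages early.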

Usage

Be aware of this deprecation when encountering bentoml.Runner, runner = bentoml.pytorch.get("model").to_runner(), or runners=[runner] patterns. New code should load models directly in service classes and use bentoml.depends() for composition.
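
One way to locate these patterns in an existing codebase is a small text scan. The sketch below uses only the standard library. The regular expressions and the find_runner_usages helper are illustrative assumptions, not a BentoML tool, and a plain text match can produce false positives (for example inside comments or strings).

```python
import re

# Illustrative patterns for the legacy Runner idioms named above.
RUNNER_PATTERNS = {
    "bentoml.Runner": re.compile(r"\bbentoml\.Runner\b"),
    "to_runner()": re.compile(r"\.to_runner\s*\("),
    "runners=[...]": re.compile(r"\brunners\s*=\s*\["),
}


def find_runner_usages(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_label) pairs for each legacy idiom found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in RUNNER_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


# Sample legacy code held as a string, so nothing here imports BentoML.
legacy_source = '''\
import bentoml
runner = bentoml.pytorch.get("model").to_runner()
svc = bentoml.Service("demo", runners=[runner])
'''

for lineno, label in find_runner_usages(legacy_source):
    print(f"line {lineno}: {label}")
```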

The Insight (Rule of Thumb)

Action: Replace runner = model.to_runner() with direct model loading inside @bentoml.service() classes.
Value: New-style services provide the same batching capabilities (via @bentoml.api(batchable=True)) without the Runner abstraction layer.
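Trade-off: Runners ran in their own worker processes automatically. In a new-style service, a model loaded in the service class runs inside that service; to deploy or scale it separately, move it into its own service and reference it with bentoml.depends(), which provides the same distributed capabilities with less boilerplate.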

Reasoning

The Runner abstraction added complexity by requiring users to create Runner objects, register them with services, and use runner.async_run() for inference. The new architecture simplifies this by allowing direct model usage within service methods, while bentoml.depends() provides the same distributed execution capabilities when needed.

Related Pages

Implementation:Bentoml_BentoML_Runner_Class
