Heuristic:Bentoml BentoML Warning Deprecated Runner Class
| Knowledge Sources | |
|---|---|
| Domains | Deprecation, Migration |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Deprecation warning: the Runner class is deprecated in favor of new-style services using @bentoml.service() with direct model loading.
Description
The Runner class in src/bentoml/_internal/runner/runner.py is decorated with @deprecated(suggestion="Please upgrade to new style services."). Runners were the legacy BentoML 1.x mechanism for wrapping model inference into remotely callable units with automatic batching and resource management. New-style services, introduced in BentoML 1.2, build these capabilities directly into the service architecture.
Usage
Be aware of this deprecation when encountering bentoml.Runner, runner = bentoml.pytorch.get("model").to_runner(), or runners=[runner] patterns. New code should load models directly in service classes and use bentoml.depends() for composition.
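To locate these patterns in an existing codebase, a small text scan is often enough. The following is a minimal sketch using only the standard library; the function name find_legacy_runner_usage and the pattern labels are hypothetical, and the regular expressions cover only the three patterns named above, so aliased imports or unusual formatting may be missed.

```python
import re

# Patterns for the legacy Runner idioms named above (illustrative, not exhaustive).
LEGACY_PATTERNS = {
    "Runner class reference": re.compile(r"\bbentoml\.Runner\b"),
    "to_runner() call": re.compile(r"\.to_runner\s*\("),
    "runners= argument": re.compile(r"\brunners\s*=\s*\["),
}


def find_legacy_runner_usage(source: str) -> list[tuple[int, str]]:
    """Return (line number, pattern label) pairs for each legacy idiom found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

Passing the text of a service file to find_legacy_runner_usage() yields the lines that are candidates for migration.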
The Insight (Rule of Thumb)
- Action: Replace runner = model.to_runner() with direct model loading inside @bentoml.service() classes.
- Value: New-style services provide the same batching capabilities (via @bentoml.api(batchable=True)) without the Runner abstraction layer.
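- Trade-off: Each Runner ran in its own worker process with inter-process communication handled automatically. In a new-style service the model runs inside the service's own process by default; to get the same distributed execution, split the model into a separate service and wire it in with bentoml.depends(), which still requires less boilerplate than Runners.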
Reasoning
The Runner abstraction added complexity by requiring users to create Runner objects, register them with services, and use runner.async_run() for inference. The new architecture simplifies this by allowing direct model usage within service methods, while bentoml.depends() provides the same distributed execution capabilities when needed.