Implementation: BentoML Runner Container
| Knowledge Sources | |
|---|---|
| Domains | Runner, Data Serialization, Batching |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
The Runner Container module provides the data container abstraction layer for serializing, deserializing, batching, and splitting runner data payloads across different data types including NumPy arrays, Pandas DataFrames, PIL images, and Triton inference inputs.
Description
This module implements a type-driven container system for BentoML's runner infrastructure. The core abstractions and implementations are:
Payload (NamedTuple): The fundamental transport unit containing:
- data: Serialized bytes.
- meta: Metadata dictionary with format info, buffer indices, and pickle state.
- container: Name of the container class that produced the payload.
- batch_size: Number of items in the batch (defaults to -1).
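The Payload shape can be reproduced as a standalone NamedTuple, matching the signature shown later in this page. This is a sketch for illustration and does not require an installed BentoML:

```python
import typing as t

class Payload(t.NamedTuple):
    """Transport unit passed between the API server and runner workers."""
    data: bytes                                               # serialized batch content
    meta: "dict[str, bool | int | float | str | list[int]]"   # format flags, buffer indices, pickle state
    container: str                                            # name of the producing DataContainer class
    batch_size: int = -1                                      # -1 means size unknown / not a batch

# Constructing a payload by hand, as a container implementation would:
p = Payload(data=b"\x00\x01", meta={"format": "pickle5"}, container="NdarrayContainer")
print(p.container, p.batch_size)
```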
DataContainer[SingleType, BatchType] (Generic ABC): The abstract base class defining the container interface:
- to_payload/from_payload: Serialize/deserialize individual batches.
- batches_to_batch/batch_to_batches: Merge multiple batches into one (with index tracking) and split a batch back out.
- batch_to_payloads/from_batch_payloads: Convenience methods that compose the above.
- to_triton_payload/to_triton_grpc_payload/to_triton_http_payload: Convert data for Triton Inference Server clients.
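The merge/split contract can be illustrated with plain Python lists, in the spirit of the list-based DefaultContainer. This is a sketch of the index-tracking idea, not BentoML's actual code:

```python
import typing as t

def batches_to_batch(batches: t.Sequence[list]) -> "tuple[list, list[int]]":
    """Merge several batches into one, recording cumulative boundary offsets."""
    indices = [0]
    merged: list = []
    for b in batches:
        merged.extend(b)
        indices.append(indices[-1] + len(b))
    return merged, indices

def batch_to_batches(batch: list, indices: t.Sequence[int]) -> "list[list]":
    """Split a merged batch back out using the recorded boundaries."""
    return [batch[indices[i]:indices[i + 1]] for i in range(len(indices) - 1)]

merged, idx = batches_to_batch([[1, 2], [3], [4, 5, 6]])
print(merged)  # [1, 2, 3, 4, 5, 6]
print(idx)     # [0, 2, 3, 6]
print(batch_to_batches(merged, idx))  # [[1, 2], [3], [4, 5, 6]]
```

The round trip is lossless: the indices returned by the merge are exactly what the split needs to restore the original batch sizes, which is what lets a runner fuse requests for inference and route each caller's slice back to them.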
Concrete container implementations:
- NdarrayContainer: Handles numpy.ndarray. Uses np.concatenate/np.split for batching. Serialization employs PEP 574 pickle with out-of-band buffers for efficiency (with a fallback to standard in-band pickle for payloads smaller than 6140 bytes). Supports Triton gRPC and HTTP payload conversion.
- PandasDataFrameContainer: Handles pandas.DataFrame and pandas.Series. Uses pd.concat for batching and only supports batch_dim=0. Same PEP 574 pickle optimization as NdarrayContainer.
- PILImageContainer: Handles PIL.Image.Image. Serializes images to bytes via Image.save. Batching is not supported (raises NotImplementedError).
- TritonInferInputDataContainer: Pass-through container for Triton InferInput objects. Batching not supported.
- DefaultContainer: Fallback for arbitrary Python objects. Uses standard pickle serialization and list-based batching.
- ParamsContainer: Handles Params[Payload] for multi-parameter runner methods. Delegates to AutoContainer for actual data handling.
- PayloadContainer: Handles raw Payload objects by delegating to the container identified by the payload's container field.
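The PEP 574 optimization mentioned for NdarrayContainer and PandasDataFrameContainer can be sketched with the standard library's pickle protocol 5 and NumPy. This mirrors the out-of-band idea, not BentoML's exact implementation:

```python
import pickle
import numpy as np

arr = np.arange(8, dtype=np.float64).reshape(2, 4)

# Out-of-band serialization: large buffers are handed to buffer_callback
# instead of being copied into the pickle stream (PEP 574, protocol 5).
buffers: "list[pickle.PickleBuffer]" = []
stream = pickle.dumps(arr, protocol=5, buffer_callback=buffers.append)

# The receiver needs both the stream and the raw buffers to reconstruct;
# a container's meta dict can record how many buffers to expect.
restored = pickle.loads(stream, buffers=buffers)
assert np.array_equal(arr, restored)
print(len(buffers))  # one out-of-band buffer for a contiguous array
```

For tiny arrays the extra bookkeeping outweighs the saved copy, which is why the module falls back to ordinary in-band pickling below the size threshold.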
DataContainerRegistry: A class-level registry that maps LazyType references to container classes. Populated at module load time by register_builtin_containers() which registers numpy, pandas, Triton, PIL, Params, and Payload types.
AutoContainer: The main entry point that dynamically dispatches to the appropriate container based on the runtime type of the data, using DataContainerRegistry.find_by_batch_type or find_by_name.
Usage
This module is used internally by BentoML's runner infrastructure to serialize data between the API server and runner workers, and to implement adaptive batching. Users typically do not interact with it directly.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/runner/container.py
- Lines: 1-786
Signature
class Payload(t.NamedTuple):
data: bytes
meta: dict[str, bool | int | float | str | list[int]]
container: str
batch_size: int = -1
class DataContainer(t.Generic[SingleType, BatchType]):
@classmethod
def to_payload(cls, batch: BatchType, batch_dim: int) -> Payload: ...
@classmethod
def from_payload(cls, payload: Payload) -> BatchType: ...
@classmethod
def batches_to_batch(cls, batches: t.Sequence[SingleType], batch_dim: int) -> tuple[BatchType, list[int]]: ...
@classmethod
def batch_to_batches(cls, batch: BatchType, indices: t.Sequence[int], batch_dim: int) -> list[SingleType]: ...
class NdarrayContainer(DataContainer["ext.NpNDArray", "ext.NpNDArray"]): ...
class PandasDataFrameContainer(DataContainer[t.Union["ext.PdDataFrame", "ext.PdSeries"], "ext.PdDataFrame"]): ...
class PILImageContainer(DataContainer["ext.PILImage", "ext.PILImage"]): ...
class DefaultContainer(DataContainer[t.Any, t.List[t.Any]]): ...
class AutoContainer(DataContainer[t.Any, t.Any]): ...
class DataContainerRegistry:
@classmethod
def register_container(cls, single_type, batch_type, container_cls): ...
@classmethod
def find_by_single_type(cls, type_) -> t.Type[DataContainer]: ...
@classmethod
def find_by_batch_type(cls, type_) -> t.Type[DataContainer]: ...
@classmethod
def find_by_name(cls, name: str) -> t.Type[DataContainer]: ...
Import
from bentoml._internal.runner.container import AutoContainer
from bentoml._internal.runner.container import Payload
from bentoml._internal.runner.container import DataContainerRegistry
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| batch | BatchType | Yes | The data to serialize (numpy array, pandas DataFrame, PIL Image, or any Python object). |
| batch_dim | int | Yes | The dimension along which batching occurs (typically 0). |
Outputs
| Name | Type | Description |
|---|---|---|
| Payload | Payload | A NamedTuple containing serialized bytes, metadata, container name, and batch size. |
| (batch, indices) | tuple[BatchType, list[int]] | When merging batches, returns the combined batch and index boundaries. |
Usage Examples
import numpy as np
from bentoml._internal.runner.container import AutoContainer, Payload
# Serialize a numpy array to a Payload
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
payload = AutoContainer.to_payload(arr, batch_dim=0)
# Deserialize back
restored = AutoContainer.from_payload(payload)
# Batch multiple payloads together
payload1 = AutoContainer.to_payload(np.array([[1, 2]]), batch_dim=0)
payload2 = AutoContainer.to_payload(np.array([[3, 4]]), batch_dim=0)
combined, indices = AutoContainer.from_batch_payloads([payload1, payload2], batch_dim=0)
# combined: array([[1, 2], [3, 4]])
# indices: [0, 1, 2]
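The indices returned above are cumulative offsets, so splitting the combined batch back into its original pieces reduces to np.split over the interior boundaries. A NumPy-only sketch, independent of BentoML:

```python
import numpy as np

combined = np.array([[1, 2], [3, 4]])
indices = [0, 1, 2]   # boundaries recorded when the two payloads were merged

# Drop the leading 0 and the trailing total length; the interior boundaries
# drive np.split along the batch dimension, as in batch_to_batches.
pieces = np.split(combined, indices[1:-1], axis=0)
print(pieces[0])  # [[1 2]]
print(pieces[1])  # [[3 4]]
```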