Implementation: BentoML Runner Container

From Leeroopedia
Domains Runner, Data Serialization, Batching
Last Updated 2026-02-13 15:00 GMT

Overview

The Runner Container module provides the data container abstraction layer for serializing, deserializing, batching, and splitting runner data payloads across different data types including NumPy arrays, Pandas DataFrames, PIL images, and Triton inference inputs.

Description

This module implements a type-driven container system for BentoML's runner infrastructure. The core abstractions and implementations are:

Payload (NamedTuple): The fundamental transport unit containing:

  • data: Serialized bytes.
  • meta: Metadata dictionary with format info, buffer indices, and pickle state.
  • container: Name of the container class that produced the payload.
  • batch_size: Number of items in the batch (defaults to -1).
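
The fields above can be sketched with a minimal stand-in (an illustrative NamedTuple for exposition, not BentoML's actual class):

```python
import pickle
import typing as t

# Illustrative stand-in for BentoML's Payload NamedTuple; the field
# names mirror the description above, but this is not the library class.
class Payload(t.NamedTuple):
    data: bytes           # serialized batch bytes
    meta: dict            # format info, buffer indices, pickle state
    container: str        # name of the container class that produced it
    batch_size: int = -1  # number of items in the batch

batch = [1.0, 2.0, 3.0]
p = Payload(
    data=pickle.dumps(batch),
    meta={"format": "pickle"},
    container="DefaultContainer",
    batch_size=len(batch),
)
restored = pickle.loads(p.data)  # round-trips back to the original batch
```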

DataContainer[SingleType, BatchType] (Generic ABC): The abstract base class defining the container interface:

  • to_payload / from_payload: Serialize/deserialize individual batches.
  • batches_to_batch / batch_to_batches: Merge multiple batches into one (with index tracking) and split a batch back out.
  • batch_to_payloads / from_batch_payloads: Convenience methods that compose the above.
  • to_triton_payload / to_triton_grpc_payload / to_triton_http_payload: Convert data for Triton Inference Server clients.
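
A toy container for plain Python lists sketches how these methods compose; the class below is hypothetical and only mirrors the interface shape, including the cumulative index boundaries that batches_to_batch tracks:

```python
import pickle
import typing as t

# Hypothetical container for plain Python lists, mirroring the
# DataContainer interface described above (not BentoML's actual class).
class ListContainer:
    @classmethod
    def to_payload(cls, batch: list, batch_dim: int) -> bytes:
        return pickle.dumps(batch)

    @classmethod
    def from_payload(cls, payload: bytes) -> list:
        return pickle.loads(payload)

    @classmethod
    def batches_to_batch(
        cls, batches: t.Sequence[list], batch_dim: int
    ) -> tuple[list, list[int]]:
        merged: list = []
        indices = [0]
        for b in batches:
            merged.extend(b)
            indices.append(len(merged))  # cumulative boundary after each batch
        return merged, indices

    @classmethod
    def batch_to_batches(
        cls, batch: list, indices: t.Sequence[int], batch_dim: int
    ) -> list[list]:
        # Slice the merged batch back apart at the recorded boundaries.
        return [batch[i:j] for i, j in zip(indices, indices[1:])]

merged, idx = ListContainer.batches_to_batch([[1, 2], [3]], batch_dim=0)
parts = ListContainer.batch_to_batches(merged, idx, batch_dim=0)
```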

Concrete container implementations:

  • NdarrayContainer: Handles numpy.ndarray. Uses np.concatenate/np.split for batching. Serialization employs PEP 574 pickle with out-of-band buffers for efficiency, falling back to standard pickle for small payloads (6140 bytes or less). Supports Triton gRPC and HTTP payload conversion.
  • PandasDataFrameContainer: Handles pandas.DataFrame and pandas.Series. Uses pd.concat for batching and supports only batch_dim=0. Applies the same PEP 574 pickle optimization as NdarrayContainer.
  • PILImageContainer: Handles PIL.Image.Image. Serializes images to bytes via Image.save. Batching is not supported (raises NotImplementedError).
  • TritonInferInputDataContainer: Pass-through container for Triton InferInput objects. Batching not supported.
  • DefaultContainer: Fallback for arbitrary Python objects. Uses standard pickle serialization and list-based batching.
  • ParamsContainer: Handles Params[Payload] for multi-parameter runner methods. Delegates to AutoContainer for actual data handling.
  • PayloadContainer: Handles raw Payload objects by delegating to the container identified by the payload's container field.
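
The PEP 574 out-of-band mechanism used by NdarrayContainer and PandasDataFrameContainer can be sketched with the standard-library pickle module alone, substituting a bytearray for an array's underlying buffer:

```python
import pickle

# Out-of-band pickling (PEP 574, pickle protocol 5): large buffers are
# handed to buffer_callback instead of being copied into the pickle
# stream, so the data can travel zero-copy alongside the small stream.
data = bytearray(b"raw tensor bytes " * 4)
buffers: list[pickle.PickleBuffer] = []
stream = pickle.dumps(
    pickle.PickleBuffer(data), protocol=5, buffer_callback=buffers.append
)

# At load time the raw buffers are supplied back separately.
restored = pickle.loads(stream, buffers=[b.raw() for b in buffers])
```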

DataContainerRegistry: A class-level registry that maps LazyType references to container classes. Populated at module load time by register_builtin_containers() which registers numpy, pandas, Triton, PIL, Params, and Payload types.

AutoContainer: The main entry point that dynamically dispatches to the appropriate container based on the runtime type of the data, using DataContainerRegistry.find_by_batch_type or find_by_name.
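
The dispatch pattern can be illustrated with a simplified registry; the names below are hypothetical, and the real implementation keys on LazyType references rather than plain types:

```python
# Simplified type-to-container registry illustrating the dispatch
# pattern described above (hypothetical names, not BentoML's API).
_CONTAINERS_BY_TYPE: dict[type, type] = {}
_CONTAINERS_BY_NAME: dict[str, type] = {}

def register_container(batch_type: type, container_cls: type) -> None:
    _CONTAINERS_BY_TYPE[batch_type] = container_cls
    _CONTAINERS_BY_NAME[container_cls.__name__] = container_cls

def find_by_batch_type(batch_type: type) -> type:
    # Prefer a specific registration; fall back to the generic one,
    # mirroring DefaultContainer's role for arbitrary Python objects.
    for registered, cls in _CONTAINERS_BY_TYPE.items():
        if registered is not object and issubclass(batch_type, registered):
            return cls
    return _CONTAINERS_BY_TYPE[object]

def find_by_name(name: str) -> type:
    return _CONTAINERS_BY_NAME[name]

class DefaultContainer: ...
class ListContainer: ...

register_container(object, DefaultContainer)
register_container(list, ListContainer)
```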

Usage

This module is used internally by BentoML's runner infrastructure to serialize data between the API server and runner workers, and to implement adaptive batching. Users typically do not interact with it directly.

Code Reference

Source Location

Signature

class Payload(t.NamedTuple):
    data: bytes
    meta: dict[str, bool | int | float | str | list[int]]
    container: str
    batch_size: int = -1

class DataContainer(t.Generic[SingleType, BatchType]):
    @classmethod
    def to_payload(cls, batch: BatchType, batch_dim: int) -> Payload: ...
    @classmethod
    def from_payload(cls, payload: Payload) -> BatchType: ...
    @classmethod
    def batches_to_batch(cls, batches: t.Sequence[SingleType], batch_dim: int) -> tuple[BatchType, list[int]]: ...
    @classmethod
    def batch_to_batches(cls, batch: BatchType, indices: t.Sequence[int], batch_dim: int) -> list[SingleType]: ...

class NdarrayContainer(DataContainer["ext.NpNDArray", "ext.NpNDArray"]): ...
class PandasDataFrameContainer(DataContainer[t.Union["ext.PdDataFrame", "ext.PdSeries"], "ext.PdDataFrame"]): ...
class PILImageContainer(DataContainer["ext.PILImage", "ext.PILImage"]): ...
class DefaultContainer(DataContainer[t.Any, t.List[t.Any]]): ...
class AutoContainer(DataContainer[t.Any, t.Any]): ...

class DataContainerRegistry:
    @classmethod
    def register_container(cls, single_type, batch_type, container_cls): ...
    @classmethod
    def find_by_single_type(cls, type_) -> t.Type[DataContainer]: ...
    @classmethod
    def find_by_batch_type(cls, type_) -> t.Type[DataContainer]: ...
    @classmethod
    def find_by_name(cls, name: str) -> t.Type[DataContainer]: ...

Import

from bentoml._internal.runner.container import AutoContainer
from bentoml._internal.runner.container import Payload
from bentoml._internal.runner.container import DataContainerRegistry

I/O Contract

Inputs

Name Type Required Description
batch BatchType Yes The data to serialize (numpy array, pandas DataFrame, PIL Image, or any Python object).
batch_dim int Yes The dimension along which batching occurs (typically 0).

Outputs

Name Type Description
Payload Payload A NamedTuple containing serialized bytes, metadata, container name, and batch size.
(batch, indices) tuple[BatchType, list[int]] When merging batches, returns the combined batch and index boundaries.

Usage Examples

import numpy as np
from bentoml._internal.runner.container import AutoContainer, Payload

# Serialize a numpy array to a Payload
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
payload = AutoContainer.to_payload(arr, batch_dim=0)

# Deserialize back
restored = AutoContainer.from_payload(payload)

# Batch multiple payloads together
payload1 = AutoContainer.to_payload(np.array([[1, 2]]), batch_dim=0)
payload2 = AutoContainer.to_payload(np.array([[3, 4]]), batch_dim=0)
combined, indices = AutoContainer.from_batch_payloads([payload1, payload2], batch_dim=0)
# combined: array([[1, 2], [3, 4]])
# indices: [0, 1, 2]
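
The returned index list gives cumulative batch boundaries, so the interior boundaries recover the original batches via np.split (a sketch assuming only NumPy, mirroring how NdarrayContainer splits along the batch dimension):

```python
import numpy as np

# Recovering the original batches from a combined batch and its index
# boundaries: np.split takes the interior boundaries (indices[1:-1]).
combined = np.array([[1, 2], [3, 4]])
indices = [0, 1, 2]
parts = np.split(combined, indices[1:-1], axis=0)
# parts[0] → array([[1, 2]]), parts[1] → array([[3, 4]])
```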
