Implementation:Bentoml BentoML Framework TensorFlow
| Knowledge Sources | |
|---|---|
| Domains | ML Framework, Deep Learning, Model Serialization |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
The bentoml.tensorflow module integrates BentoML with the TensorFlow SavedModel format, so that TensorFlow modules and Keras models can be saved, loaded, and served as SavedModels.
Description
This module implements the BentoML framework adapter for TensorFlow models using the tf.saved_model API. It saves models in the TensorFlow SavedModel format and loads them back as AutoTrackable objects.
Key implementation details:
- save_model(): Saves a tf.Module or Keras model using tf.saved_model.save(). Supports TensorFlow-specific signatures and save options. Automatically infers default method signatures from restorable functions or defaults to __call__.
- load_model(): Loads a SavedModel using tf.saved_model.load() with device placement. Handles GPU memory growth optimization.
- get_runnable(): Creates a TensorflowRunnable that:
  - Auto-selects GPU/CPU devices based on available hardware.
  - Manages a TensorFlow device context via an ExitStack.
  - Generates run methods with lazy initialization and caching.
  - Performs output post-processing: single-output models return NumPy arrays directly; multi-output models return tuples.
  - Handles automatic type casting with deferred error recovery when tf.function signatures are present.
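The runnable behaviour listed for get_runnable() can be illustrated without TensorFlow. The following is a minimal sketch under stated assumptions, not BentoML's code: fake_device is a hypothetical stand-in for tf.device, and SketchRunnable, pick_device, postprocess, and ToyModel are illustrative names that do not appear in the source file.

```python
from contextlib import ExitStack, contextmanager

entered = []  # records which device contexts are currently open


@contextmanager
def fake_device(name):
    """Hypothetical stand-in for tf.device(name)."""
    entered.append(name)
    try:
        yield name
    finally:
        entered.remove(name)


def pick_device(gpu_names):
    """Prefer the first visible GPU, otherwise fall back to the CPU."""
    return gpu_names[0] if gpu_names else "/device:CPU:0"


def postprocess(outputs):
    """Return a single output directly; return several outputs as a tuple."""
    if isinstance(outputs, (list, tuple)):
        return outputs[0] if len(outputs) == 1 else tuple(outputs)
    return outputs


class SketchRunnable:
    def __init__(self, model, gpu_names=()):
        self.model = model
        self.device_name = pick_device(list(gpu_names))
        self._stack = ExitStack()
        # The device context stays open for the lifetime of the runnable.
        self._stack.enter_context(fake_device(self.device_name))
        self._methods_cache = {}

    def run(self, method_name, *args):
        # Build the wrapped method on first use, then reuse the cached one.
        if method_name not in self._methods_cache:
            raw = getattr(self.model, method_name)
            self._methods_cache[method_name] = lambda *a: postprocess(raw(*a))
        return self._methods_cache[method_name](*args)

    def close(self):
        # Closing the ExitStack exits every context entered through it.
        self._stack.close()


class ToyModel:
    def __call__(self, x):
        return [x * 2]

    def both(self, x):
        return [x, x + 1]


runnable = SketchRunnable(ToyModel())
single = runnable.run("__call__", 3)
multi = runnable.run("both", 3)
```

Holding the context in an ExitStack lets the runnable enter the device scope once in its constructor and release it later, which a plain with block inside __init__ could not do.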
The module also registers a TensorflowTensorContainer for batching/unbatching TensorFlow tensors in the BentoML Runner data container system, supporting batch concatenation, splitting, and pickle-based payload serialization.
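To make the container's three responsibilities concrete, here is a rough sketch that uses nested Python lists in place of TensorFlow tensors. The function names are illustrative and are not claimed to match the methods of TensorflowTensorContainer; the real container operates on tensors rather than on lists.

```python
import pickle


def batches_to_batch(batches):
    """Concatenate batches along the first axis and record the split offsets."""
    merged, indices = [], [0]
    for batch in batches:
        merged.extend(batch)
        indices.append(len(merged))
    return merged, indices


def batch_to_batches(batch, indices):
    """Split a merged batch back into pieces using the recorded offsets."""
    return [batch[start:end] for start, end in zip(indices, indices[1:])]


def to_payload(batch):
    """Serialize a batch to bytes with pickle."""
    return pickle.dumps(batch)


def from_payload(data):
    """Restore a batch from its pickled bytes."""
    return pickle.loads(data)


# Two requests: one with a single row, one with two rows.
requests = [[[1.0, 2.0]], [[3.0, 4.0], [5.0, 6.0]]]
merged, indices = batches_to_batch(requests)
restored = batch_to_batches(from_payload(to_payload(merged)), indices)
```

The offsets returned by the concatenation step are what allow the merged result to be handed back to each original request after inference.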
Usage
Use this module to save and serve TensorFlow models (tf.Module, Keras models as SavedModel, custom tf.function modules) within BentoML services. RaggedTensor inputs are also supported.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/frameworks/tensorflow.py
- Lines: 1-417
Signature
def get(tag_like: str | Tag) -> bentoml.Model: ...

def load_model(
    bento_model: str | Tag | bentoml.Model,
    device_name: str = "/device:CPU:0",
) -> tf_ext.AutoTrackable | tf_ext.Module: ...

def save_model(
    name: Tag | str,
    model: tf_ext.KerasModel | tf_ext.Module,
    *,
    tf_signatures: tf_ext.ConcreteFunction | None = None,
    tf_save_options: tf_ext.SaveOptions | None = None,
    signatures: dict | None = None,
    labels: dict[str, str] | None = None,
    custom_objects: dict[str, Any] | None = None,
    external_modules: list[ModuleType] | None = None,
    metadata: dict[str, Any] | None = None,
) -> bentoml.Model: ...

def get_runnable(bento_model: bentoml.Model) -> type[Runnable]: ...

class TensorflowTensorContainer(DataContainer):
    # Batching/unbatching for TF tensors in Runner payloads
    ...
Import
import bentoml
# Via public API
model = bentoml.tensorflow.save_model(...)
loaded = bentoml.tensorflow.load_model(...)
I/O Contract
Inputs
save_model()
| Name | Type | Required | Description |
|---|---|---|---|
| name | Tag or str | Yes | Name/tag for the model in the BentoML store |
| model | tf.Module or keras.Model | Yes | The TensorFlow module or Keras model to save |
| tf_signatures | ConcreteFunction or None | No | TensorFlow signatures for the SavedModel |
| tf_save_options | SaveOptions or None | No | TensorFlow save options |
| signatures | dict or None | No | BentoML inference method signatures (auto-inferred from restorable functions or defaults to {"__call__": {"batchable": False}}) |
| labels | dict[str, str] or None | No | User-defined labels for model management |
| custom_objects | dict[str, Any] or None | No | Additional objects to serialize |
| external_modules | list[ModuleType] or None | No | Additional Python modules to save alongside |
| metadata | dict[str, Any] or None | No | Custom metadata for the model |
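The defaulting rule for the signatures parameter can be sketched as a small function. This is an illustration of the rule as described in the table, assuming a hypothetical helper default_signatures that receives the names of the model's restorable functions; it is not the module's actual code.

```python
def default_signatures(restorable_function_names):
    """Hypothetical helper: one non-batchable entry per restorable function,
    or a single __call__ entry when none are found."""
    if restorable_function_names:
        return {name: {"batchable": False} for name in restorable_function_names}
    return {"__call__": {"batchable": False}}


# No restorable functions found: fall back to __call__.
fallback = default_signatures([])

# Restorable functions found: each one gets its own entry.
inferred = default_signatures(["predict", "embed"])
```

Because the inferred entries are non-batchable, a caller who wants adaptive batching would presumably pass signatures explicitly, for example {"__call__": {"batchable": True}}.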
load_model()
| Name | Type | Required | Description |
|---|---|---|---|
| bento_model | str, Tag, or Model | Yes | Tag or Model instance to load from the store |
| device_name | str | No | TensorFlow device string (default: "/device:CPU:0") |
Outputs
| Method | Return Type | Description |
|---|---|---|
| save_model() | bentoml.Model | A BentoML Model containing the TensorFlow SavedModel |
| load_model() | AutoTrackable or tf.Module | The loaded TensorFlow model |
| get() | bentoml.Model | The BentoML Model reference from the store |
| get_runnable() | type[Runnable] | A TensorflowRunnable class with device management and type casting |
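The type casting attributed to the TensorflowRunnable can be pictured as "try the call as given, and cast only if it fails". The sketch below shows that pattern in plain Python under loud assumptions: call_with_deferred_cast and strict_sum are hypothetical names, the expected types stand in for the dtypes of a tf.function input signature, and the exception types the real module catches are not confirmed here.

```python
def call_with_deferred_cast(fn, expected_types, *args):
    """Call fn directly; if it rejects the arguments, cast them and retry once."""
    try:
        return fn(*args)
    except (TypeError, ValueError):
        # Casting is deferred to the failure path, so well-typed calls pay nothing.
        cast_args = [cast(arg) for cast, arg in zip(expected_types, args)]
        return fn(*cast_args)


def strict_sum(a, b):
    """Stand-in for a tf.function with a fixed input signature of two floats."""
    if not (isinstance(a, float) and isinstance(b, float)):
        raise TypeError("expected float inputs")
    return a + b


already_typed = call_with_deferred_cast(strict_sum, (float, float), 1.5, 2.5)
needs_cast = call_with_deferred_cast(strict_sum, (float, float), 1, 2)
```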
Usage Examples
import bentoml
import numpy as np
import tensorflow as tf

# Define a TensorFlow module
class NativeModel(tf.Module):
    def __init__(self):
        super().__init__()
        # np.asfarray was removed in NumPy 2.0; build the float64 array explicitly
        self.weights = np.asarray([[1.0], [1.0], [1.0], [1.0], [1.0]], dtype=np.float64)
        self.dense = lambda inputs: tf.matmul(inputs, self.weights)

    @tf.function(
        input_signature=[tf.TensorSpec(shape=[1, 5], dtype=tf.float64, name="inputs")]
    )
    def __call__(self, inputs):
        return self.dense(inputs)

# Save the model
model = NativeModel()
bento_model = bentoml.tensorflow.save_model("native_toy", model)

# Load and run inference; the input dtype must match the float64 input signature
loaded = bentoml.tensorflow.load_model("native_toy:latest")
result = loaded(tf.constant([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=tf.float64))