Implementation:Bentoml BentoML Framework TensorFlow

Knowledge Sources	Bentoml_BentoML
Domains	ML Framework, Deep Learning, Model Serialization
Last Updated	2026-02-13 15:00 GMT

Overview

The bentoml.tensorflow module integrates BentoML with the TensorFlow SavedModel format. It supports saving, loading, and serving TensorFlow modules and Keras models as SavedModels.

Description

This module implements the BentoML framework adapter for TensorFlow models using the tf.saved_model API. It saves models in the TensorFlow SavedModel format and loads them back as AutoTrackable objects.

Key implementation details:

save_model(): Saves a tf.Module or Keras model using tf.saved_model.save(). Supports TensorFlow-specific signatures and save options. Automatically infers default method signatures from restorable functions or defaults to __call__.
load_model(): Loads a SavedModel using tf.saved_model.load() with device placement. Handles GPU memory growth optimization.
get_runnable(): Creates a TensorflowRunnable that:
- Auto-selects GPU/CPU devices based on available hardware.
- Manages a TensorFlow device context via an ExitStack.
- Generates run methods with lazy initialization and caching.
- Performs output post-processing: single-output models return NumPy arrays directly; multi-output models return tuples.
- Handles automatic type casting with deferred error recovery when tf.function signatures are present.
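
The output post-processing rule in the list above can be illustrated with a minimal sketch. It is not the module's actual code: postprocess and its to_numpy parameter are hypothetical names, and the conversion function is passed in so the example runs without TensorFlow installed.

```python
from typing import Any, Callable


def postprocess(outputs: Any, to_numpy: Callable[[Any], Any]) -> Any:
    """Convert model outputs: one value for a single output, a tuple for several.

    `to_numpy` is a hypothetical converter supplied by the caller; it stands in
    for whatever turns a framework tensor into a NumPy array.
    """
    if isinstance(outputs, (list, tuple)):
        # Multi-output model: convert every element and return a tuple.
        return tuple(to_numpy(item) for item in outputs)
    # Single-output model: return the converted value directly.
    return to_numpy(outputs)
```

With eager TensorFlow tensors, to_numpy could be something like lambda t: t.numpy(); here any callable works, for example postprocess([1, 2], float).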

The module also registers a TensorflowTensorContainer for batching/unbatching TensorFlow tensors in the BentoML Runner data container system, supporting batch concatenation, splitting, and pickle-based payload serialization.
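
To make the batching round trip concrete, here is a small sketch of concatenating sub-batches and splitting them back by offset. It is an illustration under stated assumptions, not the container's implementation: nested Python lists stand in for tensors, the batch axis is fixed at 0, and the function names are chosen for illustration. A TensorFlow version would likely rely on tf.concat and tf.split instead of list operations.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple

Batch = List[list]  # stand-in for a tensor whose first axis is the batch axis


def batches_to_batch(batches: Sequence[Batch]) -> Tuple[Batch, List[int]]:
    """Concatenate sub-batches along axis 0 and record their boundaries."""
    merged = [row for batch in batches for row in batch]
    # Cumulative offsets: sub-batch i occupies merged[indices[i]:indices[i + 1]].
    indices = [0] + list(accumulate(len(batch) for batch in batches))
    return merged, indices


def batch_to_batches(batch: Batch, indices: Sequence[int]) -> List[Batch]:
    """Split a merged batch back into sub-batches using the recorded offsets."""
    return [batch[start:end] for start, end in zip(indices, indices[1:])]
```

Recording offsets at merge time lets each caller get back exactly the rows it submitted after the model has run once on the merged batch.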

Usage

Use this module to save and serve TensorFlow models (tf.Module, Keras models as SavedModel, custom tf.function modules) within BentoML services. RaggedTensors are also supported.

Code Reference

Source Location

Repository: Bentoml_BentoML
File: src/bentoml/_internal/frameworks/tensorflow.py
Lines: 1-417

Signature

def get(tag_like: str | Tag) -> bentoml.Model: ...

def load_model(bento_model: str | Tag | bentoml.Model,
               device_name: str = "/device:CPU:0"
               ) -> tf_ext.AutoTrackable | tf_ext.Module: ...

def save_model(name: Tag | str,
               model: tf_ext.KerasModel | tf_ext.Module,
               *, tf_signatures: tf_ext.ConcreteFunction | None = None,
               tf_save_options: tf_ext.SaveOptions | None = None,
               signatures: dict | None = None,
               labels: dict[str, str] | None = None,
               custom_objects: dict[str, Any] | None = None,
               external_modules: list[ModuleType] | None = None,
               metadata: dict[str, Any] | None = None
               ) -> bentoml.Model: ...

def get_runnable(bento_model: bentoml.Model) -> type[Runnable]: ...

class TensorflowTensorContainer(DataContainer):
    # Batching/unbatching for TF tensors in Runner payloads
    ...

Import

import bentoml

# Via public API
model = bentoml.tensorflow.save_model(...)
loaded = bentoml.tensorflow.load_model(...)

I/O Contract

Inputs

save_model()

Name	Type	Required	Description
name	Tag or str	Yes	Name/tag for the model in the BentoML store
model	tf.Module or keras.Model	Yes	The TensorFlow module or Keras model to save
tf_signatures	ConcreteFunction or None	No	TensorFlow signatures for the SavedModel
tf_save_options	SaveOptions or None	No	TensorFlow save options
signatures	dict or None	No	BentoML inference method signatures (auto-inferred from restorable functions or defaults to {"__call__": {"batchable": False}})
labels	dict[str, str] or None	No	User-defined labels for model management
custom_objects	dict[str, Any] or None	No	Additional objects to serialize
external_modules	list[ModuleType] or None	No	Additional Python modules to save alongside
metadata	dict[str, Any] or None	No	Custom metadata for the model
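
The fallback order described for the signatures parameter can be sketched as follows. This is a hedged illustration only: resolve_signatures is a hypothetical helper, restorable function names are passed in as plain strings, and marking inferred methods as non-batchable is an assumption based on the default shown in the table.

```python
from typing import Dict, Iterable, Optional

Signature = Dict[str, bool]


def resolve_signatures(
    user_signatures: Optional[Dict[str, Signature]],
    restorable_functions: Iterable[str],
) -> Dict[str, Signature]:
    """Prefer explicit signatures, then restorable functions, then __call__."""
    if user_signatures is not None:
        return user_signatures
    # Assumption: every inferred method is registered as non-batchable.
    inferred = {name: {"batchable": False} for name in restorable_functions}
    # An empty dict is falsy, so models without restorable functions get __call__.
    return inferred or {"__call__": {"batchable": False}}
```

Passing an explicit dictionary such as {"__call__": {"batchable": True}} skips inference entirely, which is the path to take when adaptive batching is wanted.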

load_model()

Name	Type	Required	Description
bento_model	str, Tag, or Model	Yes	Tag or Model instance to load from the store
device_name	str	No	TensorFlow device string (default: "/device:CPU:0")

Outputs

Method	Return Type	Description
save_model()	bentoml.Model	A BentoML Model containing the TensorFlow SavedModel
load_model()	AutoTrackable or tf.Module	The loaded TensorFlow model
get()	bentoml.Model	The BentoML Model reference from the store
get_runnable()	type[Runnable]	A TensorflowRunnable class with device management and type casting
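
The device management mentioned for the runnable can be pictured with the sketch below, which keeps a device scope open through contextlib.ExitStack. All names here are hypothetical: device_scope stands in for tf.device so the example runs without TensorFlow, and DeviceBoundRunnable is not the real TensorflowRunnable.

```python
from contextlib import ExitStack, contextmanager
from typing import Iterator, List, Sequence

ACTIVE_DEVICES: List[str] = []  # records which device scopes are currently entered


@contextmanager
def device_scope(name: str) -> Iterator[None]:
    """Hypothetical stand-in for `tf.device(name)`."""
    ACTIVE_DEVICES.append(name)
    try:
        yield
    finally:
        ACTIVE_DEVICES.pop()


def pick_device(gpu_names: Sequence[str]) -> str:
    """Prefer the first visible GPU, otherwise fall back to the CPU."""
    return gpu_names[0] if gpu_names else "/device:CPU:0"


class DeviceBoundRunnable:
    """Keeps one device scope open for the lifetime of the object."""

    def __init__(self, gpu_names: Sequence[str] = ()) -> None:
        self.device_name = pick_device(gpu_names)
        self._stack = ExitStack()
        # Enter the scope now; it stays active until close() is called.
        self._stack.enter_context(device_scope(self.device_name))

    def close(self) -> None:
        # Closing the ExitStack exits every context entered through it.
        self._stack.close()
```

An ExitStack suits this pattern because the scope is entered in one method and left in another, which a plain with statement cannot express.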

Usage Examples

import bentoml
import tensorflow as tf
import numpy as np

# Define a TensorFlow module
class NativeModel(tf.Module):
    def __init__(self):
        super().__init__()
        # np.asfarray was removed in NumPy 2.0; request float64 explicitly instead
        self.weights = np.asarray([[1.0], [1.0], [1.0], [1.0], [1.0]], dtype=np.float64)
        self.dense = lambda inputs: tf.matmul(inputs, self.weights)

    @tf.function(
        input_signature=[tf.TensorSpec(shape=[1, 5], dtype=tf.float64, name="inputs")]
    )
    def __call__(self, inputs):
        return self.dense(inputs)

# Save the model
model = NativeModel()
bento_model = bentoml.tensorflow.save_model("native_toy", model)

# Load and run inference
loaded = bentoml.tensorflow.load_model("native_toy:latest")
# The input dtype must match the float64 input_signature declared on __call__
result = loaded(tf.constant([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=tf.float64))
