Implementation:Bentoml BentoML Framework TensorFlow

From Leeroopedia
Knowledge Sources
Domains: ML Framework, Deep Learning, Model Serialization
Last Updated: 2026-02-13 15:00 GMT

Overview

The bentoml.tensorflow module integrates BentoML with the TensorFlow SavedModel format. It saves, loads, and serves tf.Module objects and Keras models as SavedModels.

Description

This module implements the BentoML framework adapter for TensorFlow models using the tf.saved_model API. It saves models in the TensorFlow SavedModel format and loads them back as AutoTrackable objects.

Key implementation details:

  • save_model(): Saves a tf.Module or Keras model using tf.saved_model.save(). Accepts TensorFlow-specific signatures (tf_signatures) and save options (tf_save_options). When no BentoML signatures are passed, it infers them from the model's restorable functions and falls back to __call__ if none are found.
  • load_model(): Loads a SavedModel using tf.saved_model.load() with device placement. Handles GPU memory growth optimization.
  • get_runnable(): Creates a TensorflowRunnable that:
    • Auto-selects GPU/CPU devices based on available hardware.
    • Manages a TensorFlow device context via an ExitStack.
    • Generates run methods with lazy initialization and caching.
    • Performs output post-processing: single-output models return NumPy arrays directly; multi-output models return tuples.
    • Handles automatic type casting with deferred error recovery when tf.function signatures are present.
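
The runnable behaviour listed above can be pictured with a small, framework-free sketch. Everything in it is hypothetical: `SketchRunnable`, `fake_device`, `postprocess`, and `ToyModel` are illustrative names, plain Python numbers stand in for tensors, and a logging context manager stands in for a TensorFlow device scope. It is not BentoML's implementation; it only shows a context held open by an ExitStack, run methods built lazily and cached, and the single-output versus multi-output return rule.

```python
from __future__ import annotations

import contextlib
from typing import Any, Callable, Iterator


@contextlib.contextmanager
def fake_device(name: str, log: list[str]) -> Iterator[str]:
    # Hypothetical stand-in for a TensorFlow device scope.
    log.append(f"enter {name}")
    try:
        yield name
    finally:
        log.append(f"exit {name}")


def postprocess(outputs: Any) -> Any:
    # Assumption of this sketch: a list or tuple is a container of outputs.
    # One output is returned directly; several outputs become a tuple.
    if isinstance(outputs, (list, tuple)):
        return outputs[0] if len(outputs) == 1 else tuple(outputs)
    return outputs


class ToyModel:
    def __call__(self, xs: list[float]) -> list[float]:
        return [sum(xs)]  # one output

    def stats(self, xs: list[float]) -> list[float]:
        return [min(xs), max(xs)]  # two outputs


class SketchRunnable:
    def __init__(self, model: Any, device_name: str, log: list[str]) -> None:
        self._model = model
        # The ExitStack keeps the device context open for the runnable's lifetime.
        self._stack = contextlib.ExitStack()
        self._stack.enter_context(fake_device(device_name, log))
        self._methods: dict[str, Callable[..., Any]] = {}
        self.build_count = 0  # how many run methods were actually built

    def _get_method(self, name: str) -> Callable[..., Any]:
        # Lazy initialization: build the wrapper on first use, then reuse it.
        if name not in self._methods:
            raw = getattr(self._model, name)
            self.build_count += 1
            self._methods[name] = lambda *args: postprocess(raw(*args))
        return self._methods[name]

    def run(self, name: str, *args: Any) -> Any:
        return self._get_method(name)(*args)

    def close(self) -> None:
        # Leaves the device context entered in __init__.
        self._stack.close()
```

Calling `run("__call__", ...)` twice builds the wrapper only once, and `close()` exits the device context that was entered when the runnable was created.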

The module also registers a TensorflowTensorContainer for batching/unbatching TensorFlow tensors in the BentoML Runner data container system, supporting batch concatenation, splitting, and pickle-based payload serialization.
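
As a rough illustration of what batching, unbatching, and pickle-based payloads involve, the sketch below merges several small batches along dimension 0, records the offsets needed to split them again, and round-trips the merged batch through pickle. It uses nested Python lists in place of TensorFlow tensors, and the function names are chosen for this sketch only; it does not reproduce the TensorflowTensorContainer code.

```python
from __future__ import annotations

import pickle


def batches_to_batch(batches: list[list[list[float]]]) -> tuple[list[list[float]], list[int]]:
    # Concatenate batches along dimension 0 and record cumulative row offsets.
    merged: list[list[float]] = []
    indices = [0]
    for batch in batches:
        merged.extend(batch)
        indices.append(len(merged))
    return merged, indices


def batch_to_batches(batch: list[list[float]], indices: list[int]) -> list[list[list[float]]]:
    # Split the merged batch back into the original batches using the offsets.
    return [batch[start:end] for start, end in zip(indices, indices[1:])]


def to_payload(batch: list[list[float]]) -> bytes:
    # Pickle-based serialization of a batch for transport.
    return pickle.dumps(batch)


def from_payload(data: bytes) -> list[list[float]]:
    return pickle.loads(data)
```

Keeping the offsets alongside the merged batch is what lets each caller receive only its own rows after a single combined inference call.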

Usage

Use this module to save and serve TensorFlow models (tf.Module objects, Keras models saved as SavedModels, and custom tf.function modules) within BentoML services. The module also supports RaggedTensors.

Code Reference

Source Location

Signature

def get(tag_like: str | Tag) -> bentoml.Model: ...

def load_model(bento_model: str | Tag | bentoml.Model,
               device_name: str = "/device:CPU:0"
               ) -> tf_ext.AutoTrackable | tf_ext.Module: ...

def save_model(name: Tag | str,
               model: tf_ext.KerasModel | tf_ext.Module,
               *, tf_signatures: tf_ext.ConcreteFunction | None = None,
               tf_save_options: tf_ext.SaveOptions | None = None,
               signatures: dict | None = None,
               labels: dict[str, str] | None = None,
               custom_objects: dict[str, Any] | None = None,
               external_modules: list[ModuleType] | None = None,
               metadata: dict[str, Any] | None = None
               ) -> bentoml.Model: ...

def get_runnable(bento_model: bentoml.Model) -> type[Runnable]: ...

class TensorflowTensorContainer(DataContainer):
    # Batching/unbatching for TF tensors in Runner payloads
    ...

Import

import bentoml

# Via public API
model = bentoml.tensorflow.save_model(...)
loaded = bentoml.tensorflow.load_model(...)

I/O Contract

Inputs

save_model()

| Name | Type | Required | Description |
|---|---|---|---|
| name | Tag or str | Yes | Name/tag for the model in the BentoML store |
| model | tf.Module or keras.Model | Yes | The TensorFlow module or Keras model to save |
| tf_signatures | ConcreteFunction or None | No | TensorFlow signatures for the SavedModel |
| tf_save_options | SaveOptions or None | No | TensorFlow save options |
| signatures | dict or None | No | BentoML inference method signatures (auto-inferred from restorable functions or defaults to {"__call__": {"batchable": False}}) |
| labels | dict[str, str] or None | No | User-defined labels for model management |
| custom_objects | dict[str, Any] or None | No | Additional objects to serialize |
| external_modules | list[ModuleType] or None | No | Additional Python modules to save alongside |
| metadata | dict[str, Any] or None | No | Custom metadata for the model |
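
The fallback rule for the signatures argument can be read as a three-step decision. The helper below is a hypothetical sketch of that rule, not the library's code: `resolve_signatures` and its `restorable_functions` parameter are invented names, and marking every inferred method as non-batchable is an assumption made here for illustration.

```python
from __future__ import annotations

from typing import Any


def resolve_signatures(
    signatures: dict[str, dict[str, Any]] | None,
    restorable_functions: list[str],
) -> dict[str, dict[str, Any]]:
    # 1. Explicit signatures from the caller always win.
    if signatures is not None:
        return signatures
    # 2. Otherwise expose each restorable function (assumed non-batchable here).
    if restorable_functions:
        return {name: {"batchable": False} for name in restorable_functions}
    # 3. With nothing to infer from, fall back to __call__.
    return {"__call__": {"batchable": False}}
```

Passing an explicit signatures dict is the way to opt a method into batching, since the inferred defaults in this sketch are all non-batchable.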

load_model()

| Name | Type | Required | Description |
|---|---|---|---|
| bento_model | str, Tag, or Model | Yes | Tag or Model instance to load from the store |
| device_name | str | No | TensorFlow device string (default: "/device:CPU:0") |

Outputs

| Method | Return Type | Description |
|---|---|---|
| save_model() | bentoml.Model | A BentoML Model containing the TensorFlow SavedModel |
| load_model() | AutoTrackable or tf.Module | The loaded TensorFlow model |
| get() | bentoml.Model | The BentoML Model reference from the store |
| get_runnable() | type[Runnable] | A TensorflowRunnable class with device management and type casting |

Usage Examples

import bentoml
import tensorflow as tf
import numpy as np

# Define a TensorFlow module
class NativeModel(tf.Module):
    def __init__(self):
        super().__init__()
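        # np.asfarray was removed in NumPy 2.0; np.asarray with an explicit dtype is equivalent here
        self.weights = np.asarray([[1.0], [1.0], [1.0], [1.0], [1.0]], dtype=np.float64)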
        self.dense = lambda inputs: tf.matmul(inputs, self.weights)

    @tf.function(
        input_signature=[tf.TensorSpec(shape=[1, 5], dtype=tf.float64, name="inputs")]
    )
    def __call__(self, inputs):
        return self.dense(inputs)

# Save the model
model = NativeModel()
bento_model = bentoml.tensorflow.save_model("native_toy", model)

# Load and run inference
loaded = bentoml.tensorflow.load_model("native_toy:latest")
# The input dtype must match the float64 input_signature declared on __call__
result = loaded(tf.constant([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=tf.float64))
