Implementation:Bentoml BentoML Framework Keras
| Knowledge Sources | |
|---|---|
| Domains | ML Framework, Deep Learning, Model Serialization |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
The bentoml.keras module provides BentoML integration for Keras models (backed by TensorFlow). It supports saving, loading, and serving keras.Model and keras.Sequential instances.
Description
This module implements the BentoML framework adapter for keras.Model and keras.Sequential models. It uses TensorFlow as the backend and saves models in the Keras native format.
ModelOptions (attrs dataclass) includes:
- include_optimizer: Whether to save the optimizer state with the model (default: False).
Key implementation details:
- save_model(): Saves the Keras model using model.save(). For Keras >= 3.4.0, uses the zipped=False parameter. Supports TensorFlow-specific signatures and save options.
- load_model(): Loads the model using keras.models.load_model() with custom objects support. Handles GPU memory growth optimization when loading to GPU devices.
- get_runnable(): Creates a KerasRunnable that auto-selects GPU/CPU devices, converts inputs to TensorFlow tensors, and converts output EagerTensors back to NumPy arrays. Uses a method cache for performance optimization.
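
The version gate described for save_model() (passing zipped=False only on Keras >= 3.4.0) can be sketched roughly as follows. This is a minimal illustration, not the module's actual code: parse_version and build_save_kwargs are hypothetical helper names, and the real implementation may compare versions differently.

```python
import re
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from strings such as '3.4.1' or '3.5.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def build_save_kwargs(keras_version: str) -> Dict[str, object]:
    """Hypothetical helper: extra keyword arguments to pass to model.save()."""
    kwargs: Dict[str, object] = {}
    # Per the description above, zipped=False is only used for Keras >= 3.4.0.
    if parse_version(keras_version) >= (3, 4, 0):
        kwargs["zipped"] = False
    return kwargs
```

Under these assumptions, a caller would write something like model.save(path, **build_save_kwargs(keras.__version__)), so that older Keras releases never receive an argument they do not recognize.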
The module validates that the provided model is an instance of keras.Model or keras.Sequential before saving.
Usage
Use this module to save and serve Keras deep learning models within BentoML services. It is suitable for any Keras-based model, including CNNs, RNNs, transformers, and custom architectures.
Code Reference
Source Location
- Repository: Bentoml_BentoML
- File: src/bentoml/_internal/frameworks/keras.py
- Lines: 1-370
Signature
def get(tag_like: str | Tag) -> bentoml.Model: ...

def load_model(
    bento_model: str | Tag | bentoml.Model,
    device_name: str = "/device:CPU:0",
) -> tf_ext.KerasModel: ...

def save_model(
    name: Tag | str,
    model: tf_ext.KerasModel,
    *,
    tf_signatures: tf_ext.ConcreteFunction | None = None,
    tf_save_options: tf_ext.SaveOptions | None = None,
    include_optimizer: bool = False,
    signatures: dict | None = None,
    labels: dict[str, str] | None = None,
    custom_objects: dict[str, Any] | None = None,
    external_modules: List[ModuleType] | None = None,
    metadata: dict[str, Any] | None = None,
) -> bentoml.Model: ...

def get_runnable(bento_model: bentoml.Model) -> type[Runnable]: ...
Import
import bentoml
# Via public API
model = bentoml.keras.save_model(...)
loaded = bentoml.keras.load_model(...)
I/O Contract
Inputs
save_model()
| Name | Type | Required | Description |
|---|---|---|---|
| name | Tag or str | Yes | Name/tag for the model in the BentoML store |
| model | keras.Model | Yes | The Keras model instance to save |
| tf_signatures | ConcreteFunction or None | No | TensorFlow signatures for the saved model |
| tf_save_options | SaveOptions or None | No | TensorFlow save options |
| include_optimizer | bool | No | Whether to save optimizer state (default: False) |
| signatures | dict or None | No | Inference method signatures (default: {"predict": {"batchable": False}}) |
| labels | dict[str, str] or None | No | User-defined labels for model management |
| custom_objects | dict[str, Any] or None | No | Keras custom objects (layers, activations, etc.) |
| external_modules | List[ModuleType] or None | No | Additional Python modules to save alongside |
| metadata | dict[str, Any] or None | No | Custom metadata for the model |
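
The default noted for the signatures parameter can be illustrated with a small sketch. The helper name resolve_signatures is hypothetical and only mirrors the behavior stated in the table (fall back to {"predict": {"batchable": False}} when nothing is passed). The batch_dim key in the usage note is an assumption about the signature options rather than something this page states.

```python
from typing import Any, Dict, Optional

# Default taken from the parameter table above.
DEFAULT_SIGNATURES: Dict[str, Dict[str, Any]] = {"predict": {"batchable": False}}


def resolve_signatures(
    signatures: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: return the caller's signatures or a copy of the default."""
    if signatures is None:
        # Copy so that callers cannot mutate the shared default.
        return {name: dict(opts) for name, opts in DEFAULT_SIGNATURES.items()}
    return signatures
```

A caller who wants batched inference would pass an explicit mapping instead, for example signatures={"predict": {"batchable": True, "batch_dim": 0}}.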
load_model()
| Name | Type | Required | Description |
|---|---|---|---|
| bento_model | str, Tag, or Model | Yes | Tag or Model instance to load from the store |
| device_name | str | No | TensorFlow device (default: "/device:CPU:0") |
Outputs
| Method | Return Type | Description |
|---|---|---|
| save_model() | bentoml.Model | A BentoML Model containing the saved Keras model |
| load_model() | keras.Model | The loaded Keras model instance |
| get() | bentoml.Model | The BentoML Model reference from the store |
| get_runnable() | type[Runnable] | A KerasRunnable class for BentoML Runner serving |
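
The description mentions that KerasRunnable uses a method cache, without detailing its structure. One plausible shape, shown here purely as an assumption, is a dictionary that stores each model method the first time it is looked up by name. MethodCache and DummyModel are hypothetical names used for illustration; neither comes from the BentoML source.

```python
from typing import Any, Callable, Dict, List


class MethodCache:
    """Caches methods looked up by name on a wrapped model object."""

    def __init__(self, model: Any) -> None:
        self._model = model
        self._methods: Dict[str, Callable[..., Any]] = {}

    def get(self, method_name: str) -> Callable[..., Any]:
        try:
            return self._methods[method_name]
        except KeyError:
            # First lookup: resolve the attribute once and remember it.
            method = getattr(self._model, method_name)
            self._methods[method_name] = method
            return method


class DummyModel:
    """Stand-in for a Keras model so the sketch runs without TensorFlow."""

    def predict(self, values: List[float]) -> List[float]:
        return [2 * v for v in values]
```

With this design, repeated inference calls reuse the same resolved callable instead of repeating the attribute lookup (and any wrapping that a real runnable might apply) on every request.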
Usage Examples
import bentoml
import tensorflow as tf
import keras

# Build a Keras model
model = keras.models.Sequential([
    keras.layers.Dense(units=1, input_shape=(5,), use_bias=False,
                       kernel_initializer=keras.initializers.Ones()),
])
opt = keras.optimizers.Adam(learning_rate=0.002, beta_1=0.5)
model.compile(optimizer=opt, loss="binary_crossentropy", metrics=["accuracy"])

# Save the model
bento_model = bentoml.keras.save_model("keras_model", model)

# Load back
loaded = bentoml.keras.load_model("keras_model:latest")

# Save with custom objects
# (CustomLayer and custom_activation are assumed to be defined elsewhere)
custom_objects = {"CustomLayer": CustomLayer, "custom_activation": custom_activation}
bento_model = bentoml.keras.save_model(
    "custom_keras", model, custom_objects=custom_objects
)