Implementation:Bentoml BentoML Framework Keras

From Leeroopedia
Knowledge Sources
Domains ML Framework, Deep Learning, Model Serialization
Last Updated 2026-02-13 15:00 GMT

Overview

The bentoml.keras module integrates Keras models (backed by TensorFlow) with BentoML, so that keras.Model and keras.Sequential instances can be saved, loaded, and served.

Description

This module implements the BentoML framework adapter for keras.Model and keras.Sequential models. It uses TensorFlow as the backend and saves models in the Keras native format.

ModelOptions (attrs dataclass) includes:

  • include_optimizer: Whether to save the optimizer state with the model (default: False).

Key implementation details:

  • save_model(): Saves the Keras model by calling model.save(). For Keras >= 3.4.0, it passes zipped=False. It also accepts TensorFlow-specific signatures and save options.
  • load_model(): Loads the model with keras.models.load_model(), passing any saved custom objects. When the target is a GPU device, it configures GPU memory growth before loading.
  • get_runnable(): Creates a KerasRunnable that selects a GPU or CPU device automatically, converts inputs to TensorFlow tensors, and converts output EagerTensors back to NumPy arrays. It caches the resolved model methods so repeated calls avoid repeated lookups.
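
The conversion and caching behavior described for get_runnable() can be illustrated with a small, standard-library-only sketch. It is not the module's code: FakeTensor, FakeModel, and SketchRunnable are hypothetical stand-ins invented here, and plain lists take the place of NumPy arrays and TensorFlow tensors. It only shows the general pattern of resolving a method once, wrapping inputs, and unwrapping outputs.

```python
from typing import Any, Callable, Dict, List


class FakeTensor:
    """Hypothetical stand-in for a framework tensor; wraps a list of floats."""

    def __init__(self, values: List[float]) -> None:
        self.values = list(values)

    def numpy(self) -> List[float]:
        # A real EagerTensor would return a NumPy array; a list keeps this stdlib-only.
        return list(self.values)


class FakeModel:
    """Hypothetical stand-in for a Keras model with one inference method."""

    def predict(self, x: FakeTensor) -> FakeTensor:
        return FakeTensor([2.0 * v for v in x.values])


class SketchRunnable:
    """Caches resolved methods and converts inputs and outputs around each call."""

    def __init__(self, model: Any) -> None:
        self.model = model
        self._method_cache: Dict[str, Callable[..., Any]] = {}

    def _get_method(self, name: str) -> Callable[..., Any]:
        # Resolve the bound method once, then reuse it on later calls.
        if name not in self._method_cache:
            self._method_cache[name] = getattr(self.model, name)
        return self._method_cache[name]

    def run(self, name: str, values: List[float]) -> List[float]:
        tensor_in = FakeTensor(values)  # plain input -> tensor
        tensor_out = self._get_method(name)(tensor_in)
        return tensor_out.numpy()  # tensor -> array-like output


runnable = SketchRunnable(FakeModel())
first = runnable.run("predict", [1.0, 2.0])
second = runnable.run("predict", [3.0])
```

After both calls the cache holds a single entry for "predict", which is the point of the pattern: the lookup cost is paid once per method name rather than once per request.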

The module validates that the provided model is an instance of keras.Model or keras.Sequential before saving.
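
A rough sketch of the two save-time checks mentioned above (type validation and the Keras >= 3.4.0 gate for zipped=False) might look as follows. All names here (parse_version, validate_model, extra_save_kwargs, FakeKerasModel) are hypothetical, the exception type is an assumption, and the real module may structure this differently; the sketch avoids importing Keras so it runs anywhere.

```python
import itertools
from typing import Any, Dict, Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Turn a string such as '3.4.0rc1' into (3, 4, 0), padding with zeros."""
    numbers = []
    for piece in version.split(".")[:3]:
        digits = "".join(itertools.takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    while len(numbers) < 3:
        numbers.append(0)
    return (numbers[0], numbers[1], numbers[2])


def validate_model(model: Any, allowed_types: Tuple[type, ...]) -> None:
    """Reject objects that are not instances of the allowed model classes."""
    # Assumption: TypeError is used here for illustration only.
    if not isinstance(model, allowed_types):
        raise TypeError(
            f"expected one of {[t.__name__ for t in allowed_types]}, "
            f"got {type(model).__name__}"
        )


def extra_save_kwargs(keras_version: str) -> Dict[str, Any]:
    """Return the extra keyword arguments for model.save() on this Keras version."""
    if parse_version(keras_version) >= (3, 4, 0):
        return {"zipped": False}
    return {}


class FakeKerasModel:
    """Hypothetical stand-in for keras.Model in this sketch."""


validate_model(FakeKerasModel(), (FakeKerasModel,))  # passes silently
new_kwargs = extra_save_kwargs("3.4.0")
old_kwargs = extra_save_kwargs("2.15.0")
```

In real code the allowed types would be keras.Model and keras.Sequential, and the version string would come from the installed Keras package.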

Usage

Use this module to save and serve Keras deep learning models within BentoML services. It is suitable for any Keras-based model, including CNNs, RNNs, transformers, and custom architectures.

Code Reference

Source Location

Signature

def get(tag_like: str | Tag) -> bentoml.Model: ...

def load_model(bento_model: str | Tag | bentoml.Model,
               device_name: str = "/device:CPU:0") -> tf_ext.KerasModel: ...

def save_model(name: Tag | str,
               model: tf_ext.KerasModel,
               *, tf_signatures: tf_ext.ConcreteFunction | None = None,
               tf_save_options: tf_ext.SaveOptions | None = None,
               include_optimizer: bool = False,
               signatures: dict | None = None,
               labels: dict[str, str] | None = None,
               custom_objects: dict[str, Any] | None = None,
               external_modules: List[ModuleType] | None = None,
               metadata: dict[str, Any] | None = None) -> bentoml.Model: ...

def get_runnable(bento_model: bentoml.Model) -> type[Runnable]: ...

Import

import bentoml

# Via public API
model = bentoml.keras.save_model(...)
loaded = bentoml.keras.load_model(...)

I/O Contract

Inputs

save_model()

Name | Type | Required | Description
name | Tag or str | Yes | Name/tag for the model in the BentoML store
model | keras.Model | Yes | The Keras model instance to save
tf_signatures | ConcreteFunction or None | No | TensorFlow signatures for the saved model
tf_save_options | SaveOptions or None | No | TensorFlow save options
include_optimizer | bool | No | Whether to save optimizer state (default: False)
signatures | dict or None | No | Inference method signatures (default: {"predict": {"batchable": False}})
labels | dict[str, str] or None | No | User-defined labels for model management
custom_objects | dict[str, Any] or None | No | Keras custom objects (layers, activations, etc.)
external_modules | List[ModuleType] or None | No | Additional Python modules to save alongside the model
metadata | dict[str, Any] or None | No | Custom metadata for the model
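
As an illustration of the optional arguments listed above, the sketch below builds a signatures dictionary that marks predict as batchable, plus labels and metadata, and passes them to save_model(). The label and metadata values are placeholders invented for this example, and save_with_options is a hypothetical helper. BentoML is imported inside the function so that the dictionaries can be inspected without it installed.

```python
from typing import Any, Dict

# Mark predict() as batchable along the first axis instead of the
# non-batchable default shown in the table.
signatures: Dict[str, Dict[str, Any]] = {
    "predict": {"batchable": True, "batch_dim": 0},
}

# Placeholder values for illustration only.
labels: Dict[str, str] = {"stage": "dev"}
metadata: Dict[str, Any] = {"note": "placeholder metadata"}


def save_with_options(model: Any) -> Any:
    """Hypothetical helper: save a Keras model with the options defined above."""
    import bentoml  # lazy import; requires BentoML and a Keras backend at call time

    return bentoml.keras.save_model(
        "keras_model",
        model,
        signatures=signatures,
        labels=labels,
        metadata=metadata,
    )
```

Calling save_with_options(model) with a compiled keras.Model should return the bentoml.Model entry created in the store.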

load_model()

Name | Type | Required | Description
bento_model | str, Tag, or Model | Yes | Tag or Model instance to load from the store
device_name | str | No | TensorFlow device (default: "/device:CPU:0")
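
One possible way to choose device_name is sketched below, under the assumption that the caller wants the first GPU when one is visible and the CPU otherwise. pick_device_name and load_on_available_device are hypothetical helpers, not part of bentoml.keras; the TensorFlow and BentoML imports are deferred so the selection logic can be exercised on its own.

```python
from typing import Any


def pick_device_name(gpu_count: int, gpu_index: int = 0) -> str:
    """Return a TensorFlow device string, falling back to the CPU default."""
    if 0 <= gpu_index < gpu_count:
        return f"/device:GPU:{gpu_index}"
    return "/device:CPU:0"


def load_on_available_device(tag: str) -> Any:
    """Hypothetical helper: load a saved Keras model onto a GPU if one is visible."""
    import bentoml  # lazy imports; both packages are needed only at call time
    import tensorflow as tf

    gpus = tf.config.list_physical_devices("GPU")
    return bentoml.keras.load_model(tag, device_name=pick_device_name(len(gpus)))
```

For example, load_on_available_device("keras_model:latest") would load the model on "/device:GPU:0" on a GPU machine and on "/device:CPU:0" elsewhere.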

Outputs

Method | Return Type | Description
save_model() | bentoml.Model | A BentoML Model containing the saved Keras model
load_model() | keras.Model | The loaded Keras model instance
get() | bentoml.Model | The BentoML Model reference from the store
get_runnable() | type[Runnable] | A KerasRunnable class for BentoML Runner serving

Usage Examples

import bentoml
import tensorflow as tf
import keras

# Build a Keras model
model = keras.models.Sequential([
    keras.layers.Dense(units=1, input_shape=(5,), use_bias=False,
                       kernel_initializer=keras.initializers.Ones()),
])
opt = keras.optimizers.Adam(0.002, 0.5)
model.compile(optimizer=opt, loss="binary_crossentropy", metrics=["accuracy"])

# Save the model
bento_model = bentoml.keras.save_model("keras_model", model)

# Load back
loaded = bentoml.keras.load_model("keras_model:latest")

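# Save with custom objects. CustomLayer and custom_activation are placeholders
# for user-defined objects that must be defined or imported before this call.
custom_objects = {"CustomLayer": CustomLayer, "custom_activation": custom_activation}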
bento_model = bentoml.keras.save_model(
    "custom_keras", model, custom_objects=custom_objects
)
