
Workflow: BentoML Bento Build and Containerization

From Leeroopedia
Knowledge Sources
Domains ML_Serving, Containerization, DevOps
Last Updated 2026-02-13 15:00 GMT

Overview

End-to-end process for packaging a BentoML Service into a standardized Bento artifact and containerizing it as a Docker image for deployment.

Description

This workflow covers the packaging and containerization pipeline that transforms a BentoML Service into a portable, reproducible deployment artifact. The process begins with defining the runtime environment (Python version, dependencies, system packages) using the Image configuration, then uses bentoml build to package source code, dependencies, model references, and configuration into a versioned Bento archive. The Bento is then containerized into an OCI-compliant Docker image via bentoml containerize, ready for deployment in any Docker-compatible environment.

Key capabilities covered:

  • Runtime environment definition using the Image class
  • Bento build configuration via bentofile.yaml or the Image API
  • Bento artifact creation with automatic versioning
  • Docker image generation with multiple OCI builder backends (Docker, Buildx, Buildah, BuildKit, Podman, nerdctl)
  • Bento management (list, export, import)

Usage

Execute this workflow when you have a working BentoML Service (tested locally via bentoml serve) and need to package it for deployment to production environments. This is required before deploying to Docker/Kubernetes environments. For BentoCloud deployment, bentoml deploy handles building automatically.

Execution Steps

Step 1: Define Runtime Environment

Configure the runtime environment that the Bento will use when deployed. This is done through the Image class in the @bentoml.service decorator, which specifies the Python version, system packages, Python dependencies, and custom commands. Alternatively, configure the environment through a bentofile.yaml file with docker and python sections.

Key considerations:

  • The Image class supports method chaining: Image(python_version="3.11").python_packages("torch")
  • Use .requirements_file("requirements.txt") to reference an existing requirements file
  • System packages can be installed via .system_packages()
  • Custom shell commands can be run via .run()
  • Dependencies are automatically locked for reproducibility
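As a sketch of this step, a chained Image definition might look as follows (the service name, package list, and setup command are illustrative, not taken from the source):

```python
import bentoml

# Chainable runtime definition: Python version, pip packages,
# apt packages, and a custom shell command run at build time.
image = (
    bentoml.images.Image(python_version="3.11")
    .python_packages("torch", "transformers")   # pip dependencies
    .system_packages("curl")                    # system packages
    .run("echo 'custom build step'")            # custom command
)

@bentoml.service(image=image)
class Summarizer:
    @bentoml.api
    def summarize(self, text: str) -> str:
        return text[:100]
```

Where a requirements file already exists, `.requirements_file("requirements.txt")` can replace the `.python_packages(...)` call.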

Step 2: Configure Build Settings

Define which files to include in the Bento, along with labels, description, and environment variables. By default, all files in the working directory are included. Use a .bentoignore file (similar to .gitignore) to exclude files. Build settings can be specified in the @bentoml.service decorator or in a bentofile.yaml file.

Key considerations:

  • The bentofile.yaml provides a declarative build configuration
  • The .bentoignore file excludes unnecessary files from the Bento
  • Environment variables can be specified with or without default values
  • Labels help organize and identify Bentos
  • Model references declared as class-level attributes are automatically included
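A minimal bentofile.yaml covering these settings might look like the following sketch (paths, labels, and variable names are illustrative):

```yaml
service: "service:Summarizer"        # "module:ClassName" of the service
labels:
  owner: ml-team
  stage: dev
include:
  - "*.py"                           # only package source files
python:
  requirements_txt: "./requirements.txt"
envs:
  - name: HF_HOME                    # env var with a default value
    value: /tmp/hf_cache
docker:
  python_version: "3.11"
```

Files matched by a sibling .bentoignore file are excluded even when they match an include pattern.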

Step 3: Build the Bento

Run bentoml build from the project directory to create the Bento artifact. This packages the service source code, Python dependencies configuration, model references, and metadata into a versioned archive stored in the local Bento Store. Each Bento receives a unique auto-generated version tag.

Key considerations:

  • The build process validates the service definition
  • Each Bento is immutably versioned with a unique tag (name:version)
  • The Bento Store is located at ~/bentoml/bentos/ by default
  • Use bentoml list to see all built Bentos
  • Use bentoml get <tag> to inspect a specific Bento
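The build and inspection commands above can be sketched as a short session (the `summarizer` tag is illustrative):

```shell
# Build the Bento from the project directory containing the service
bentoml build

# List all Bentos in the local store (~/bentoml/bentos/ by default)
bentoml list

# Inspect a specific Bento by its name:version tag
bentoml get summarizer:latest
```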

Step 4: Containerize the Bento

Run bentoml containerize <bento_tag> to generate an OCI-compliant Docker image from the Bento. The containerization process generates a Dockerfile from the Bento's configuration and builds the image using the specified backend (Docker by default). The resulting image contains all code, dependencies, and configuration needed to run the service.

Key considerations:

  • Requires Docker (or an alternative OCI builder) to be installed and running
  • The Docker image tag matches the Bento tag by default
  • Use --platform flag for cross-platform builds (e.g., linux/amd64 on Apple Silicon)
  • Multiple backend options: Docker, Buildx, Buildah, BuildKit, Podman, nerdctl
  • Custom image tags can be specified with --image-tag
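Putting these options together, a containerization step might look like this sketch (the Bento tag and registry name are illustrative):

```shell
# Containerize with the default backend (Docker); the image tag
# matches the Bento tag unless overridden
bentoml containerize summarizer:latest

# Cross-platform build, e.g. targeting linux/amd64 from Apple Silicon
bentoml containerize summarizer:latest --platform linux/amd64

# Override the default image tag
bentoml containerize summarizer:latest --image-tag registry.example.com/summarizer:v1
```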

Step 5: Run the Container

Launch the containerized service using docker run with appropriate port mapping and GPU configuration. The container runs the same production-grade server that bentoml serve provides, with all the same endpoints, health checks, and observability features.

Key considerations:

  • Map port 3000 (default BentoML server port) to the host
  • Use --gpus all for GPU-enabled services
  • The container exposes the same REST API as local serving
  • Environment variables can be passed via docker run -e
  • For production, consider resource limits and health check configuration
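The considerations above translate into a docker run invocation along these lines (image tag and environment variable are illustrative):

```shell
# Map the default BentoML server port 3000 to the host
docker run --rm -p 3000:3000 summarizer:latest

# GPU-enabled service with an environment variable passed through
docker run --rm --gpus all -p 3000:3000 \
  -e HF_HOME=/tmp/hf_cache summarizer:latest
```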

Step 6: Manage Bento Artifacts

Use BentoML CLI commands or Python APIs to manage Bento artifacts in the local store. Bentos can be exported as standalone archives for sharing, imported from external sources, or pushed to/pulled from BentoCloud for team collaboration.

Key considerations:

  • bentoml export creates a standalone .bento archive file
  • bentoml import loads a .bento archive into the local store
  • External storage (S3, GCS) is supported for export/import
  • bentoml push/pull syncs with BentoCloud registry
  • bentoml delete removes Bentos from the local store
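A sketch of the management commands above (the tag and file path are illustrative):

```shell
# Export to a standalone archive; S3/GCS URIs are also accepted
bentoml export summarizer:latest ./summarizer.bento

# Import the archive into another machine's local store
bentoml import ./summarizer.bento

# Sync with the BentoCloud registry
bentoml push summarizer:latest
bentoml pull summarizer:latest

# Remove a Bento from the local store
bentoml delete summarizer:latest
```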

Execution Diagram

GitHub URL

Workflow Repository