Workflow: BentoML Bento Build and Containerization
| Knowledge Sources | |
|---|---|
| Domains | ML_Serving, Containerization, DevOps |
| Last Updated | 2026-02-13 15:00 GMT |
Overview
End-to-end process for packaging a BentoML Service into a standardized Bento artifact and containerizing it as a Docker image for deployment.
Description
This workflow covers the packaging and containerization pipeline that transforms a BentoML Service into a portable, reproducible deployment artifact. The process begins with defining the runtime environment (Python version, dependencies, system packages) using the Image configuration, then uses bentoml build to package source code, dependencies, model references, and configuration into a versioned Bento archive. The Bento is then containerized into an OCI-compliant Docker image via bentoml containerize, ready for deployment in any Docker-compatible environment.
Key capabilities covered:
- Runtime environment definition using the Image class
- Bento build configuration via bentofile.yaml or the Image API
- Bento artifact creation with automatic versioning
- Docker image generation with multiple backend support (Docker, Buildx, Buildah, BuildKit)
- Bento management (list, export, import)
Usage
Execute this workflow when you have a working BentoML Service (tested locally via bentoml serve) and need to package it for deployment to production environments. This is required before deploying to Docker/Kubernetes environments. For BentoCloud deployment, bentoml deploy handles building automatically.
Execution Steps
Step 1: Define Runtime Environment
Configure the runtime environment that the Bento will use when deployed. This is done through the Image class in the @bentoml.service decorator, which specifies the Python version, system packages, Python dependencies, and custom commands. Alternatively, configure the environment through a bentofile.yaml file with docker and python sections.
Key considerations:
- The Image class supports method chaining: Image(python_version="3.11").python_packages("torch")
- Use .requirements_file("requirements.txt") to reference an existing requirements file
- System packages can be installed via .system_packages()
- Custom shell commands can be run via .run()
- Dependencies are automatically locked for reproducibility
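As a sketch, the runtime environment for a hypothetical summarization service could be declared as follows. The service name, package choices, and versions are illustrative, and the `bentoml.images.Image` module path assumes a recent BentoML 1.x release:

```python
import bentoml

# Runtime environment defined with method chaining; calls are applied in order.
# All package names and versions below are placeholders.
image = (
    bentoml.images.Image(python_version="3.11")
    .system_packages("curl", "git")              # OS-level packages
    .requirements_file("requirements.txt")       # reuse an existing requirements file
    .python_packages("torch", "transformers")    # additional Python dependencies
    .run("echo 'custom build-time command'")     # arbitrary shell command at build time
)

@bentoml.service(image=image)
class Summarization:
    @bentoml.api
    def summarize(self, text: str) -> str:
        return text[:100]  # placeholder logic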
Step 2: Configure Build Settings
Define which files to include in the Bento, along with labels, description, and environment variables. By default, all files in the working directory are included. Use a .bentoignore file (similar to .gitignore) to exclude files. Build settings can be specified in the @bentoml.service decorator or in a bentofile.yaml file.
Key considerations:
- The bentofile.yaml provides a declarative build configuration
- The .bentoignore file excludes unnecessary files from the Bento
- Environment variables can be specified with or without default values
- Labels help organize and identify Bentos
- Model references declared as class-level attributes are automatically included
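A minimal bentofile.yaml illustrating these settings might look like the following; the service path, labels, and values are placeholders:

```yaml
service: "service:Summarization"   # entry point in module:ServiceClass form
labels:
  owner: ml-team
  stage: dev
include:
  - "*.py"                         # package only Python sources
python:
  requirements_txt: "./requirements.txt"
envs:
  - name: LOG_LEVEL
    value: debug                   # default value; can be overridden at runtime
```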
Step 3: Build the Bento
Run bentoml build from the project directory to create the Bento artifact. This packages the service source code, dependency specifications, model references, and metadata into a versioned archive stored in the local Bento Store. Each Bento receives a unique auto-generated version tag.
Key considerations:
- The build process validates the service definition
- Each Bento is immutably versioned with a unique tag (name:version)
- The Bento Store is located at ~/bentoml/bentos/ by default
- Use bentoml list to see all built Bentos
- Use bentoml get <tag> to inspect a specific Bento
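In practice, the build-and-inspect loop looks roughly like this, where summarization stands in for whatever name your service produces:

```shell
# From the project directory containing the service definition
bentoml build

# List all Bentos in the local store (~/bentoml/bentos/ by default)
bentoml list

# Inspect a specific Bento; the version segment is auto-generated at build time
bentoml get summarization:latest
```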
Step 4: Containerize the Bento
Run bentoml containerize <bento_tag> to generate an OCI-compliant Docker image from the Bento. BentoML derives a Dockerfile from the Bento's configuration and builds the image using the specified backend (Docker by default). The resulting image contains all code, dependencies, and configuration needed to run the service.
Key considerations:
- Requires Docker (or an alternative OCI builder) to be installed and running
- The Docker image tag matches the Bento tag by default
- Use --platform flag for cross-platform builds (e.g., linux/amd64 on Apple Silicon)
- Multiple backend options: Docker, Buildx, Buildah, BuildKit, Podman, nerdctl
- Custom image tags can be specified with --image-tag
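Typical containerization commands are sketched below; the Bento tag, registry, and backend choice are illustrative:

```shell
# Build an image whose tag matches the Bento tag (requires a running Docker daemon)
bentoml containerize summarization:latest

# Cross-platform build, e.g. producing a linux/amd64 image on Apple Silicon
bentoml containerize summarization:latest --platform linux/amd64

# Use an alternative OCI builder and a custom image tag
bentoml containerize summarization:latest --backend podman --image-tag my-registry/summarization:v1
```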
Step 5: Run the Container
Launch the containerized service using docker run with appropriate port mapping and GPU configuration. The container runs the same production-grade server that bentoml serve provides, with all the same endpoints, health checks, and observability features.
Key considerations:
- Map port 3000 (default BentoML server port) to the host
- Use --gpus all for GPU-enabled services
- The container exposes the same REST API as local serving
- Environment variables can be passed via docker run -e
- For production, consider resource limits and health check configuration
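The points above can be combined into a docker run invocation like the following; the image tag, environment variable, and endpoint name are placeholders:

```shell
# Map the default BentoML server port 3000 to the host
docker run --rm -p 3000:3000 summarization:latest

# GPU-enabled service with an environment variable passed through
docker run --rm --gpus all -p 3000:3000 -e LOG_LEVEL=debug summarization:latest

# Smoke-test the running container (the endpoint path depends on your service's API)
curl -X POST http://localhost:3000/summarize \
  -H 'Content-Type: application/json' \
  -d '{"text": "Long input text to summarize..."}'
```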
Step 6: Manage Bento Artifacts
Use BentoML CLI commands or Python APIs to manage Bento artifacts in the local store. Bentos can be exported as standalone archives for sharing, imported from external sources, or pushed to/pulled from BentoCloud for team collaboration.
Key considerations:
- bentoml export creates a standalone .bento archive file
- bentoml import loads a .bento archive into the local store
- External storage (S3, GCS) is supported for export/import
- bentoml push/pull syncs with BentoCloud registry
- bentoml delete removes Bentos from the local store
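The management commands above can be sketched as follows; tags, file paths, and the S3 bucket are placeholders, and exporting to object storage may require additional filesystem plugins:

```shell
# Export a Bento as a standalone archive for sharing
bentoml export summarization:latest ./summarization.bento

# Import an archive into the local store on another machine
bentoml import ./summarization.bento

# Export directly to object storage
bentoml export summarization:latest s3://my-bucket/bentos/

# Sync with BentoCloud, or remove a local copy
bentoml push summarization:latest
bentoml delete summarization:latest
```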