
Implementation:Bentoml BentoML Deployment Create


Overview

Deployment Create implements the Principle:Bentoml_BentoML_Cloud_Deployment_Creation principle via the bentoml.deployment.create() function, which deploys a built Bento artifact to BentoCloud as a running inference endpoint.

API

bentoml.deployment.create()

Source

src/bentoml/deployment.py:L68-130

Import

import bentoml

Signature

def create(
    name: str | None = None,
    path_context: str | None = None,
    *,
    bento: Tag | str | None = None,
    cluster: str | None = None,
    access_authorization: bool | None = None,
    scaling_min: int | None = None,
    scaling_max: int | None = None,
    instance_type: str | None = None,
    strategy: str | None = None,
    envs: list | None = None,
    labels: list | None = None,
    secrets: list[str] | None = None,
    extras: dict | None = None,
    config_dict: dict | None = None,
    config_file: str | None = None,
    args: dict | None = None,
) -> Deployment

Key Parameters

Parameter | Type | Default | Description
name | str | None | Unique deployment name (auto-generated if not provided)
bento | str | None | Bento tag to deploy (e.g., "my_service:latest")
cluster | str | None | Target cluster for deployment
scaling_min | int | None | Minimum replicas (0 for scale-to-zero)
scaling_max | int | None | Maximum replicas for auto-scaling
instance_type | str | None | Compute instance type (e.g., "gpu.a10.1")
config_file | str | None | Path to YAML config file
envs | list | None | Environment variables
secrets | list[str] | None | Named secrets from BentoCloud
args | dict | None | Additional arguments passed to the deployment
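The envs and labels parameters take lists of small dicts rather than flat mappings. A minimal sketch of the expected entry shapes; the make_env and make_label helpers are hypothetical conveniences, not part of the BentoML API:

```python
# Hypothetical helpers illustrating the dict shapes passed to envs/labels.
# Only the {"name": ..., "value": ...} and {"key": ..., "value": ...} shapes
# come from the usage examples on this page.

def make_env(name: str, value: str) -> dict:
    """Build one environment-variable entry for the `envs` parameter."""
    return {"name": name, "value": value}

def make_label(key: str, value: str) -> dict:
    """Build one label entry for the `labels` parameter."""
    return {"key": key, "value": value}

envs = [make_env("MODEL_ID", "meta-llama/Llama-3-8B")]
labels = [make_label("team", "ml-platform")]
```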

Inputs and Outputs

Inputs:

  • Built Bento tag (local or already pushed to BentoCloud)
  • Deployment configuration parameters (inline or via config file)

Outputs:

  • Deployment object with the following key attributes:
    • name - Unique deployment identifier
    • admin_console - URL to the BentoCloud dashboard
    • cluster - Target cluster name
    • status - Current deployment status

Usage Examples

Minimal Deployment

import bentoml

# Deploy with minimal configuration
deployment = bentoml.deployment.create(
    bento="my_service:latest",
)
print(f"Deployed: {deployment.name}")
print(f"Console: {deployment.admin_console}")

Full Configuration

import bentoml

deployment = bentoml.deployment.create(
    name="my-llm-service",
    bento="llm_service:v2",
    cluster="gcp-us-central1",
    access_authorization=True,
    scaling_min=1,
    scaling_max=5,
    instance_type="gpu.a10.1",
    strategy="RollingUpdate",
    envs=[{"name": "MODEL_ID", "value": "meta-llama/Llama-3-8B"}],
    secrets=["hf-token"],
    labels=[{"key": "team", "value": "ml-platform"}],
)

Using Config File

import bentoml

deployment = bentoml.deployment.create(
    config_file="deployment.yaml",
)
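An illustrative deployment.yaml for the call above. The top-level keys mirror the Python parameters; the nested scaling block and exact key names are assumptions, so consult the BentoCloud documentation for the authoritative schema:

```yaml
# Illustrative config file; key names assumed to mirror the Python parameters.
name: my-llm-service
bento: llm_service:v2
cluster: gcp-us-central1
access_authorization: true
scaling:
  min_replicas: 1
  max_replicas: 5
envs:
  - name: MODEL_ID
    value: meta-llama/Llama-3-8B
```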

CLI Usage

# Deploy from CLI
bentoml deploy my_service:latest --name my-deployment --scaling-min 1 --scaling-max 5

# Deploy using config file
bentoml deploy --config deployment.yaml

Metadata

Property | Value
Implementation | Deployment Create
API | bentoml.deployment.create()
Source | src/bentoml/deployment.py:L68-130
Domain | ML_Serving, Cloud_Deployment
Workflow | BentoCloud_Deployment
Principle | Principle:Bentoml_BentoML_Cloud_Deployment_Creation

Knowledge Sources

2026-02-13 15:00 GMT
