Implementation:Bentoml BentoML ServiceConfig

Knowledge Sources	Bentoml_BentoML
Domains	Configuration, Service
Last Updated	2026-02-13 15:00 GMT

Overview

Defines the complete TypedDict-based configuration schema for BentoML services, covering traffic, resources, HTTP/gRPC, SSL, tracing, monitoring, metrics, and logging settings.

Description

The config module defines a typed configuration schema for BentoML services. Each section is a Python TypedDict whose field constraints are expressed as annotated-types metadata and enforced at runtime by Pydantic. The top-level ServiceConfig TypedDict is declared with total=False, so every key is optional, and it draws on sub-schemas including the following:

TrafficSchema -- Timeout, max concurrency, concurrency capacity, and external queue toggle for BentoCloud.
ResourceSchema -- CPU (cores or millicores), memory (Gi or string with unit), GPU count, GPU type (NVIDIA and AMD literals), and TPU type literals.
WorkerSchema -- Number of workers as a positive integer or "cpu_count" literal.
MetricSchema / MetricDuration -- Prometheus metrics configuration with custom histogram buckets.
AccessLoggingSchema -- Access log filtering, content type/length logging, and trace/span ID format.
SSLSchema -- TLS certificate, key, CA, and cipher configuration.
HTTPSchema / HTTPCorsSchema -- HTTP host, port, proxy port, CORS settings (origins, methods, headers, credentials).
GRPCSchema -- gRPC host, port, concurrent streams, reflection, channelz, and max message length.
TracingSchema -- OpenTelemetry tracing with Zipkin, Jaeger, and OTLP exporter configurations.
MonitoringSchema -- Monitoring type and options.
LoggingSchema -- Wraps access logging configuration.
EndpointsSchema -- Custom paths for liveness and readiness probes.
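
As an illustration of the "positive integer or literal" pattern described for WorkerSchema, the sketch below resolves such a setting to a concrete process count. The Workers alias and the resolve_workers helper are hypothetical names introduced here, not part of BentoML, and the fallback to 1 is an assumption of this sketch.

```python
import os
from typing import Literal, Union

# Illustrative alias only; the real WorkerSchema definition may differ.
Workers = Union[int, Literal["cpu_count"]]


def resolve_workers(workers: Workers) -> int:
    """Turn a workers setting into a concrete, positive process count."""
    if workers == "cpu_count":
        # os.cpu_count() may return None on unusual platforms; fall back to 1.
        return os.cpu_count() or 1
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(workers, bool) or not isinstance(workers, int) or workers < 1:
        raise ValueError(
            f"workers must be a positive integer or 'cpu_count', got {workers!r}"
        )
    return workers
```

Accepting a literal alongside an integer keeps one configuration portable across machines with different core counts, while the integer form allows an explicit cap.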

The module also exposes a validate function that uses Pydantic's TypeAdapter for runtime validation of configuration dictionaries.
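
To make the idea of constraint metadata on TypedDict fields concrete, here is a minimal standard-library sketch of reading Annotated bounds and checking them by hand. It is not how the module is implemented: Pydantic's TypeAdapter performs far more thorough validation. The Ge class below is a stand-in for annotated_types.Ge, and MiniConfig and check are hypothetical names covering only two fields.

```python
from dataclasses import dataclass
from typing import Annotated, Any, TypedDict, get_args, get_type_hints


@dataclass(frozen=True)
class Ge:
    """Stand-in for annotated_types.Ge: the value must be >= ge."""
    ge: int


class MiniConfig(TypedDict, total=False):
    # Hypothetical two-field subset; total=False makes both keys optional.
    backlog: Annotated[int, Ge(64)]
    threads: Annotated[int, Ge(1)]


def check(data: dict[str, Any]) -> dict[str, Any]:
    """Check the keys that are present against their type and Ge bound."""
    hints = get_type_hints(MiniConfig, include_extras=True)
    for key, hint in hints.items():
        if key not in data:
            continue  # optional key, nothing to check
        base, *constraints = get_args(hint)  # e.g. (int, Ge(64))
        value = data[key]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, base):
            raise ValueError(f"{key}: expected {base.__name__}, got {value!r}")
        for constraint in constraints:
            if isinstance(constraint, Ge) and value < constraint.ge:
                raise ValueError(f"{key}: {value!r} is below {constraint.ge}")
    return data
```

Under these assumptions, check({"backlog": 128}) returns the dictionary unchanged, while check({"backlog": 10}) raises ValueError because 10 is below the bound of 64.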

Usage

Use this module to define, validate, and type-check BentoML service configuration, typically passed to the @bentoml.service() decorator or loaded from configuration files.

Code Reference

Source Location

Repository: Bentoml_BentoML
File: src/_bentoml_sdk/service/config.py
Lines: 1-285

Signature

class ServiceConfig(TypedDict, total=False):
    traffic: TrafficSchema
    extra_ports: list[int]
    backlog: Annotated[int, Ge(64)]
    max_runner_connections: Posint
    runner_connection: RunnerConnectionSchema
    resources: ResourceSchema
    workers: WorkerSchema
    replicate_process: bool
    threads: Posint
    metrics: MetricSchema
    logging: LoggingSchema
    ssl: SSLSchema
    http: HTTPSchema
    grpc: GRPCSchema
    runner_probe: RunnerProbeSchema
    tracing: TracingSchema
    monitoring: MonitoringSchema
    endpoints: EndpointsSchema

schema_type = TypeAdapter(ServiceConfig)

def validate(data: ServiceConfig) -> ServiceConfig: ...

Import

from _bentoml_sdk.service.config import ServiceConfig, validate

I/O Contract

Inputs

Name	Type	Required	Description
data	ServiceConfig	Yes	A dictionary conforming to the ServiceConfig TypedDict schema

Outputs

Name	Type	Description
(return value)	ServiceConfig	The validated configuration dictionary returned by validate

Usage Examples

from _bentoml_sdk.service.config import ServiceConfig, validate

# Define a service configuration
config: ServiceConfig = {
    "traffic": {
        "timeout": 60.0,
        "max_concurrency": 100,
    },
    "resources": {
        "cpu": 4,
        "memory": "8Gi",
        "gpu": 1,
        "gpu_type": "nvidia-tesla-a100",
    },
    "workers": 2,
    "http": {
        "port": 3000,
        "cors": {
            "enabled": True,
            "access_control_allow_origins": ["http://localhost:3000"],
        },
    },
}

# Validate configuration at runtime
validated = validate(config)
