Implementation:Bentoml BentoML ServiceConfig

From Leeroopedia
Knowledge Sources
Domains Configuration, Service
Last Updated 2026-02-13 15:00 GMT

Overview

Defines the complete TypedDict-based configuration schema for BentoML services, covering traffic, resources, HTTP/gRPC, SSL, tracing, monitoring, metrics, and logging settings.

Description

The config module defines a typed configuration schema for BentoML services, built on Python's TypedDict and validated with Pydantic and annotated-types constraints. The top-level ServiceConfig TypedDict aggregates the following sub-schemas:

  • TrafficSchema -- Timeout, max concurrency, concurrency capacity, and external queue toggle for BentoCloud.
  • ResourceSchema -- CPU (cores or millicores), memory (Gi or string with unit), GPU count, GPU type (NVIDIA and AMD literals), and TPU type literals.
  • WorkerSchema -- Number of workers as a positive integer or "cpu_count" literal.
  • MetricSchema / MetricDuration -- Prometheus metrics configuration with custom histogram buckets.
  • AccessLoggingSchema -- Access log filtering, content type/length logging, and trace/span ID format.
  • SSLSchema -- TLS certificate, key, CA, and cipher configuration.
  • HTTPSchema / HTTPCorsSchema -- HTTP host, port, proxy port, CORS settings (origins, methods, headers, credentials).
  • GRPCSchema -- gRPC host, port, concurrent streams, reflection, channelz, and max message length.
  • TracingSchema -- OpenTelemetry tracing with Zipkin, Jaeger, and OTLP exporter configurations.
  • MonitoringSchema -- Monitoring type and options.
  • LoggingSchema -- Wraps access logging configuration.
  • EndpointsSchema -- Custom paths for liveness and readiness probes.
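As a rough illustration of how a field such as WorkerSchema can accept either a positive integer or the "cpu_count" literal, the sketch below builds a stand-in type with Pydantic and annotated-types. PosintSketch, WorkerSketch, and is_valid_workers are hypothetical names defined here for illustration; they are not copied from the BentoML source.

```python
from typing import Annotated, Literal, Union

from annotated_types import Gt
from pydantic import TypeAdapter, ValidationError

# Hypothetical stand-ins, not the actual BentoML definitions.
PosintSketch = Annotated[int, Gt(0)]
WorkerSketch = Union[PosintSketch, Literal["cpu_count"]]

worker_adapter = TypeAdapter(WorkerSketch)


def is_valid_workers(value: object) -> bool:
    """Return True if value passes validation against the stand-in worker type."""
    try:
        worker_adapter.validate_python(value)
    except ValidationError:
        return False
    return True


# A positive integer and the literal string are both accepted.
accepted = [is_valid_workers(2), is_valid_workers("cpu_count")]
# Zero fails the Gt(0) constraint; an arbitrary string matches neither branch.
rejected = [is_valid_workers(0), is_valid_workers("auto")]
```

Expressing the alternative as a union keeps the constraint in the type itself, so static type checkers and the runtime validator read the same definition.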

The module also exposes a validate function that uses Pydantic's TypeAdapter for runtime validation of configuration dictionaries.
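The pattern of pairing a TypedDict with a TypeAdapter can be sketched on a much smaller schema. ConfigSketch, TrafficSketch, sketch_adapter, and validate_sketch below are hypothetical stand-ins that only mirror the shape of the real module, and the field constraints are assumptions chosen for the example.

```python
from typing import Annotated

from annotated_types import Ge, Gt
from pydantic import TypeAdapter, ValidationError
from typing_extensions import TypedDict  # Pydantic needs this variant before Python 3.12


# Hypothetical miniature schemas; total=False makes every key optional.
class TrafficSketch(TypedDict, total=False):
    timeout: Annotated[float, Gt(0)]
    max_concurrency: Annotated[int, Gt(0)]


class ConfigSketch(TypedDict, total=False):
    traffic: TrafficSketch
    backlog: Annotated[int, Ge(64)]


sketch_adapter = TypeAdapter(ConfigSketch)


def validate_sketch(data: dict) -> ConfigSketch:
    """Validate a plain dictionary against the stand-in schema."""
    return sketch_adapter.validate_python(data)


# A conforming dictionary passes through validation.
good = validate_sketch({"traffic": {"timeout": 60.0}, "backlog": 128})

# A value below the Ge(64) bound raises ValidationError; collect where it failed.
try:
    validate_sketch({"backlog": 8})
    error_locations = []
except ValidationError as exc:
    error_locations = [err["loc"] for err in exc.errors()]
```

Each entry returned by ValidationError.errors() carries a loc tuple naming the offending key path, which makes it practical to report configuration mistakes back to the user.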

Usage

Use this module to define, validate, and type-check BentoML service configuration, typically passed to the @bentoml.service() decorator or loaded from configuration files.

Code Reference

Source Location

Signature

class ServiceConfig(TypedDict, total=False):
    traffic: TrafficSchema
    extra_ports: list[int]
    backlog: Annotated[int, Ge(64)]
    max_runner_connections: Posint
    runner_connection: RunnerConnectionSchema
    resources: ResourceSchema
    workers: WorkerSchema
    replicate_process: bool
    threads: Posint
    metrics: MetricSchema
    logging: LoggingSchema
    ssl: SSLSchema
    http: HTTPSchema
    grpc: GRPCSchema
    runner_probe: RunnerProbeSchema
    tracing: TracingSchema
    monitoring: MonitoringSchema
    endpoints: EndpointsSchema

schema_type = TypeAdapter(ServiceConfig)

def validate(data: ServiceConfig) -> ServiceConfig: ...

Import

from _bentoml_sdk.service.config import ServiceConfig, validate

I/O Contract

Inputs

Name | Type | Required | Description
data | ServiceConfig | Yes | A dictionary conforming to the ServiceConfig TypedDict schema

Outputs

Name | Type | Description
ServiceConfig | ServiceConfig | The validated configuration dictionary (from validate function)
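With Pydantic-based validation in general, the returned dictionary is not always identical to the input: values may be coerced to the declared type, and with total=False an empty dictionary is acceptable. The sketch below shows this on a hypothetical one-field schema (TrafficOnlySketch); whether ServiceConfig behaves the same way for every field is an assumption, not something the page states.

```python
from typing import Annotated

from annotated_types import Gt
from pydantic import TypeAdapter
from typing_extensions import TypedDict  # Pydantic needs this variant before Python 3.12


# Hypothetical one-field schema used only to show general Pydantic behaviour.
class TrafficOnlySketch(TypedDict, total=False):
    timeout: Annotated[float, Gt(0)]


traffic_adapter = TypeAdapter(TrafficOnlySketch)

# Every key is optional under total=False, so an empty dict validates.
empty = traffic_adapter.validate_python({})

# An integer supplied for a float field comes back as a float.
coerced = traffic_adapter.validate_python({"timeout": 60})
```

Because of this, it is usually safer to use the value returned by validation downstream, rather than the dictionary that was passed in.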

Usage Examples

from _bentoml_sdk.service.config import ServiceConfig, validate

# Define a service configuration
config: ServiceConfig = {
    "traffic": {
        "timeout": 60.0,
        "max_concurrency": 100,
    },
    "resources": {
        "cpu": 4,
        "memory": "8Gi",
        "gpu": 1,
        "gpu_type": "nvidia-tesla-a100",
    },
    "workers": 2,
    "http": {
        "port": 3000,
        "cors": {
            "enabled": True,
            "access_control_allow_origins": ["http://localhost:3000"],
        },
    },
}

# Validate configuration at runtime
validated = validate(config)
