Principle:Bentoml BentoML Service Dependency Injection

Overview

Service Dependency Injection is the mechanism by which multiple BentoML services are wired together without hard-coding their composition. Services declare their dependencies as class attributes, and the framework resolves them at runtime -- either as in-process proxies (local mode) or as inter-process RPC clients (distributed mode).

Detailed Explanation

Dependency injection (DI) is a well-established software design pattern that decouples the use of a dependency from its construction. In BentoML's multi-model composition, DI rests on three core concepts and follows a four-stage lifecycle, both described in the sections that follow.

Core Concepts

Declaration over Construction: Services declare what they depend on, not how to create or connect to it. The framework handles instantiation, process management, and communication.

Runtime Resolution: Dependencies are resolved at runtime based on the deployment context. In local development, a dependency may be an in-process method call. In production, the same declaration transparently becomes an RPC call to a separate process or remote service.

Transparent Proxying: The dependency descriptor returns a proxy object that mirrors the dependent service's API. Callers interact with the proxy exactly as they would with the real service, regardless of whether the service is in-process or remote.

How It Works

The dependency injection lifecycle follows these stages:

Stage	Description
Declaration	A service class attribute is assigned via `bentoml.depends(ServiceClass)`. This creates a `Dependency[T]` descriptor.
Discovery	When the parent service is loaded, the framework inspects its class attributes to find all `Dependency` descriptors. These are collected into the service's `dependencies` dict.
Resolution	At serve time, the framework decides how to resolve each dependency: in-process (development), inter-process (local distributed), or remote (cloud deployment).
Proxying	The descriptor's `__get__` method returns a proxy. When the proxy's methods are called, they delegate to the resolved service instance.
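The four stages can be illustrated with a small, framework-free sketch built on Python's descriptor protocol. This is a simplified model, not BentoML's source code: the `Dependency` class, the `depends` and `discover` helpers, and the `Tokenizer` and `Pipeline` services are all hypothetical stand-ins, and resolution here always creates an in-process instance.

```python
from typing import Any, Dict


class Dependency:
    """Toy stand-in for a dependency descriptor (not BentoML's real class)."""

    def __init__(self, on: type) -> None:
        self.on = on                # the service class being depended on
        self._resolved: Any = None

    def resolve(self) -> None:
        # Resolution: this sketch always builds an in-process instance.
        if self._resolved is None:
            self._resolved = self.on()

    def __get__(self, instance: Any, owner: type) -> Any:
        if instance is None:        # accessed on the class: return the descriptor
            return self
        self.resolve()
        return self._resolved       # Proxying: hand back the resolved target


def depends(on: type) -> Dependency:
    """Declaration helper, named after bentoml.depends for readability only."""
    return Dependency(on)


def discover(service_cls: type) -> Dict[str, Dependency]:
    """Discovery: collect every Dependency found among the class attributes."""
    return {
        name: attr
        for name, attr in vars(service_cls).items()
        if isinstance(attr, Dependency)
    }


class Tokenizer:
    def split(self, text: str) -> list:
        return text.split()


class Pipeline:
    tokenizer = depends(Tokenizer)  # Declaration

    def count(self, text: str) -> int:
        # The attribute access below triggers __get__ on the descriptor.
        return len(self.tokenizer.split(text))


dependencies = discover(Pipeline)
word_count = Pipeline().count("a b c")
```

In this sketch the resolved target is returned directly. A real framework would typically return a proxy object at this point so that the transport can change without touching `Pipeline`.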

Benefits for Multi-Model Systems

Testability: Services can be tested in isolation by mocking their dependencies. Because a dependency is an ordinary declared attribute, a real model service can be swapped for a mock without changing the calling code.
Deployment Flexibility: The same composition code works across local development, staging, and production. Only the resolution strategy changes.
Independent Deployment: Each service in the dependency graph can be deployed and scaled independently. The DI framework handles service discovery and routing.
Type Safety: The `Dependency[T]` generic type preserves the type signature of the dependent service, enabling IDE autocompletion and static type checking.
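The testability point can be sketched with the standard library's `unittest.mock`. The `Embedder` and `Search` classes are hypothetical, and the plain class attribute stands in for a declared dependency. Whether a real BentoML service can be patched in exactly this way is an assumption and should be checked against the framework's testing documentation.

```python
from unittest.mock import Mock, patch


class Embedder:
    """Hypothetical model service; imagine it loads a large model."""

    def embed(self, text: str) -> list:
        raise RuntimeError("real model not available in unit tests")


class Search:
    """Hypothetical parent service with a class-level dependency."""

    embedder = Embedder()  # stands in for a declared dependency

    def dimensions(self, text: str) -> int:
        return len(self.embedder.embed(text))


# Build a mock that exposes the same methods as Embedder.
fake = Mock(spec=Embedder)
fake.embed.return_value = [0.1, 0.2, 0.3]

# Swap the dependency for the duration of the test only.
with patch.object(Search, "embedder", fake):
    result = Search().dimensions("hello")
```

`Search.dimensions` is exercised without ever running the real model, and the mock records the call so the test can assert on how the dependency was used.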

Local vs. Distributed Resolution

Mode	Resolution Strategy	Communication
Local (dev)	In-process proxy	Direct method call
Local distributed	Inter-process proxy	Unix domain sockets / TCP
Cloud distributed	Remote RPC client	HTTP/gRPC over network

In all cases, the calling code in the parent service remains identical. The framework transparently handles serialization, transport, and deserialization.
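A minimal sketch of why the calling code can stay identical: two interchangeable proxies expose the same methods, and only the one chosen at resolution time differs. Everything here is hypothetical (`Scorer`, `InProcessProxy`, `SerializingProxy`, `resolve`), and the "remote" path only simulates transport with a JSON round trip rather than opening a socket.

```python
import json
from typing import Any, Callable


class Scorer:
    """Hypothetical dependent service."""

    def score(self, text: str) -> dict:
        return {"text": text, "length": len(text)}


class InProcessProxy:
    """Forwards attribute access straight to a local instance."""

    def __init__(self, target: Any) -> None:
        self._target = target

    def __getattr__(self, name: str) -> Any:
        return getattr(self._target, name)


class SerializingProxy:
    """Simulates a remote call by passing arguments and results through JSON."""

    def __init__(self, target: Any) -> None:
        self._target = target  # stands in for a service behind a socket

    def __getattr__(self, name: str) -> Callable[..., Any]:
        def call(*args: Any) -> Any:
            request = json.dumps({"method": name, "args": args})  # serialize
            decoded = json.loads(request)                         # "transport"
            reply = getattr(self._target, decoded["method"])(*decoded["args"])
            return json.loads(json.dumps(reply))                  # deserialize
        return call


def resolve(target: Any, mode: str) -> Any:
    """Pick a proxy for the given mode ('local' or 'distributed')."""
    return InProcessProxy(target) if mode == "local" else SerializingProxy(target)


def caller(scorer: Any) -> int:
    # Identical calling code, whichever proxy was injected.
    return scorer.score("bento")["length"]


local_result = caller(resolve(Scorer(), "local"))
remote_result = caller(resolve(Scorer(), "distributed"))
```

One consequence the sketch makes visible: once a call crosses a process boundary, arguments and return values must be serializable, so the same code that works in-process can still fail in distributed mode if it passes objects the transport cannot encode.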

Relationship to Implementation

This principle is implemented by the `bentoml.depends()` function, which returns a `Dependency[T]` descriptor that resolves at runtime.

Implementation:Bentoml_BentoML_Depends_Function

Metadata

Knowledge Sources

2026-02-13 15:00 GMT
