Principle:Bentoml BentoML Service Dependency Injection
Overview
Service Dependency Injection is the mechanism by which multiple BentoML services are wired together without hard-coding their composition. Services declare their dependencies as class attributes, and the framework resolves them at runtime -- either as in-process proxies (local mode) or as inter-process RPC clients (distributed mode).
Detailed Explanation
Dependency injection (DI) is a well-established software design pattern that decouples the use of a dependency from its construction. In BentoML's multi-model composition, DI rests on the core concepts below and brings the benefits listed later in this section.
Core Concepts
- Declaration over Construction: Services declare what they depend on, not how to create or connect to it. The framework handles instantiation, process management, and communication.
- Runtime Resolution: Dependencies are resolved at runtime based on the deployment context. In local development, a dependency may be an in-process method call. In production, the same declaration transparently becomes an RPC call to a separate process or remote service.
- Transparent Proxying: The dependency descriptor returns a proxy object that mirrors the dependent service's API. Callers interact with the proxy exactly as they would with the real service, regardless of whether the service is in-process or remote.
How It Works
The dependency injection lifecycle follows these stages:
| Stage | Description |
|---|---|
| Declaration | A service class attribute is assigned via bentoml.depends(ServiceClass). This creates a Dependency[T] descriptor. |
| Discovery | When the parent service is loaded, the framework inspects its class attributes to find all Dependency descriptors. These are collected into the service's dependencies dict. |
| Resolution | At serve time, the framework decides how to resolve each dependency: in-process (development), inter-process (local distributed), or remote (cloud deployment). |
| Proxying | The descriptor's __get__ method returns a proxy. When the proxy's methods are called, they delegate to the resolved service instance. |
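The four stages can be illustrated with a small pure-Python model built on the descriptor protocol. This is a sketch of the general pattern, not BentoML's implementation: ToyDependency, toy_depends, discover, Embedder, and Pipeline are hypothetical names invented for the example, and resolution here always happens in-process.

```python
from typing import Any, Dict, Generic, List, Optional, Type, TypeVar

T = TypeVar("T")


class ToyDependency(Generic[T]):
    """Hypothetical stand-in for a Dependency[T] descriptor (not BentoML code)."""

    def __init__(self, target: Type[T]) -> None:
        self.target = target
        self._resolved: Optional[T] = None

    def __get__(self, instance: Any, owner: Optional[type] = None) -> Any:
        if instance is None:
            # Class-level access returns the descriptor itself.
            return self
        if self._resolved is None:
            # Resolution: this sketch only knows the in-process strategy.
            self._resolved = self.target()
        # Proxying: here the "proxy" is simply the resolved instance.
        return self._resolved


def toy_depends(target: Type[T]) -> ToyDependency[T]:
    """Declaration helper, loosely mirroring the role of bentoml.depends()."""
    return ToyDependency(target)


def discover(service_cls: type) -> Dict[str, ToyDependency]:
    """Discovery: collect every descriptor declared on the class."""
    return {
        name: value
        for name, value in vars(service_cls).items()
        if isinstance(value, ToyDependency)
    }


class Embedder:
    def embed(self, text: str) -> List[int]:
        return [len(word) for word in text.split()]


class Pipeline:
    # Declaration: the dependency is named, not constructed.
    embedder = toy_depends(Embedder)

    def total_length(self, text: str) -> int:
        return sum(self.embedder.embed(text))


deps = discover(Pipeline)
result = Pipeline().total_length("hello big world")
print(list(deps), result)
```

Note that Pipeline never calls Embedder() itself. Construction is deferred to the descriptor, which is what lets a framework swap in a different resolution strategy later.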
Benefits for Multi-Model Systems
- Testability: Services can be tested in isolation by mocking their dependencies. Because a dependency is only a declared class attribute, a test can substitute a mock for the real model service without changing the service's code.
- Deployment Flexibility: The same composition code works across local development, staging, and production. Only the resolution strategy changes.
- Independent Deployment: Each service in the dependency graph can be deployed and scaled independently. The DI framework handles service discovery and routing.
- Type Safety: The Dependency[T] generic type preserves the type signature of the dependent service, enabling IDE autocompletion and static type checking.
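As an illustration of the testability point, the sketch below swaps a declared dependency for a mock using the standard library's unittest.mock. LazyDependency, SentimentModel, and ReviewService are hypothetical names, and the descriptor is a toy stand-in. Whether patching the class attribute is the most convenient approach in a real BentoML project is an assumption that should be checked against the BentoML testing documentation.

```python
from unittest import mock


class SentimentModel:
    """Hypothetical dependency; imagine an expensive model call here."""

    def predict(self, text: str) -> str:
        raise RuntimeError("real model is not available in unit tests")


class LazyDependency:
    """Toy descriptor standing in for a dependency declaration."""

    def __init__(self, target):
        self.target = target

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return self.target()


class ReviewService:
    model = LazyDependency(SentimentModel)

    def summarize(self, text: str) -> str:
        return f"{text!r} is {self.model.predict(text)}"


# Build a mock that exposes the same method names as the real dependency.
fake_model = mock.Mock(spec=SentimentModel)
fake_model.predict.return_value = "positive"

# Temporarily replace the declared attribute; it is restored on exit.
with mock.patch.object(ReviewService, "model", fake_model):
    summary = ReviewService().summarize("great")

print(summary)
```

ReviewService itself is unchanged by the test. Only the attribute that declares the dependency is replaced for the duration of the with block.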
Local vs. Distributed Resolution
| Mode | Resolution Strategy | Communication |
|---|---|---|
| Local (dev) | In-process proxy | Direct method call |
| Local distributed | Inter-process proxy | Unix domain sockets / TCP |
| Cloud distributed | Remote RPC client | HTTP/gRPC over network |
In all cases, the calling code in the parent service remains identical. The framework transparently handles serialization, transport, and deserialization.
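The idea that the calling code stays identical across modes can be sketched with two interchangeable proxies. This is a toy model under stated assumptions: InProcessProxy, SerializingProxy, make_loopback_server, and resolve are hypothetical names, JSON stands in for whatever wire format the framework actually uses, and a plain function call stands in for the socket or network transport.

```python
import json
from typing import Any, Callable


class Tokenizer:
    def count(self, text: str) -> int:
        return len(text.split())


class InProcessProxy:
    """Delegates attribute access straight to a local instance."""

    def __init__(self, service: Any) -> None:
        self._service = service

    def __getattr__(self, name: str) -> Callable[..., Any]:
        return getattr(self._service, name)


class SerializingProxy:
    """Forwards calls as JSON through a transport callable."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport

    def __getattr__(self, name: str) -> Callable[..., Any]:
        def call(*args: Any) -> Any:
            # Serialize the call, send it, then deserialize the reply.
            request = json.dumps({"method": name, "args": list(args)}).encode()
            return json.loads(self._transport(request))

        return call


def make_loopback_server(service: Any) -> Callable[[bytes], bytes]:
    """Stand-in for a separate process: decodes a request and runs it."""

    def handle(request: bytes) -> bytes:
        message = json.loads(request)
        result = getattr(service, message["method"])(*message["args"])
        return json.dumps(result).encode()

    return handle


def resolve(mode: str) -> Any:
    """Pick a proxy for the deployment mode (hypothetical mode names)."""
    service = Tokenizer()
    if mode == "local":
        return InProcessProxy(service)
    return SerializingProxy(make_loopback_server(service))


def parent_logic(tokenizer: Any) -> int:
    # The caller is unaware of which proxy it received.
    return tokenizer.count("one two three") * 2


local_result = parent_logic(resolve("local"))
distributed_result = parent_logic(resolve("distributed"))
print(local_result, distributed_result)
```

The function parent_logic is the part that corresponds to the parent service: it runs unchanged against either proxy, and only resolve knows which strategy was chosen.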
Relationship to Implementation
This principle is implemented by the bentoml.depends() function, which returns a Dependency[T] descriptor that resolves at runtime.
Implementation:Bentoml_BentoML_Depends_Function