Implementation:Tensorflow Serving caching manager cc
| Knowledge Sources | |
|---|---|
| Domains | Model Serving, Core Framework |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
caching_manager.cc contains the implementation of the CachingManager's on-demand loading logic, per-servable concurrency control, and the PathPrefixLoaderFactory.
Description
This file implements the core behavior of CachingManager:
Create(): Translates CachingManager::Options into BasicManager::Options, creates a BasicManager, and wraps it along with the LoaderFactory into a new CachingManager instance.
GetUntypedServableHandle(): The request entry point. If the request names a specific version, it calls GetUntypedServableHandleForId() directly. If no version is specified, it asks LoaderFactory::GetServableVersion() for the version dictated by the request's auto-version policy, then calls GetUntypedServableHandleForId() with that version.
GetUntypedServableHandleForId(): First attempts to retrieve the handle from the underlying BasicManager. If the servable is not found (NOT_FOUND error), it creates a loader via the factory and calls LoadServable() to manage and load it, then retrieves the handle again.
LoadServable(): The core concurrency-safe loading method. It uses a per-servable mutex map (load_mutex_map_) to ensure only one thread loads a given servable at a time:
- Acquires the global load_mutex_map_mu_ to find or create a per-servable mutex.
- Acquires the per-servable mutex.
- Checks if the servable is already managed (via GetManagedServableStateSnapshot()).
- If not managed, calls ManageServable() followed by a synchronous LoadServable() on the BasicManager.
- Cleans up the mutex map entry if it is the last reference.
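The locking steps above can be sketched with standard-library primitives. This is a minimal illustration under stated assumptions, not the code in caching_manager.cc: OnDemandLoader, LoadOnce, and MutexMapSize are hypothetical names, a std::set stands in for the BasicManager's managed-servable state, and the load callback stands in for ManageServable() plus the synchronous LoadServable().

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <mutex>
#include <set>
#include <string>

// Hypothetical stand-in for the per-servable locking pattern described above.
class OnDemandLoader {
 public:
  // Runs `load` at most once per key, even with concurrent callers.
  // Returns true if this call performed the load, false if the key was
  // already managed.
  bool LoadOnce(const std::string& key, const std::function<void()>& load) {
    std::shared_ptr<std::mutex> key_mu;
    {
      // The global lock only guards the map lookup or insert.
      std::lock_guard<std::mutex> l(map_mu_);
      std::shared_ptr<std::mutex>& slot = mutex_map_[key];
      if (!slot) slot = std::make_shared<std::mutex>();
      key_mu = slot;
    }
    bool loaded_here = false;
    {
      // The per-key lock serializes loads of this key only.
      std::lock_guard<std::mutex> l(*key_mu);
      if (!IsManaged(key)) {
        load();
        MarkManaged(key);
        loaded_here = true;
      }
    }
    // Drop our reference, then erase the entry if the map holds the last one.
    key_mu.reset();
    {
      std::lock_guard<std::mutex> l(map_mu_);
      auto it = mutex_map_.find(key);
      if (it != mutex_map_.end() && it->second.use_count() == 1) {
        mutex_map_.erase(it);
      }
    }
    return loaded_here;
  }

  std::size_t MutexMapSize() {
    std::lock_guard<std::mutex> l(map_mu_);
    return mutex_map_.size();
  }

 private:
  bool IsManaged(const std::string& key) {
    std::lock_guard<std::mutex> l(state_mu_);
    return managed_.count(key) > 0;
  }
  void MarkManaged(const std::string& key) {
    std::lock_guard<std::mutex> l(state_mu_);
    managed_.insert(key);
  }

  std::mutex map_mu_;  // guards mutex_map_
  std::map<std::string, std::shared_ptr<std::mutex>> mutex_map_;
  std::mutex state_mu_;  // guards managed_
  std::set<std::string> managed_;
};
```

Holding the per-key mutex through a std::shared_ptr lets the cleanup step detect the last reference: new copies are only handed out while the map lock is held, so a use count of 1 under that lock means no other thread is using or waiting on the entry.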
PathPrefixLoaderFactory: Implements CreateLoader() by joining the path prefix with the servable name. Only supports version 0; other versions produce a FailedPrecondition error. GetServableVersion() always returns 0.
Usage
This file is the implementation companion to caching_manager.h. It is compiled as part of the TensorFlow Serving core library. No direct inclusion is needed; include the header instead.
Code Reference
Source Location
- Repository: Tensorflow_Serving
- File: tensorflow_serving/core/caching_manager.cc
- Lines: 1-229
Signature
// Key methods implemented:
absl::Status CachingManager::Create(
    Options options,
    std::unique_ptr<LoaderFactory> loader_factory,
    std::unique_ptr<CachingManager>* caching_manager);

absl::Status CachingManager::GetUntypedServableHandle(
    const ServableRequest& request,
    std::unique_ptr<UntypedServableHandle>* const handle);

absl::Status CachingManager::LoadServable(
    ServableData<std::unique_ptr<Loader>> loader_data);

ServableData<std::unique_ptr<Loader>> PathPrefixLoaderFactory::CreateLoader(
    const ServableId& id);
Import
#include "tensorflow_serving/core/caching_manager.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| options | CachingManager::Options | Yes | Configuration for the underlying BasicManager |
| loader_factory | std::unique_ptr<LoaderFactory> | Yes | Factory for on-demand loader creation |
| request | ServableRequest | Yes | Request with servable name and optional version |
| loader_data | ServableData<std::unique_ptr<Loader>> | Yes (internal) | Loader data to transfer to BasicManager for loading |
Outputs
| Name | Type | Description |
|---|---|---|
| Create() | absl::Status | OK with constructed CachingManager; error on BasicManager creation failure |
| GetUntypedServableHandle() | absl::Status | OK with handle; a NOT_FOUND from the BasicManager triggers an on-demand load; load errors are propagated |
| LoadServable() | absl::Status | OK if loaded or already loaded; Internal error on manage/load failure |
| GetAvailableUntypedServableHandles() | std::map<ServableId, std::unique_ptr<UntypedServableHandle>> | Delegates to BasicManager |
| ListAvailableServableIds() | std::vector<ServableId> | Delegates to BasicManager |
Usage Examples
On-Demand Loading Flow
// Internal flow when a request arrives for an unloaded servable:
//
// 1. GetUntypedServableHandle(request) is called
// 2. BasicManager returns NOT_FOUND
// 3. LoaderFactory::CreateLoader(servable_id) produces loader_data
// 4. LoadServable(loader_data) is called:
//    a. Per-servable mutex acquired
//    b. BasicManager::ManageServable(loader_data)
//    c. BasicManager::LoadServable(servable_id, callback)
//    d. Wait for load_done notification
// 5. BasicManager::GetUntypedServableHandle() succeeds
// 6. Handle returned to caller
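The same flow can be shown as a small, self-contained sketch. Everything here is hypothetical and only mirrors the shape of the steps: FakeManager, Code, and GetOrLoad are not TensorFlow Serving types, an int stands in for the servable handle, and the factory callback stands in for CreateLoader() plus LoadServable().

```cpp
#include <functional>
#include <map>
#include <string>

// Hypothetical status codes for the sketch.
enum class Code { kOk, kNotFound, kInternal };

// Hypothetical stand-in for the underlying manager: name -> handle value.
struct FakeManager {
  std::map<std::string, int> loaded;

  Code GetHandle(const std::string& name, int* handle) const {
    auto it = loaded.find(name);
    if (it == loaded.end()) return Code::kNotFound;
    *handle = it->second;
    return Code::kOk;
  }
};

// `factory` writes the value to load and returns false on failure.
using Factory = std::function<bool(const std::string&, int*)>;

Code GetOrLoad(FakeManager& manager, const std::string& name,
               const Factory& factory, int* handle) {
  // Steps 1-2: try the manager first; only NOT_FOUND triggers a load.
  const Code first = manager.GetHandle(name, handle);
  if (first != Code::kNotFound) return first;

  // Steps 3-4: create the servable on demand and hand it to the manager.
  int value = 0;
  if (!factory(name, &value)) return Code::kInternal;
  manager.loaded[name] = value;

  // Steps 5-6: the second lookup now succeeds and yields the handle.
  return manager.GetHandle(name, handle);
}
```

A second request for the same name returns from the first lookup and never reaches the factory, which is what makes the manager behave like a cache.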
PathPrefixLoaderFactory Path Construction
// Given path_prefix = "/models/" and servable_id = {"my_model", 0},
// the constructed path is "/models/my_model".
// The adapter then creates a loader from that path.
//
// `adapter` is assumed to be a previously constructed source adapter that
// turns a storage path into a loader; its construction is not shown here.
auto factory = std::make_unique<PathPrefixLoaderFactory>(
    "/models/", std::move(adapter));
auto loader_data = factory->CreateLoader({"my_model", 0});
// loader_data contains a loader for path "/models/my_model"
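The path construction and the version-0 restriction can be illustrated without any TensorFlow Serving dependencies. This is a sketch under assumptions: PathResult, JoinPathSketch, and ServablePath are hypothetical names, the join logic is a simplified stand-in for whatever path-join helper the real file uses, and the error text is illustrative only.

```cpp
#include <cstdint>
#include <string>

// Hypothetical result type: `value` is the path when ok, else an error message.
struct PathResult {
  bool ok;
  std::string value;
};

// Simplified join that avoids doubled or missing '/' between the two parts.
std::string JoinPathSketch(const std::string& prefix, const std::string& name) {
  if (prefix.empty()) return name;
  if (name.empty()) return prefix;
  const bool prefix_slash = prefix.back() == '/';
  const bool name_slash = name.front() == '/';
  if (prefix_slash && name_slash) return prefix + name.substr(1);
  if (prefix_slash || name_slash) return prefix + name;
  return prefix + "/" + name;
}

// Mirrors the described behavior: only version 0 yields a path.
PathResult ServablePath(const std::string& path_prefix,
                        const std::string& name, std::int64_t version) {
  if (version != 0) {
    return {false, "only version 0 is supported"};
  }
  return {true, JoinPathSketch(path_prefix, name)};
}
```

With this sketch, a prefix of "/models/" or "/models" combined with "my_model" gives "/models/my_model", and any nonzero version is rejected before a path is built.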