Implementation:Tensorflow Serving caching manager cc

Knowledge Sources	Tensorflow_Serving
Domains	Model Serving, Core Framework
Last Updated	2026-02-13 00:00 GMT

Overview

caching_manager.cc contains the implementation of the CachingManager's on-demand loading logic, per-servable concurrency control, and the PathPrefixLoaderFactory.

Description

This file implements the core behavior of CachingManager:

Create(): Translates CachingManager::Options into BasicManager::Options, creates a BasicManager, and wraps it along with the LoaderFactory into a new CachingManager instance.

GetUntypedServableHandle(): The request entry point. If the request specifies a version, it calls GetUntypedServableHandleForId() directly. Otherwise, it calls LoaderFactory::GetServableVersion() to choose a version according to the request's auto-version policy, then continues with that version.
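
The version-resolution step can be illustrated with a small standalone sketch. It does not use the TensorFlow Serving types: SketchRequest, SketchId, SketchVersionPolicy, and ResolveServableId are hypothetical names introduced only for this illustration, and the policy is reduced to returning a fixed number. It assumes a C++17 compiler for std::optional.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for a servable request: a name plus an optional version.
struct SketchRequest {
  std::string name;
  std::optional<int64_t> version;  // empty means "no version specified"
};

// Hypothetical stand-in for a fully resolved servable id.
struct SketchId {
  std::string name;
  int64_t version;
};

// Hypothetical stand-in for the factory's version query. A real policy would
// inspect available versions; this sketch just returns a fixed value.
struct SketchVersionPolicy {
  int64_t policy_version;
  int64_t GetServableVersion(const std::string& /*name*/) const {
    return policy_version;
  }
};

// An explicit version wins; otherwise the factory decides.
inline SketchId ResolveServableId(const SketchRequest& request,
                                  const SketchVersionPolicy& factory) {
  if (request.version.has_value()) {
    return SketchId{request.name, *request.version};
  }
  return SketchId{request.name, factory.GetServableVersion(request.name)};
}
```

A request that pins a version resolves to that version unchanged, while a request without one resolves to whatever the factory reports.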

GetUntypedServableHandleForId(): First attempts to retrieve the handle from the underlying BasicManager. If the servable is not found (NOT_FOUND error), it creates a loader via the factory and calls LoadServable() to manage and load it, then retrieves the handle again.
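
The lookup, load-on-miss, lookup-again pattern described above can be sketched without any TensorFlow Serving dependency. Everything here is a simplified assumption: SketchCache plays the role of the underlying manager, SketchLoaderFactory plays the role of the loader factory, servables are plain strings, and the sketch is single-threaded (concurrency is handled separately in LoadServable()).

```cpp
#include <cassert>
#include <map>
#include <string>

enum class SketchCode { kOk, kNotFound };

// Hypothetical stand-in for the wrapped manager: a map of loaded servables.
class SketchCache {
 public:
  // Fills *handle and returns kOk if the servable is loaded, else kNotFound.
  SketchCode Lookup(const std::string& id, std::string* handle) const {
    auto it = loaded_.find(id);
    if (it == loaded_.end()) return SketchCode::kNotFound;
    *handle = it->second;
    return SketchCode::kOk;
  }
  void Load(const std::string& id, const std::string& servable) {
    loaded_[id] = servable;
  }

 private:
  std::map<std::string, std::string> loaded_;
};

// Hypothetical factory that counts how often a loader is created.
struct SketchLoaderFactory {
  int calls = 0;
  std::string CreateLoader(const std::string& id) {
    ++calls;
    return "servable:" + id;
  }
};

// Try the cache first; on a miss, create and load, then look up again.
inline SketchCode GetOrLoad(SketchCache& cache, SketchLoaderFactory& factory,
                            const std::string& id, std::string* handle) {
  if (cache.Lookup(id, handle) == SketchCode::kOk) return SketchCode::kOk;
  cache.Load(id, factory.CreateLoader(id));
  return cache.Lookup(id, handle);
}
```

The second request for the same id is served from the cache, so the factory is invoked only once per servable.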

LoadServable(): The core concurrency-safe loading method. It uses a per-servable mutex map (load_mutex_map_) to ensure that only one thread loads a given servable at a time:

1. Acquires the global load_mutex_map_mu_ to find or create a per-servable mutex.
2. Acquires the per-servable mutex.
3. Checks whether the servable is already managed (via GetManagedServableStateSnapshot()).
4. If not managed, calls ManageServable() on the BasicManager, then calls BasicManager::LoadServable() with a completion callback and blocks until the load finishes.
5. Removes the mutex map entry if this call held the last reference to the per-servable mutex.
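
The steps above can be sketched with standard-library primitives. This is an illustration of the per-key mutex technique only, not the CachingManager code: SketchOnceLoader, LoadOnce, and MutexMapSize are hypothetical names, keys are plain strings, and a std::set stands in for the manager's record of which servables are already managed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <set>
#include <string>

class SketchOnceLoader {
 public:
  // Runs `load(key)` at most once per key, even under concurrent calls.
  template <typename LoadFn>
  void LoadOnce(const std::string& key, LoadFn load) {
    std::shared_ptr<std::mutex> key_mu;
    {
      // Step 1: under the global lock, find or create the per-key mutex.
      std::lock_guard<std::mutex> map_lock(map_mu_);
      std::shared_ptr<std::mutex>& slot = mutex_map_[key];
      if (!slot) slot = std::make_shared<std::mutex>();
      key_mu = slot;
    }
    {
      // Steps 2-4: serialize loads of this key, and load only if needed.
      std::lock_guard<std::mutex> key_lock(*key_mu);
      if (!IsLoaded(key)) {
        load(key);
        MarkLoaded(key);
      }
    }
    // Step 5: drop our reference, then erase the entry if the map holds the
    // only remaining one. New references are only created under map_mu_.
    key_mu.reset();
    std::lock_guard<std::mutex> map_lock(map_mu_);
    auto it = mutex_map_.find(key);
    if (it != mutex_map_.end() && it->second.use_count() == 1) {
      mutex_map_.erase(it);
    }
  }

  std::size_t MutexMapSize() {
    std::lock_guard<std::mutex> map_lock(map_mu_);
    return mutex_map_.size();
  }

 private:
  bool IsLoaded(const std::string& key) {
    std::lock_guard<std::mutex> l(loaded_mu_);
    return loaded_.count(key) > 0;
  }
  void MarkLoaded(const std::string& key) {
    std::lock_guard<std::mutex> l(loaded_mu_);
    loaded_.insert(key);
  }

  std::mutex map_mu_;  // guards mutex_map_
  std::map<std::string, std::shared_ptr<std::mutex>> mutex_map_;
  std::mutex loaded_mu_;  // guards loaded_
  std::set<std::string> loaded_;
};
```

Because the global lock is held only while the map is touched, loads of different keys can proceed in parallel, while callers for the same key queue on that key's mutex and skip the load once it has completed.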

PathPrefixLoaderFactory: Implements CreateLoader() by joining the path prefix with the servable name. Only supports version 0; other versions produce a FailedPrecondition error. GetServableVersion() always returns 0.
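
The version-0 restriction and the path join can be illustrated as follows. This is a standalone sketch under stated assumptions: SketchPathPrefixFactory, SketchLoaderResult, and SketchJoinPath are hypothetical names, the join helper handles only a trailing slash on the prefix, and a string result replaces the real loader and status types.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical result type: either a path (ok) or an error message (!ok).
struct SketchLoaderResult {
  bool ok;
  std::string path_or_error;
};

// Simplified join: ensures exactly one '/' between prefix and name.
// Assumption: the name does not itself start with '/'.
inline std::string SketchJoinPath(const std::string& prefix,
                                  const std::string& name) {
  if (prefix.empty()) return name;
  if (prefix.back() == '/') return prefix + name;
  return prefix + "/" + name;
}

class SketchPathPrefixFactory {
 public:
  explicit SketchPathPrefixFactory(std::string path_prefix)
      : path_prefix_(std::move(path_prefix)) {}

  // Only version 0 is accepted; any other version yields an error result.
  SketchLoaderResult CreateLoader(const std::string& name,
                                  int64_t version) const {
    if (version != 0) {
      return {false, "only version 0 is supported"};
    }
    return {true, SketchJoinPath(path_prefix_, name)};
  }

  // Single-version servables: the resolved version is always 0.
  int64_t GetServableVersion(const std::string& /*name*/) const { return 0; }

 private:
  std::string path_prefix_;
};
```

Returning the error inside the result, rather than throwing, mirrors the description above, where an unsupported version produces an error value instead of a loader.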

Usage

This file is the implementation companion to caching_manager.h. It is compiled as part of the TensorFlow Serving core library. No direct inclusion is needed; include the header instead.

Code Reference

Source Location

Repository: Tensorflow_Serving
File: tensorflow_serving/core/caching_manager.cc
Lines: 1-229

Signature

// Key methods implemented:
absl::Status CachingManager::Create(
    Options options,
    std::unique_ptr<LoaderFactory> loader_factory,
    std::unique_ptr<CachingManager>* caching_manager);

absl::Status CachingManager::GetUntypedServableHandle(
    const ServableRequest& request,
    std::unique_ptr<UntypedServableHandle>* const handle);

absl::Status CachingManager::LoadServable(
    ServableData<std::unique_ptr<Loader>> loader_data);

ServableData<std::unique_ptr<Loader>> PathPrefixLoaderFactory::CreateLoader(
    const ServableId& id);

Import

#include "tensorflow_serving/core/caching_manager.h"

I/O Contract

Inputs

Name	Type	Required	Description
options	CachingManager::Options	Yes	Configuration for the underlying BasicManager
loader_factory	std::unique_ptr<LoaderFactory>	Yes	Factory for on-demand loader creation
request	ServableRequest	Yes	Request with servable name and optional version
loader_data	ServableData<std::unique_ptr<Loader>>	Yes (internal)	Loader data to transfer to BasicManager for loading

Outputs

Name	Type	Description
Create()	absl::Status	OK with constructed CachingManager; error on BasicManager creation failure
GetUntypedServableHandle()	absl::Status	OK with handle; a NOT_FOUND from the BasicManager triggers an on-demand load, and any load error is propagated to the caller
LoadServable()	absl::Status	OK if loaded or already loaded; Internal error on manage/load failure
GetAvailableUntypedServableHandles()	std::map<ServableId, std::unique_ptr<UntypedServableHandle>>	Delegates to BasicManager
ListAvailableServableIds()	std::vector<ServableId>	Delegates to BasicManager

Usage Examples

On-Demand Loading Flow

// Internal flow when a request arrives for an unloaded servable:
//
// 1. GetUntypedServableHandle(request) is called
// 2. BasicManager returns NOT_FOUND
// 3. LoaderFactory::CreateLoader(servable_id) produces loader_data
// 4. LoadServable(loader_data) is called:
//    a. Per-servable mutex acquired
//    b. BasicManager::ManageServable(loader_data)
//    c. BasicManager::LoadServable(servable_id, callback)
//    d. Wait for load_done notification
// 5. BasicManager::GetUntypedServableHandle() succeeds
// 6. Handle returned to caller
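
Steps 4c and 4d turn a callback-based load into a blocking call. A minimal sketch of that pattern with standard-library primitives is shown below. SketchNotification, SketchAsyncLoad, SketchDoneCallback, and LoadAndWait are hypothetical names for this illustration, and a bool replaces the real status type.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>

// Minimal one-shot notification built on a mutex and condition variable.
class SketchNotification {
 public:
  void Notify() {
    std::lock_guard<std::mutex> l(mu_);
    notified_ = true;
    cv_.notify_all();
  }
  void WaitForNotification() {
    std::unique_lock<std::mutex> l(mu_);
    cv_.wait(l, [this] { return notified_; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool notified_ = false;
};

using SketchDoneCallback = std::function<void(bool ok)>;
// Hypothetical asynchronous load: must invoke `done` exactly once.
using SketchAsyncLoad =
    std::function<void(const std::string& id, SketchDoneCallback done)>;

// Starts the load, then blocks until its callback reports the outcome.
inline bool LoadAndWait(const std::string& id,
                        const SketchAsyncLoad& async_load) {
  SketchNotification load_done;
  bool load_ok = false;
  async_load(id, [&](bool ok) {
    load_ok = ok;
    load_done.Notify();
  });
  load_done.WaitForNotification();
  return load_ok;
}
```

The wait uses a predicate, so it returns immediately if the callback has already fired before the caller starts waiting, and it is unaffected by spurious wakeups.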

PathPrefixLoaderFactory Path Construction

// Given path_prefix = "/models/" and servable_id = {"my_model", 0},
// the constructed path is "/models/my_model".
// The adapter then creates a loader from that path.

// `adapter` is assumed to be a storage-path source adapter created elsewhere.
auto factory = std::make_unique<PathPrefixLoaderFactory>(
    "/models/", std::move(adapter));
// Only version 0 is supported; any other version yields a FailedPrecondition error.
auto loader_data = factory->CreateLoader({"my_model", 0});
// loader_data contains a loader for path "/models/my_model"

Related Pages

Principle:Tensorflow_Serving_Servable_Caching
