

Implementation:Tensorflow Serving caching manager cc

From Leeroopedia
Revision as of 13:54, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Tensorflow_Serving_caching_manager_cc.md)
Knowledge Sources
Domains: Model Serving, Core Framework
Last Updated: 2026-02-13 00:00 GMT

Overview

caching_manager.cc contains the implementation of the CachingManager's on-demand loading logic, per-servable concurrency control, and the PathPrefixLoaderFactory.

Description

This file implements the core behavior of CachingManager:

Create(): Translates CachingManager::Options into BasicManager::Options, creates a BasicManager, and wraps it along with the LoaderFactory into a new CachingManager instance.

GetUntypedServableHandle(): The request entry point. If the request names a specific version, it calls GetUntypedServableHandleForId() directly. If no version is specified, it first asks LoaderFactory::GetServableVersion() which version to use under the request's auto-version policy, then looks up that version.
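
The version-resolution step can be pictured with a minimal sketch. This is not the TensorFlow Serving code: SketchRequest, SketchId, SketchFactoryVersion and SketchResolveId are hypothetical stand-ins built on the standard library only, and the factory stand-in is assumed to always answer 0.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-ins for ServableRequest and ServableId; these are not
// the TensorFlow Serving types.
struct SketchRequest {
  std::string name;
  std::optional<std::int64_t> version;  // empty means "no version requested"
};

struct SketchId {
  std::string name;
  std::int64_t version;
};

// Assumed stand-in for the factory's version query. It always answers 0,
// matching the PathPrefixLoaderFactory behavior described on this page.
inline std::int64_t SketchFactoryVersion(const std::string& /*name*/) {
  return 0;
}

// An explicit version wins; otherwise the factory stand-in decides.
inline SketchId SketchResolveId(const SketchRequest& request) {
  if (request.version.has_value()) {
    return SketchId{request.name, *request.version};
  }
  return SketchId{request.name, SketchFactoryVersion(request.name)};
}
```

In this sketch a request pinned to version 7 resolves to version 7, while a request without a version resolves to whatever the factory stand-in reports.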

GetUntypedServableHandleForId(): First attempts to retrieve the handle from the underlying BasicManager. If that lookup fails with NOT_FOUND, it creates a loader via the factory, calls LoadServable() to manage and load the servable, and then retrieves the handle again. Any other lookup result is returned as is.
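
The lookup, load-on-miss, lookup-again pattern might look roughly like the following. It is a single-threaded illustration under stated assumptions: SketchOnDemandCache, SketchCode and LoadFn are hypothetical names, a string stands in for the servable handle, and no TensorFlow Serving API is used.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical status codes for this sketch only.
enum class SketchCode { kOk, kNotFound, kLoadFailed };

// Hypothetical cache that loads an entry the first time it is requested.
// Not thread-safe; concurrency is handled separately in the real class.
class SketchOnDemandCache {
 public:
  // Assumed loader signature: fills *value and returns true on success.
  using LoadFn = std::function<bool(const std::string& id, std::string* value)>;

  explicit SketchOnDemandCache(LoadFn load) : load_(std::move(load)) {}

  SketchCode Get(const std::string& id, std::string* value) {
    // First attempt: serve from what is already loaded.
    const SketchCode first = Lookup(id, value);
    if (first != SketchCode::kNotFound) return first;

    // Miss: build and load the entry, propagating a load failure.
    std::string loaded;
    if (!load_(id, &loaded)) return SketchCode::kLoadFailed;
    loaded_[id] = std::move(loaded);
    ++load_count_;

    // Second attempt: the entry is now present.
    return Lookup(id, value);
  }

  int load_count() const { return load_count_; }

 private:
  SketchCode Lookup(const std::string& id, std::string* value) const {
    const auto it = loaded_.find(id);
    if (it == loaded_.end()) return SketchCode::kNotFound;
    *value = it->second;
    return SketchCode::kOk;
  }

  LoadFn load_;
  std::map<std::string, std::string> loaded_;
  int load_count_ = 0;
};
```

Repeated calls to Get() for the same id invoke the loader only once; later calls are served from the first lookup.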

LoadServable(): The core concurrency-safe loading method. It uses a per-servable mutex map (load_mutex_map_) to ensure only one thread loads a given servable at a time:

  1. Acquires the global load_mutex_map_mu_ to find or create a per-servable mutex.
  2. Acquires the per-servable mutex.
  3. Checks if the servable is already managed (via GetManagedServableStateSnapshot()).
  4. If not managed, calls ManageServable() and then LoadServable() on the BasicManager, blocking until the load callback reports completion.
  5. Cleans up the mutex map entry if it is the last reference.
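
The five steps above can be sketched with standard-library primitives. This is an illustration, not the actual implementation: SketchPerKeyLoader and its members are hypothetical, std::mutex stands in for the TensorFlow mutex type, and a thread-safe set of loaded keys stands in for the BasicManager.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <mutex>
#include <set>
#include <string>

// Hypothetical illustration of a per-key mutex map.
class SketchPerKeyLoader {
 public:
  // Returns true if this call performed the load, false if the key was
  // already loaded by an earlier or concurrent call.
  bool LoadOnce(const std::string& key) {
    // Step 1: find or create the per-key mutex under the global map lock.
    std::shared_ptr<std::mutex> key_mu;
    {
      std::lock_guard<std::mutex> lock(map_mu_);
      auto it = mutex_map_.find(key);
      if (it == mutex_map_.end()) {
        it = mutex_map_.emplace(key, std::make_shared<std::mutex>()).first;
      }
      key_mu = it->second;
    }

    bool did_load = false;
    {
      // Step 2: serialize all loaders of this key.
      std::lock_guard<std::mutex> lock(*key_mu);
      // Steps 3 and 4: load only if nobody has loaded the key yet.
      if (!IsLoaded(key)) {
        MarkLoaded(key);  // stands in for manage + load
        did_load = true;
      }
    }

    // Step 5: drop our reference, then erase the map entry if the map holds
    // the only remaining one. New references are only taken under map_mu_.
    key_mu.reset();
    {
      std::lock_guard<std::mutex> lock(map_mu_);
      auto it = mutex_map_.find(key);
      if (it != mutex_map_.end() && it->second.use_count() == 1) {
        mutex_map_.erase(it);
      }
    }
    return did_load;
  }

  std::size_t mutex_map_size() {
    std::lock_guard<std::mutex> lock(map_mu_);
    return mutex_map_.size();
  }

  int load_count() {
    std::lock_guard<std::mutex> lock(state_mu_);
    return load_count_;
  }

 private:
  bool IsLoaded(const std::string& key) {
    std::lock_guard<std::mutex> lock(state_mu_);
    return loaded_.count(key) > 0;
  }

  void MarkLoaded(const std::string& key) {
    std::lock_guard<std::mutex> lock(state_mu_);
    loaded_.insert(key);
    ++load_count_;
  }

  std::mutex map_mu_;  // guards mutex_map_
  std::map<std::string, std::shared_ptr<std::mutex>> mutex_map_;
  std::mutex state_mu_;  // guards loaded_ and load_count_
  std::set<std::string> loaded_;
  int load_count_ = 0;
};
```

Holding the global lock only while touching the map means that loads of different keys do not block each other, while two callers asking for the same key queue on that key's mutex.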

PathPrefixLoaderFactory: Implements CreateLoader() by joining the path prefix with the servable name. Only supports version 0; other versions produce a FailedPrecondition error. GetServableVersion() always returns 0.
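
A rough sketch of the path construction and the version check, assuming a simple slash-aware join. SketchJoinPath, SketchPathForServable and SketchPathResult are hypothetical helpers written for this page; the real class delegates to TensorFlow's own path utilities and to an adapter, neither of which is reproduced here.

```cpp
#include <cstdint>
#include <string>

// Hypothetical result type: either a path or an error message.
struct SketchPathResult {
  bool ok;
  std::string path_or_error;
};

// Joins two path pieces with exactly one '/' between them.
inline std::string SketchJoinPath(const std::string& prefix,
                                  const std::string& name) {
  if (prefix.empty()) return name;
  if (name.empty()) return prefix;
  const bool prefix_slash = prefix.back() == '/';
  const bool name_slash = name.front() == '/';
  if (prefix_slash && name_slash) return prefix + name.substr(1);
  if (prefix_slash || name_slash) return prefix + name;
  return prefix + "/" + name;
}

// Mirrors the rule described above: only version 0 maps to a path.
inline SketchPathResult SketchPathForServable(const std::string& prefix,
                                              const std::string& name,
                                              std::int64_t version) {
  if (version != 0) {
    return {false, "only version 0 is supported"};
  }
  return {true, SketchJoinPath(prefix, name)};
}
```

With this sketch, the prefix "/models/" and the name "my_model" at version 0 yield "/models/my_model", and any other version yields an error result.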

Usage

This file is the implementation companion to caching_manager.h. It is compiled as part of the TensorFlow Serving core library. Client code does not include this file directly; it includes the header instead.

Code Reference

Source Location

  • Repository: Tensorflow_Serving
  • File: tensorflow_serving/core/caching_manager.cc
  • Lines: 1-229

Signature

// Key methods implemented:
absl::Status CachingManager::Create(
    Options options,
    std::unique_ptr<LoaderFactory> loader_factory,
    std::unique_ptr<CachingManager>* caching_manager);

absl::Status CachingManager::GetUntypedServableHandle(
    const ServableRequest& request,
    std::unique_ptr<UntypedServableHandle>* const handle);

absl::Status CachingManager::LoadServable(
    ServableData<std::unique_ptr<Loader>> loader_data);

ServableData<std::unique_ptr<Loader>> PathPrefixLoaderFactory::CreateLoader(
    const ServableId& id);

Import

#include "tensorflow_serving/core/caching_manager.h"

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| options | CachingManager::Options | Yes | Configuration for the underlying BasicManager |
| loader_factory | std::unique_ptr<LoaderFactory> | Yes | Factory for on-demand loader creation |
| request | ServableRequest | Yes | Request with servable name and optional version |
| loader_data | ServableData<std::unique_ptr<Loader>> | Yes (internal) | Loader data to transfer to BasicManager for loading |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| Create() | absl::Status | OK with constructed CachingManager; error on BasicManager creation failure |
| GetUntypedServableHandle() | absl::Status | OK with handle; a NOT_FOUND from the BasicManager triggers an on-demand load, and load errors are propagated |
| LoadServable() | absl::Status | OK if loaded or already loaded; Internal error on manage/load failure |
| GetAvailableUntypedServableHandles() | std::map | Delegates to BasicManager |
| ListAvailableServableIds() | std::vector<ServableId> | Delegates to BasicManager |

Usage Examples

On-Demand Loading Flow

// Internal flow when a request arrives for an unloaded servable:
//
// 1. GetUntypedServableHandle(request) is called
// 2. BasicManager returns NOT_FOUND
// 3. LoaderFactory::CreateLoader(servable_id) produces loader_data
// 4. LoadServable(loader_data) is called:
//    a. Per-servable mutex acquired
//    b. BasicManager::ManageServable(loader_data)
//    c. BasicManager::LoadServable(servable_id, callback)
//    d. Wait for load_done notification
// 5. BasicManager::GetUntypedServableHandle() succeeds
// 6. Handle returned to caller
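
Steps 4c and 4d, starting a callback-based load and then waiting for it, could be sketched as follows. SketchNotification, SketchDoneCallback, SketchAsyncLoad and SketchBlockingLoad are hypothetical names; a condition variable stands in for the notification object, and the asynchronous load is passed in as a function.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>

// Hypothetical one-shot notification built on a condition variable.
class SketchNotification {
 public:
  void Notify() {
    // Notify while holding the lock so the waiter cannot return and destroy
    // this object before notify_all() has been called.
    std::lock_guard<std::mutex> lock(mu_);
    notified_ = true;
    cv_.notify_all();
  }

  void WaitForNotification() {
    std::unique_lock<std::mutex> lock(mu_);
    cv_.wait(lock, [this] { return notified_; });
  }

 private:
  std::mutex mu_;
  std::condition_variable cv_;
  bool notified_ = false;
};

// Assumed shapes: a load reports success or failure through a callback,
// possibly from another thread.
using SketchDoneCallback = std::function<void(bool ok)>;
using SketchAsyncLoad = std::function<void(SketchDoneCallback done)>;

// Starts the load, then blocks the caller until the callback has fired.
inline bool SketchBlockingLoad(const SketchAsyncLoad& start_load) {
  SketchNotification load_done;
  bool load_ok = false;
  start_load([&load_done, &load_ok](bool ok) {
    load_ok = ok;
    load_done.Notify();
  });
  load_done.WaitForNotification();
  return load_ok;
}
```

The wrapper turns a callback-style load into a call that returns only after the load has finished, which is what lets the caller retrieve the handle immediately afterwards.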

PathPrefixLoaderFactory Path Construction

// Given path_prefix = "/models/" and servable_id = {"my_model", 0}
// The constructed path is: "/models/my_model"
// The adapter then creates a loader from that path.
//
// `adapter` is assumed to be an already-constructed std::unique_ptr to a
// storage-path source adapter; its setup is not shown here.
auto factory = std::make_unique<PathPrefixLoaderFactory>(
    "/models/", std::move(adapter));
auto loader_data = factory->CreateLoader({"my_model", 0});
// loader_data contains a loader for path "/models/my_model"
