Implementation:Tensorflow Serving load servables fast h

From Leeroopedia
Revision as of 13:54, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Tensorflow_Serving_load_servables_fast_h.md)
Knowledge Sources
Domains: Model Serving, Core Framework
Last Updated: 2026-02-13 00:00 GMT

Overview

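ConnectSourceWithFastInitialLoad and its multi-source variant ConnectSourcesWithFastInitialLoad connect one or more sources to an AspiredVersionsManager while temporarily enlarging the manager's load thread pool, so that the initial set of servables loads quickly.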

Description

The functions in this header accelerate the initial loading of servables during server startup. The strategy is to temporarily increase the number of load threads in the AspiredVersionsManager to a high value (default: 4 times the number of schedulable CPUs), connect the source(s) to the manager, wait until all initial servables are loaded, and then reset the thread count to the manager's original configured value.
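
The boost, connect, wait, and restore sequence described above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the TensorFlow Serving implementation: ToyManager and ConnectWithBoostedThreads are hypothetical names, the manager is reduced to a single thread-count field, and the connect and wait steps are passed in as callbacks.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for a manager with a resizable load thread pool.
// The real AspiredVersionsManager is far richer; only the count matters here.
struct ToyManager {
  std::uint32_t num_load_threads = 2;
};

// Temporarily raises the load thread count, runs the connect and wait steps,
// then restores the original count. Returns the result of wait_until_loaded.
bool ConnectWithBoostedThreads(
    ToyManager* manager, std::uint32_t boosted_threads,
    const std::function<void()>& connect_sources,
    const std::function<bool()>& wait_until_loaded) {
  assert(manager != nullptr);
  const std::uint32_t original_threads = manager->num_load_threads;
  manager->num_load_threads = boosted_threads;    // 1. boost
  connect_sources();                              // 2. connect the source(s)
  const bool loaded = wait_until_loaded();        // 3. block until loaded
  manager->num_load_threads = original_threads;   // 4. restore
  return loaded;
}
```

In this sketch the original count is restored even when the wait step reports failure, so a failed startup does not leave the pool enlarged. That ordering is a design choice of the sketch.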

Two functions are provided:

  • ConnectSourceWithFastInitialLoad() - For a single source.
  • ConnectSourcesWithFastInitialLoad() - For multiple sources.

Both take the manager, the source(s), a ServableStateMonitor used to detect when loading is complete, a list of ServableRequest objects describing the initial servables to wait for, and an optional thread count override.

The internal namespace exposes helper functions for testing: GetManagerNumLoadThreads() and SetManagerNumLoadThreadsNotifier().

Usage

Use these functions during server initialization when you want to load the initial set of servables as quickly as possible by leveraging maximum CPU parallelism, without keeping that elevated thread count for the lifetime of the server.

Code Reference

Source Location

  • Repository: Tensorflow_Serving
  • File: tensorflow_serving/core/load_servables_fast.h
  • Lines: 1-70

Signature

Status ConnectSourceWithFastInitialLoad(
    AspiredVersionsManager* manager,
    Source<std::unique_ptr<Loader>>* source,
    ServableStateMonitor* servable_state_monitor,
    const std::vector<ServableRequest>& initial_servables,
    uint32 num_threads = 4 * port::NumSchedulableCPUs());

Status ConnectSourcesWithFastInitialLoad(
    AspiredVersionsManager* manager,
    std::vector<Source<std::unique_ptr<Loader>>*> sources,
    ServableStateMonitor* servable_state_monitor,
    const std::vector<ServableRequest>& initial_servables,
    uint32 num_threads = 4 * port::NumSchedulableCPUs());

Import

#include "tensorflow_serving/core/load_servables_fast.h"

I/O Contract

Inputs

| Name | Type | Required | Description |
|------|------|----------|-------------|
| manager | AspiredVersionsManager* | Yes | The manager to connect sources to and temporarily boost threads on |
| source / sources | Source<std::unique_ptr<Loader>>* / vector | Yes | The source(s) to connect to the manager |
| servable_state_monitor | ServableStateMonitor* | Yes | Monitor used to detect when initial servables are loaded |
| initial_servables | const std::vector<ServableRequest>& | Yes | The set of servables to wait for before reverting thread count |
| num_threads | uint32 | No | Number of temporary load threads; defaults to 4 * NumSchedulableCPUs() |
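
To give a feel for the size of the default num_threads, the following sketch computes a comparable value with the standard library only. It assumes std::thread::hardware_concurrency() as a rough substitute for port::NumSchedulableCPUs(); the two may differ (for example under CPU affinity limits), and DefaultFastLoadThreads is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>

// Hypothetical helper: approximates 4 * port::NumSchedulableCPUs().
// hardware_concurrency() may return 0 when the count is unknown, so the
// value is clamped to at least one CPU before multiplying.
std::uint32_t DefaultFastLoadThreads() {
  const unsigned int cpus = std::max(1u, std::thread::hardware_concurrency());
  return 4u * cpus;
}
```

On an 8-CPU machine this yields 32 temporary load threads. Callers who find that too aggressive for memory-heavy models can pass an explicit num_threads instead.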

Outputs

| Name | Type | Description |
|------|------|-------------|
| return | Status | OK if all initial servables loaded successfully; error otherwise |

Usage Examples

Fast Initial Load at Server Startup

#include "tensorflow_serving/core/load_servables_fast.h"

using namespace tensorflow::serving;

// Assume manager and source are smart pointers that are already set up, and
// servable_state_monitor is a ServableStateMonitor observing the manager.
std::vector<ServableRequest> initial_servables = {
    ServableRequest::Latest("model_a"),
    ServableRequest::Latest("model_b"),
};

TF_CHECK_OK(ConnectSourceWithFastInitialLoad(
    manager.get(), source.get(), &servable_state_monitor,
    initial_servables));
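// TF_CHECK_OK aborts the process if the returned Status is not OK. Past this
// point all initial servables are loaded and the load thread count is back
// to its original value.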
