
Implementation:Tensorflow Serving Util

From Leeroopedia
Domains Model Serving, Utilities, Monitoring
Last Updated 2026-02-13 00:00 GMT

Overview

Provides shared utility functions for TensorFlow Serving including input serialization, one-shot tensor computation, model spec construction, resource estimation, latency recording, and metric tracking.

Description

The Util module is a foundational utility library used across TensorFlow Serving's inference pipeline. Key functions include:

Input Processing:

  • InputToSerializedExampleTensor: Converts an Input protobuf (ExampleList or ExampleListWithContext) into a string Tensor of serialized examples. Uses an optimized serialization path via SerializedInput to avoid costly lazy deserialization of protobuf fields. Handles context merging for ExampleListWithContext by concatenating serialized context and example bytes.
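
The context-merging step described above can be sketched in plain C++, with std::string standing in for serialized protobuf bytes. The function name and the exact byte layout are assumptions for illustration; the approach relies on a real protobuf property, namely that concatenating two serialized messages decodes as their merge.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the context-merge step: the serialized context
// bytes are prepended to each serialized example, so decoding the
// concatenation yields the example merged with the shared context.
std::vector<std::string> MergeContextIntoExamples(
    const std::string& serialized_context,
    const std::vector<std::string>& serialized_examples) {
  std::vector<std::string> merged;
  merged.reserve(serialized_examples.size());
  for (const std::string& example : serialized_examples) {
    merged.push_back(serialized_context + example);  // context bytes first
  }
  return merged;
}
```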

Session Execution:

  • PerformOneShotTensorComputation (two overloads): Convenience functions that serialize input, run a session, and return outputs. The first overload accepts a single input tensor name; the second accepts a set of input names (each fed the same input tensor). Both support custom thread pools and optional runtime latency output.
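
The serialize-feed-run-report shape of these overloads can be sketched without TensorFlow, using a std::function as a hypothetical stand-in for Session::Run and std::string payloads in place of tensors. All names here are illustrative, not the real API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for Session::Run: given a named input payload,
// produce the requested named outputs.
using RunFn = std::function<std::vector<std::string>(
    const std::string& input_name, const std::string& payload,
    const std::vector<std::string>& output_names)>;

// Sketch of the one-shot pattern: "serialize" the examples, feed them under
// a single input name, run, and report how many examples were fed.
bool OneShotComputation(const std::vector<std::string>& examples,
                        const std::string& input_name,
                        const std::vector<std::string>& output_names,
                        const RunFn& run, std::vector<std::string>* outputs,
                        int* num_examples) {
  std::string payload;  // stand-in for the serialized-example tensor
  for (const auto& e : examples) payload += e;
  *outputs = run(input_name, payload, output_names);
  *num_examples = static_cast<int>(examples.size());
  return true;
}
```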

Model Spec:

  • MakeModelSpec: Populates a ModelSpec protobuf with model name, optional signature name (defaulting to the default serving signature key), and optional version.
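
The defaulting behavior can be sketched with std::optional and a plain struct standing in for the ModelSpec protobuf. The struct and function names are hypothetical; "serving_default" is TensorFlow's default serving signature key.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical plain-struct stand-in for the ModelSpec protobuf.
struct ModelSpecLike {
  std::string name;
  std::string signature_name;
  std::optional<int64_t> version;
};

// Sketch of MakeModelSpec: set the model name, fall back to the default
// serving signature key when no signature is given, and carry the version
// through only if present.
void MakeModelSpecSketch(const std::string& model_name,
                         const std::optional<std::string>& signature_name,
                         const std::optional<int64_t>& version,
                         ModelSpecLike* spec) {
  spec->name = model_name;
  spec->signature_name = signature_name.value_or("serving_default");
  spec->version = version;
}
```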

Resource Estimation:

  • GetModelDiskSize: Performs a parallelized BFS traversal of the model directory using a ThreadPoolExecutor (256 threads) to efficiently calculate total file size.
  • EstimateResourceFromPathUsingDiskState: Estimates RAM requirements as disk_size * 1.2, i.e. a fixed 20% overhead multiplier with zero additional padding bytes.
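
The RAM estimate reduces to a one-line formula; a minimal sketch, with the constant names assumed for illustration:

```cpp
#include <cstdint>

// Constants mirroring the description above (names are assumptions):
// a 1.2x multiplier on on-disk size and zero padding bytes.
constexpr double kRamMultiplier = 1.2;
constexpr uint64_t kRamPadBytes = 0;

// Sketch of the RAM estimate derived from the model's total disk size.
uint64_t EstimateRamBytes(uint64_t disk_size_bytes) {
  return static_cast<uint64_t>(disk_size_bytes * kRamMultiplier) +
         kRamPadBytes;
}
```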

Monitoring:

  • RecordRequestExampleCount: Records example count per request in a histogram and counter metric.
  • RecordModelRequestCount: Records per-model request counts by status code.
  • RecordRuntimeLatency: Records TF/TFRT runtime latency by model name, API, and runtime.
  • RecordRequestLatency: Records request-level latency by model name, API, and entrypoint.
  • SetSignatureMethodNameCheckFeature / GetSignatureMethodNameCheckFeature: Feature flag for enabling/disabling method_name checks on SignatureDefs (disabled for native TF2 models).
  • IsTfrtErrorLoggingEnabled: Checks the ENABLE_TFRT_SERVING_ERROR_LOGGING environment variable.
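
An environment-variable feature flag in the style of IsTfrtErrorLoggingEnabled can be sketched with std::getenv. The exact set of accepted values is an assumption; this sketch treats the flag as enabled only when the variable is set to "true".

```cpp
#include <cstdlib>
#include <cstring>

// Sketch of an environment-variable feature flag: enabled only when the
// variable is present and equals "true" (accepted values are an assumption).
bool IsFlagEnabled(const char* env_var) {
  const char* value = std::getenv(env_var);
  return value != nullptr && std::strcmp(value, "true") == 0;
}
```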

Set Operations:

  • GetMapKeys (template): Extracts string keys from a map.
  • SetDifference: Computes set difference (A \ B).
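
Both helpers are small enough to sketch directly with the standard containers; the names below are stand-ins, not the library's signatures.

```cpp
#include <map>
#include <set>
#include <string>

// Sketch of GetMapKeys: collect the string keys of a map.
template <typename Map>
std::set<std::string> MapKeys(const Map& m) {
  std::set<std::string> keys;
  for (const auto& kv : m) keys.insert(kv.first);
  return keys;
}

// Sketch of SetDifference: elements of a that are not in b (a \ b).
std::set<std::string> Difference(const std::set<std::string>& a,
                                 const std::set<std::string>& b) {
  std::set<std::string> result;
  for (const auto& item : a) {
    if (b.count(item) == 0) result.insert(item);
  }
  return result;
}
```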

Usage

These utilities are consumed throughout the serving pipeline by classifiers, regressors, predict utilities, multi-inference runners, and the model warmup infrastructure. The monitoring functions are critical for observability in production deployments.

Code Reference

Source Location

  • Repository: Tensorflow_Serving
  • Files:
    • tensorflow_serving/servables/tensorflow/util.h (lines 1-144)
    • tensorflow_serving/servables/tensorflow/util.cc (lines 1-375)

Signature

Status InputToSerializedExampleTensor(const Input& input, Tensor* examples);

Status PerformOneShotTensorComputation(
    const RunOptions& run_options, const Input& input,
    const string& input_tensor_name,
    const std::vector<string>& output_tensor_names, Session* session,
    std::vector<Tensor>* outputs, int* num_input_examples,
    const thread::ThreadPoolOptions& thread_pool_options =
        thread::ThreadPoolOptions(),
    int64_t* runtime_latency = nullptr);

void MakeModelSpec(const string& model_name,
                   const absl::optional<string>& signature_name,
                   const absl::optional<int64_t>& version,
                   ModelSpec* model_spec);

void RecordRuntimeLatency(const string& model_name, const string& api,
                          const string& runtime, int64_t latency_usec);

void RecordRequestExampleCount(const string& model_name, size_t count);

Status GetModelDiskSize(const string& path, FileProbingEnv* env,
                        uint64_t* total_file_size);

Import

#include "tensorflow_serving/servables/tensorflow/util.h"

I/O Contract

Inputs

Name        Type      Required  Description
input       Input     Yes       Input protobuf containing ExampleList or ExampleListWithContext
session     Session*  Yes       Active TensorFlow session (for PerformOneShotTensorComputation)
model_name  string    Yes       Model name for metrics and ModelSpec
path        string    Yes       Model directory path (for disk size estimation)

Outputs

Name             Type             Description
examples         Tensor*          String tensor of serialized examples with shape {num_examples}
outputs          vector<Tensor>*  Session::Run output tensors
model_spec       ModelSpec*       Populated model specification protobuf
total_file_size  uint64_t*        Total disk size of the model directory in bytes

Usage Examples

Serializing Input and Running Computation

Input input;
// Populate input with examples (ExampleList or ExampleListWithContext).

// Serialize the input manually, e.g. to feed a custom Session::Run call.
Tensor serialized_examples;
TF_RETURN_IF_ERROR(InputToSerializedExampleTensor(input, &serialized_examples));

// Or let PerformOneShotTensorComputation serialize, feed, and run in one step.
std::vector<Tensor> outputs;
int num_examples = 0;
TF_RETURN_IF_ERROR(PerformOneShotTensorComputation(
    RunOptions(), input, "serving_default_inputs",
    {"StatefulPartitionedCall:0"}, session, &outputs, &num_examples));
RecordRequestExampleCount("my_model", num_examples);
