Implementation: TensorFlow Serving Util
| Knowledge Sources | |
|---|---|
| Domains | Model Serving, Utilities, Monitoring |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
Provides shared utility functions for TensorFlow Serving including input serialization, one-shot tensor computation, model spec construction, resource estimation, latency recording, and metric tracking.
Description
The Util module is a foundational utility library used across TensorFlow Serving's inference pipeline. Key functions include:
Input Processing:
- InputToSerializedExampleTensor: Converts an Input protobuf (ExampleList or ExampleListWithContext) into a string Tensor of serialized examples. Uses an optimized serialization path via SerializedInput to avoid costly lazy deserialization of protobuf fields. Handles context merging for ExampleListWithContext by concatenating serialized context and example bytes.
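The context-merging step can be sketched in standalone C++. This is a hedged illustration of the byte-concatenation behavior described above, not the actual SerializedInput code; the function name is illustrative.

```cpp
#include <string>
#include <vector>

// Sketch of the context merge described above: for an
// ExampleListWithContext, the serialized context bytes are prepended to
// each example's serialized bytes. Protobuf wire format permits this
// because concatenating two serialized messages of the same type merges
// their fields.
std::vector<std::string> MergeContextIntoExamples(
    const std::string& serialized_context,
    const std::vector<std::string>& serialized_examples) {
  std::vector<std::string> merged;
  merged.reserve(serialized_examples.size());
  for (const std::string& example : serialized_examples) {
    // Context bytes + example bytes: a valid serialized Example whose
    // features are the union of both.
    merged.push_back(serialized_context + example);
  }
  return merged;
}
```

This avoids deserializing either protobuf just to merge fields, which is the optimization the serialized path is built around.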
Session Execution:
- PerformOneShotTensorComputation (two overloads): Convenience functions that serialize input, run a session, and return outputs. The first overload accepts a single input tensor name; the second accepts a set of input names (each fed the same input tensor). Both support custom thread pools and optional runtime latency output.
Model Spec:
- MakeModelSpec: Populates a ModelSpec protobuf with model name, optional signature name (defaulting to the default serving signature key), and optional version.
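The defaulting logic can be illustrated with a standalone sketch. The struct below is a stand-in for the generated ModelSpec protobuf, and the default key `"serving_default"` is TensorFlow's default serving signature key; the sketch mirrors the documented behavior rather than reproducing the real implementation.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Stand-in for the ModelSpec protobuf (the real type is generated code).
struct ModelSpecLike {
  std::string name;
  std::string signature_name;
  std::optional<int64_t> version;
};

// Sketch of MakeModelSpec's behavior: an unset signature name falls back
// to the default serving signature key, and the version is copied only
// when present.
void MakeModelSpecSketch(const std::string& model_name,
                         const std::optional<std::string>& signature_name,
                         const std::optional<int64_t>& version,
                         ModelSpecLike* spec) {
  spec->name = model_name;
  // "serving_default" is TensorFlow's default serving signature key.
  spec->signature_name = signature_name.value_or("serving_default");
  spec->version = version;
}
```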
Resource Estimation:
- GetModelDiskSize: Performs a parallelized BFS traversal of the model directory using a ThreadPoolExecutor (256 threads) to efficiently calculate total file size.
- EstimateResourceFromPathUsingDiskState: Estimates RAM requirements from on-disk size: RAM = disk_size * 1.2 (a 20% overhead factor), plus a fixed padding that is currently zero bytes.
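The estimation formula above is simple enough to show directly. This is a hedged sketch; the constant names are illustrative, though the 1.2 multiplier and zero-byte padding match the description.

```cpp
#include <cstdint>

// RAM estimate per the formula above: disk size scaled by a 20% overhead
// factor, plus a fixed padding that is currently zero bytes.
constexpr double kRamMultiplier = 1.2;  // illustrative constant name
constexpr uint64_t kRamPadBytes = 0;    // illustrative constant name

uint64_t EstimateRamBytes(uint64_t disk_size_bytes) {
  return static_cast<uint64_t>(disk_size_bytes * kRamMultiplier) +
         kRamPadBytes;
}
```

A 1 GB SavedModel directory would thus be budgeted roughly 1.2 GB of RAM before loading.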
Monitoring:
- RecordRequestExampleCount: Records example count per request in a histogram and counter metric.
- RecordModelRequestCount: Records per-model request counts by status code.
- RecordRuntimeLatency: Records TF/TFRT runtime latency by model name, API, and runtime.
- RecordRequestLatency: Records request-level latency by model name, API, and entrypoint.
- SetSignatureMethodNameCheckFeature / GetSignatureMethodNameCheckFeature: Feature flag for enabling/disabling method_name checks on SignatureDefs (disabled for native TF2 models).
- IsTfrtErrorLoggingEnabled: Checks the ENABLE_TFRT_SERVING_ERROR_LOGGING environment variable.
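The environment-variable check can be sketched in portable C++. The accepted truthy values here are an assumption; the real IsTfrtErrorLoggingEnabled may parse the variable differently.

```cpp
#include <cstdlib>
#include <cstring>

// Sketch of an env-var-driven feature flag in the style of
// ENABLE_TFRT_SERVING_ERROR_LOGGING: treated as enabled when the variable
// is set to "true" or "1". The exact accepted values are an assumption.
bool IsEnvFlagEnabled(const char* var_name) {
  const char* value = std::getenv(var_name);
  if (value == nullptr) return false;
  return std::strcmp(value, "true") == 0 || std::strcmp(value, "1") == 0;
}
```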
Set Operations:
- GetMapKeys (template): Extracts string keys from a map.
- SetDifference: Computes set difference (A \ B).
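Standalone equivalents of these two helpers, as a hedged sketch: the real signatures may differ, but these mirror the documented behavior.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <set>
#include <string>

// Extract the keys of a map into a set (mirrors the documented GetMapKeys).
template <typename T>
std::set<std::string> MapKeys(const std::map<std::string, T>& m) {
  std::set<std::string> keys;
  for (const auto& [key, value] : m) keys.insert(key);
  return keys;
}

// Compute A \ B (mirrors the documented SetDifference): elements of a
// that are not in b.
std::set<std::string> Difference(const std::set<std::string>& a,
                                 const std::set<std::string>& b) {
  std::set<std::string> result;
  std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                      std::inserter(result, result.begin()));
  return result;
}
```

In the serving pipeline this pattern is typically used to validate requested signature or tensor names against those a model actually exposes.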
Usage
Use these utilities throughout the serving pipeline. They are consumed by classifiers, regressors, predict utilities, multi-inference runners, and the model warmup infrastructure. The monitoring functions are critical for observability in production deployments.
Code Reference
Source Location
- Repository: Tensorflow_Serving
- Files:
  - tensorflow_serving/servables/tensorflow/util.h (lines 1-144)
  - tensorflow_serving/servables/tensorflow/util.cc (lines 1-375)
Signature
Status InputToSerializedExampleTensor(const Input& input, Tensor* examples);
Status PerformOneShotTensorComputation(
const RunOptions& run_options, const Input& input,
const string& input_tensor_name,
const std::vector<string>& output_tensor_names, Session* session,
std::vector<Tensor>* outputs, int* num_input_examples,
const thread::ThreadPoolOptions& thread_pool_options =
thread::ThreadPoolOptions(),
int64_t* runtime_latency = nullptr);
void MakeModelSpec(const string& model_name,
const absl::optional<string>& signature_name,
const absl::optional<int64_t>& version,
ModelSpec* model_spec);
void RecordRuntimeLatency(const string& model_name, const string& api,
const string& runtime, int64_t latency_usec);
void RecordRequestExampleCount(const string& model_name, size_t count);
Status GetModelDiskSize(const string& path, FileProbingEnv* env,
uint64_t* total_file_size);
Import
#include "tensorflow_serving/servables/tensorflow/util.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| input | Input | Yes | Input protobuf containing ExampleList or ExampleListWithContext |
| session | Session* | Yes | Active TensorFlow session (for PerformOneShotTensorComputation) |
| model_name | string | Yes | Model name for metrics and ModelSpec |
| path | string | Yes | Model directory path (for disk size estimation) |
Outputs
| Name | Type | Description |
|---|---|---|
| examples | Tensor* | String tensor of serialized examples with shape {num_examples} |
| outputs | vector<Tensor>* | Session::Run output tensors |
| model_spec | ModelSpec* | Populated model specification protobuf |
| total_file_size | uint64_t* | Total disk size of the model directory in bytes |
Usage Examples
Serializing Input and Running Computation
Input input;
// Populate input with examples
Tensor serialized_examples;
TF_RETURN_IF_ERROR(InputToSerializedExampleTensor(input, &serialized_examples));
std::vector<Tensor> outputs;
int num_examples;
// Tensor names below are illustrative; substitute the input/output names
// from your model's signature.
TF_RETURN_IF_ERROR(PerformOneShotTensorComputation(
    RunOptions(), input, "serving_default_inputs",
    {"StatefulPartitionedCall:0"}, session, &outputs, &num_examples));
RecordRequestExampleCount("my_model", num_examples);