
Implementation:Tensorflow Serving Tfrt Predict Util

From Leeroopedia
Knowledge Sources
Domains Model Serving, Prediction
Last Updated 2026-02-13 00:00 GMT

Overview

Implements TFRT-based prediction with support for output filtering, tensor serialization options, and custom thread pool configuration.

Description

The TFRT Predict Util module provides the primary predict execution path for TFRT SavedModels. It exposes two versions of RunPredict: an internal version that accepts a PredictResponseTensorSerializationOption (kAsProtoField or kAsProtoContent) and a public version that defaults to kAsProtoField for backward compatibility.

The implementation takes two distinct code paths based on whether output filtering is needed. When no output filter is specified (or the filter matches all outputs), it uses the optimized PreProcessPredictionWithoutOutputFilter path that validates inputs against function metadata, handles default input values, checks data types, and invokes the model via SavedModel::Run. When an output filter is specified that selects a subset of outputs, it falls back to RunByTensorNames which uses the MetaGraphDef signature definitions for tensor name resolution, enabling lazy initialization of optimized subgraphs.
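As an illustration of the fallback path, a request whose output_filter names only some of the signature's outputs can be built as below. This is a minimal sketch; the model name and the "scores" output alias are hypothetical placeholders, not values from the source.

```cpp
#include "tensorflow_serving/apis/predict.pb.h"

// Sketch: a PredictRequest whose output_filter selects a subset of outputs.
// Because the filter does not cover every output of the signature, RunPredict
// takes the RunByTensorNames path instead of the optimized no-filter path.
tensorflow::serving::PredictRequest MakeFilteredRequest(
    const tensorflow::TensorProto& input_tensor_proto) {
  tensorflow::serving::PredictRequest request;
  request.mutable_model_spec()->set_name("my_model");  // hypothetical name
  (*request.mutable_inputs())["input"] = input_tensor_proto;
  // Request only the "scores" output; other signature outputs are omitted
  // from the response. "scores" is a hypothetical output alias.
  request.add_output_filter("scores");
  return request;
}
```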

The module also supports custom thread pool options via TfThreadPoolWorkQueue when an inter-op thread pool is provided, and records runtime latency metrics for monitoring. Post-processing serializes output tensors into the PredictResponse using the specified serialization option, applying the output filter when present.

Usage

Use this module for predict requests served through the TFRT runtime. It is the core predict function called by TfrtSavedModelServable::Predict and during model warmup. The internal variant is used when control over the tensor serialization format is needed; for example, kAsProtoContent packs each output tensor into the TensorProto tensor_content bytes field rather than the repeated typed value fields, which yields a more compact response.
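A call to the internal variant with kAsProtoContent might look like the following sketch. It assumes `saved_model` points at a loaded tfrt::SavedModel and that the serialization enum lives in the internal namespace as the header declares; treat the exact qualification as an assumption to check against the header.

```cpp
#include "tensorflow_serving/servables/tensorflow/tfrt_predict_util.h"

// Sketch (assumed namespace/enum spelling): run predict with outputs
// serialized as raw tensor_content bytes instead of typed proto fields.
tensorflow::Status PredictAsContent(
    tfrt::SavedModel* saved_model,
    const tensorflow::serving::PredictRequest& request,
    tensorflow::serving::PredictResponse* response) {
  tfrt::SavedModel::RunOptions run_options;
  return tensorflow::serving::internal::RunPredict(
      run_options, /*servable_version=*/absl::nullopt,
      tensorflow::serving::internal::PredictResponseTensorSerializationOption::
          kAsProtoContent,
      saved_model, request, response);
}
```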

Code Reference

Source Location

  • Repository: Tensorflow_Serving
  • Files:
    • tensorflow_serving/servables/tensorflow/tfrt_predict_util.h (lines 1-57)
    • tensorflow_serving/servables/tensorflow/tfrt_predict_util.cc (lines 1-283)

Signature

namespace internal {
Status RunPredict(
    const tfrt::SavedModel::RunOptions& run_options,
    const absl::optional<int64_t>& servable_version,
    const PredictResponseTensorSerializationOption tensor_serialization_option,
    tfrt::SavedModel* saved_model, const PredictRequest& request,
    PredictResponse* response,
    const thread::ThreadPoolOptions& thread_pool_options =
        thread::ThreadPoolOptions());
}  // namespace internal

Status RunPredict(const tfrt::SavedModel::RunOptions& run_options,
                  const absl::optional<int64_t>& servable_version,
                  tfrt::SavedModel* saved_model, const PredictRequest& request,
                  PredictResponse* response,
                  const thread::ThreadPoolOptions& thread_pool_options =
                      thread::ThreadPoolOptions());

Import

#include "tensorflow_serving/servables/tensorflow/tfrt_predict_util.h"

I/O Contract

Inputs

Name Type Required Description
run_options tfrt::SavedModel::RunOptions Yes Runtime options including deadline and validation settings
servable_version absl::optional<int64_t> No Version to set in the response ModelSpec
saved_model tfrt::SavedModel* Yes Loaded TFRT SavedModel
request PredictRequest Yes Predict request with model_spec, input tensors map, and optional output_filter
thread_pool_options thread::ThreadPoolOptions No Optional custom thread pools for inter/intra-op parallelism

Outputs

Name Type Description
response PredictResponse* Populated response with model_spec and output tensors map
return Status OK on success; FailedPrecondition if function not found; InvalidArgument for missing/mistyped inputs or invalid output_filter

Usage Examples

Basic Predict Call

tfrt::SavedModel::RunOptions run_options;
PredictRequest request;
request.mutable_model_spec()->set_name("my_model");
(*request.mutable_inputs())["input"].CopyFrom(input_tensor_proto);

PredictResponse response;
Status status = RunPredict(run_options, /*servable_version=*/1,
                           saved_model, request, &response);
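Predict with Custom Thread Pools

The example above can be extended to pass custom thread pools, which the util wraps in a TfThreadPoolWorkQueue when an inter-op pool is present. The sketch below assumes tensorflow::thread::ThreadPool and its AsEigenThreadPool() accessor; pool names and sizes are illustrative.

```cpp
#include "tensorflow/core/platform/threadpool.h"
#include "tensorflow/core/platform/threadpool_options.h"

// Sketch: dedicated inter-/intra-op pools for this predict call.
// Sizes and names are arbitrary illustrations, not recommended defaults.
tensorflow::thread::ThreadPool inter_pool(tensorflow::Env::Default(),
                                          "tfrt_inter_op", /*num_threads=*/4);
tensorflow::thread::ThreadPool intra_pool(tensorflow::Env::Default(),
                                          "tfrt_intra_op", /*num_threads=*/4);

tensorflow::thread::ThreadPoolOptions thread_pool_options;
thread_pool_options.inter_op_threadpool = inter_pool.AsEigenThreadPool();
thread_pool_options.intra_op_threadpool = intra_pool.AsEigenThreadPool();

Status status = RunPredict(run_options, /*servable_version=*/1, saved_model,
                           request, &response, thread_pool_options);
```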
