Implementation:Tensorflow Serving ProcessPredictRequest

Knowledge Sources	TensorFlow Serving
Domains	Inference, Networking
Last Updated	2026-02-13 17:00 GMT

Overview

Concrete tool for processing REST predict requests end-to-end, from JSON parsing through TensorFlow session execution to JSON response generation.

Description

HttpRestApiHandler::ProcessPredictRequest() orchestrates the full predict pipeline in three steps:

1. Calls FillPredictRequestFromJson() to parse the JSON body into a PredictRequest proto.
2. Calls predictor_->Predict() (TensorflowPredictor::Predict()), which:
   a. Resolves the model via ServerCore::GetServableHandle().
   b. Calls RunPredict() in predict_util.cc, which in turn:
      - validates inputs and extracts tensor names in PreProcessPrediction();
      - executes the TensorFlow graph with session->Run();
      - packages the output tensors in PostProcessPredictionResult().
3. Calls MakeJsonFromTensors() to convert the output TensorProtos to JSON.
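
The control flow described above can be sketched as three stages chained with an early return on the first error. This is a minimal illustration under stated assumptions, not the actual handler: Status, PredictRequest, PredictResponse, OkStatus, ErrorStatus, and ProcessPredictSketch are hypothetical stand-ins for absl::Status, the real protos, and the real method, and each stage is passed in as a callback so the sketch compiles without TensorFlow Serving.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for absl::Status.
struct Status {
  bool ok;
  std::string message;
};

Status OkStatus() { return Status{true, ""}; }
Status ErrorStatus(const std::string& message) { return Status{false, message}; }

// Hypothetical stand-ins for the PredictRequest / PredictResponse protos.
struct PredictRequest { std::string payload; };
struct PredictResponse { std::string payload; };

// One callback per stage: JSON -> request, request -> response, response -> JSON.
using ParseFn = std::function<Status(const std::string&, PredictRequest*)>;
using PredictFn = std::function<Status(const PredictRequest&, PredictResponse*)>;
using SerializeFn = std::function<Status(const PredictResponse&, std::string*)>;

// Runs the stages in order. After the first failing stage, the remaining
// stages are skipped and the failing status is returned to the caller.
Status ProcessPredictSketch(const std::string& request_body,
                            const ParseFn& parse, const PredictFn& predict,
                            const SerializeFn& serialize,
                            std::string* output) {
  PredictRequest request;
  Status status = parse(request_body, &request);
  if (!status.ok) return status;

  PredictResponse response;
  status = predict(request, &response);
  if (!status.ok) return status;

  return serialize(response, output);
}
```

Passing the stages as callbacks is only a convenience for the sketch; it makes the short-circuit behavior easy to exercise with fake stages.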

Usage

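Invoked automatically when a POST request arrives at /v1/models/{name}:predict on the REST endpoint. The path may also carry /versions/{version} or /labels/{label} before :predict to select a specific model version.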
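
To make the mapping from URL to handler arguments concrete, the following sketch extracts a model name and an optional version or label from a predict path. It assumes C++17, and the regular expression is a simplified, hypothetical pattern written for this example; it is not the pattern the server itself uses. PredictTarget and ParsePredictPath are likewise illustrative names.

```cpp
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

// Hypothetical holder for the values that end up as model_name,
// model_version, and model_version_label.
struct PredictTarget {
  std::string model_name;
  std::optional<std::int64_t> version;
  std::optional<std::string> label;
};

// Returns the parsed target, or std::nullopt if the path is not a predict
// path. The version is capped at 18 digits so std::stoll cannot overflow.
std::optional<PredictTarget> ParsePredictPath(const std::string& path) {
  static const std::regex kPattern(
      R"(^/v1/models/([^/:]+)(?:/versions/(\d{1,18})|/labels/([^/:]+))?:predict$)");
  std::smatch match;
  if (!std::regex_match(path, match, kPattern)) return std::nullopt;

  PredictTarget target;
  target.model_name = match[1].str();
  if (match[2].matched) target.version = std::stoll(match[2].str());
  if (match[3].matched) target.label = match[3].str();
  return target;
}
```

For example, "/v1/models/my_model/versions/2:predict" yields model_name "my_model" with version 2 and no label, while a ":classify" path yields std::nullopt.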

Code Reference

Source Location

Repository: tensorflow/serving
File: tensorflow_serving/model_servers/http_rest_api_handler.cc L152-178
Predictor: tensorflow_serving/servables/tensorflow/predict_impl.cc L33-59
Core execution: tensorflow_serving/servables/tensorflow/predict_util.cc L78-190

Signature

// REST handler entry point
Status HttpRestApiHandler::ProcessPredictRequest(
    const absl::string_view model_name,
    const absl::optional<int64_t>& model_version,
    const absl::optional<absl::string_view>& model_version_label,
    const absl::string_view request_body,
    string* output
);

// Core predict execution
absl::Status internal::RunPredict(
    const RunOptions& run_options,
    const MetaGraphDef& meta_graph_def,
    const absl::optional<int64_t>& servable_version,
    const internal::PredictResponseTensorSerializationOption option,
    Session* session,
    const PredictRequest& request,
    PredictResponse* response,
    const thread::ThreadPoolOptions& thread_pool_options
);

Import

#include "tensorflow_serving/model_servers/http_rest_api_handler.h"
#include "tensorflow_serving/servables/tensorflow/predict_util.h"

I/O Contract

Inputs

Name	Type	Required	Description
model_name	string_view	Yes	Name of the model to query
model_version	optional<int64_t>	No	Specific version (from URL)
model_version_label	optional<string_view>	No	Version label (from URL)
request_body	string_view	Yes	Raw JSON request body

Outputs

Name	Type	Description
output	string*	JSON response: {"predictions": [...]} for row-format ("instances") requests, or {"outputs": {...}} for columnar ("inputs") requests
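
The two response shapes carry the same data in different layouts: "outputs" keeps one batched value list per named tensor, while "predictions" holds one entry per instance. The sketch below illustrates that relationship by splitting named, batched outputs into per-instance rows. It assumes every output is a flat list of doubles with one scalar per instance; the real handler works on TensorProto values of arbitrary shape, and Columns, Row, and ColumnsToRows are hypothetical names used only here.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Columnar layout: output name -> one value per instance in the batch.
using Columns = std::map<std::string, std::vector<double>>;
// Row layout: one map of output name -> value per instance.
using Row = std::map<std::string, double>;

// Splits batched named outputs into one Row per instance. Returns an empty
// vector if there are no columns or the columns disagree on batch size.
std::vector<Row> ColumnsToRows(const Columns& columns) {
  if (columns.empty()) return {};

  const std::size_t batch = columns.begin()->second.size();
  for (const auto& entry : columns) {
    if (entry.second.size() != batch) return {};
  }

  std::vector<Row> rows(batch);
  for (const auto& entry : columns) {
    for (std::size_t i = 0; i < batch; ++i) {
      rows[i][entry.first] = entry.second[i];
    }
  }
  return rows;
}
```

With columns a = [1, 3] and b = [2, 4], the result is two rows: {a: 1, b: 2} and {a: 3, b: 4}.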

Usage Examples

Predict Request

# Predict with row format
curl -d '{"instances": [[1.0, 2.0, 3.0]]}' \
    http://localhost:8501/v1/models/my_model:predict

# Predict with specific version
curl -d '{"instances": [[1.0, 2.0, 3.0]]}' \
    http://localhost:8501/v1/models/my_model/versions/2:predict

# Predict with version label
curl -d '{"instances": [[1.0, 2.0, 3.0]]}' \
    http://localhost:8501/v1/models/my_model/labels/stable:predict

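# Classify request (shown for comparison; not routed through ProcessPredictRequest(),
# and the body uses "examples" rather than "instances")
curl -d '{"examples": [{"x": [1.0, 2.0]}]}' \
    http://localhost:8501/v1/models/my_model:classify

# Regress request (same body shape as classify)
curl -d '{"examples": [{"x": [1.0, 2.0]}]}' \
    http://localhost:8501/v1/models/my_model:regress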

Related Pages

Implements Principle

Principle:Tensorflow_Serving_REST_Inference_Execution
