Implementation:Tensorflow Serving ProcessPredictRequest
| Knowledge Sources | |
|---|---|
| Domains | Inference, Networking |
| Last Updated | 2026-02-13 17:00 GMT |
Overview
Concrete tool for processing REST predict requests end-to-end, from JSON parsing through TensorFlow session execution to JSON response generation.
Description
HttpRestApiHandler::ProcessPredictRequest() orchestrates the full predict pipeline:
- Calls FillPredictRequestFromJson() to parse the JSON body into a PredictRequest proto
- Calls predictor_->Predict() (TensorflowPredictor::Predict()), which:
  - Resolves the model via ServerCore::GetServableHandle()
  - Calls RunPredict() in predict_util.cc, which in turn:
    - Runs PreProcessPrediction() to validate inputs and extract tensor names
    - Runs session->Run() to execute the TensorFlow graph
    - Runs PostProcessPredictionResult() to package the output tensors
- Calls MakeJsonFromTensors() to convert output TensorProtos to JSON
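The control flow above can be sketched in isolation. The following is a minimal illustration, not the TensorFlow Serving source: `SketchStatus`, `SketchRequest`, `SketchResponse`, `PredictStages`, and `ProcessPredictSketch` are hypothetical stand-ins, and the sketch assumes each stage reports a status and that the handler stops at the first failing stage.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-ins for Status, PredictRequest and PredictResponse.
// These are NOT the TensorFlow Serving types.
struct SketchStatus {
  bool ok = true;
  std::string message;
};
struct SketchRequest { std::string payload; };
struct SketchResponse { std::string payload; };

// The three top-level stages, injected as callables so the orchestration
// can be exercised without a model server.
struct PredictStages {
  // Stands in for FillPredictRequestFromJson().
  std::function<SketchStatus(const std::string&, SketchRequest*)> parse_json;
  // Stands in for predictor_->Predict().
  std::function<SketchStatus(const SketchRequest&, SketchResponse*)> predict;
  // Stands in for MakeJsonFromTensors().
  std::function<SketchStatus(const SketchResponse&, std::string*)> make_json;
};

// Runs parse -> predict -> serialize, returning the first error encountered.
// `output` is only written by the final stage.
SketchStatus ProcessPredictSketch(const PredictStages& stages,
                                  const std::string& request_body,
                                  std::string* output) {
  SketchRequest request;
  SketchStatus status = stages.parse_json(request_body, &request);
  if (!status.ok) return status;

  SketchResponse response;
  status = stages.predict(request, &response);
  if (!status.ok) return status;

  return stages.make_json(response, output);
}
```

Injecting the stages as callables is a choice made for this sketch only; it lets a caller substitute a failing parser or predictor and confirm that later stages are skipped.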
Usage
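Triggered automatically when a POST request arrives at /v1/models/{name}:predict on the REST endpoint. The path may also carry a version (/v1/models/{name}/versions/{version}:predict) or a version label (/v1/models/{name}/labels/{label}:predict), which are passed to the handler as model_version and model_version_label.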
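As a rough illustration of how such a path maps onto the handler's model_name, model_version, and model_version_label arguments, here is a minimal sketch using std::regex. The pattern, the `PredictTarget` struct, and `ParsePredictPath` are assumptions made for this example; the actual routing code in the model server may match paths differently.

```cpp
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

// Hypothetical holder for the values extracted from the URL path.
struct PredictTarget {
  std::string model_name;
  std::optional<std::int64_t> model_version;       // from /versions/{n}
  std::optional<std::string> model_version_label;  // from /labels/{label}
};

// Returns the parsed target, or std::nullopt if the path is not a predict
// path of the assumed shape. The regex is illustrative only.
std::optional<PredictTarget> ParsePredictPath(const std::string& path) {
  // Version digits are capped at 18 so std::stoll cannot overflow.
  static const std::regex kPattern(
      R"(^/v1/models/([^/:]+)(?:/versions/(\d{1,18})|/labels/([^/:]+))?:predict$)");
  std::smatch match;
  if (!std::regex_match(path, match, kPattern)) return std::nullopt;

  PredictTarget target;
  target.model_name = match[1].str();
  if (match[2].matched) target.model_version = std::stoll(match[2].str());
  if (match[3].matched) target.model_version_label = match[3].str();
  return target;
}
```

With this sketch, a path without a version or label segment leaves both optionals empty, mirroring the "No" entries in the Required column of the input table.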
Code Reference
Source Location
- Repository: tensorflow/serving
- File: tensorflow_serving/model_servers/http_rest_api_handler.cc L152-178
- Predictor: tensorflow_serving/servables/tensorflow/predict_impl.cc L33-59
- Core execution: tensorflow_serving/servables/tensorflow/predict_util.cc L78-190
Signature
// REST handler entry point
Status HttpRestApiHandler::ProcessPredictRequest(
const absl::string_view model_name,
const absl::optional<int64_t>& model_version,
const absl::optional<absl::string_view>& model_version_label,
const absl::string_view request_body,
string* output
);
// Core predict execution
absl::Status internal::RunPredict(
const RunOptions& run_options,
const MetaGraphDef& meta_graph_def,
const absl::optional<int64_t>& servable_version,
const internal::PredictResponseTensorSerializationOption option,
Session* session,
const PredictRequest& request,
PredictResponse* response,
const thread::ThreadPoolOptions& thread_pool_options
);
Import
#include "tensorflow_serving/model_servers/http_rest_api_handler.h"
#include "tensorflow_serving/servables/tensorflow/predict_util.h"
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name | string_view | Yes | Name of the model to query |
| model_version | optional<int64_t> | No | Specific version (from URL) |
| model_version_label | optional<string_view> | No | Version label (from URL) |
| request_body | string_view | Yes | Raw JSON request body |
Outputs
| Name | Type | Description |
|---|---|---|
| output | string* | JSON response: {"predictions": [...]} for row-format ("instances") requests, or {"outputs": {...}} for columnar ("inputs") requests |
Usage Examples
Predict Request
# Predict with row format
curl -d '{"instances": [[1.0, 2.0, 3.0]]}' \
http://localhost:8501/v1/models/my_model:predict
# Predict with specific version
curl -d '{"instances": [[1.0, 2.0, 3.0]]}' \
http://localhost:8501/v1/models/my_model/versions/2:predict
# Predict with version label
curl -d '{"instances": [[1.0, 2.0, 3.0]]}' \
http://localhost:8501/v1/models/my_model/labels/stable:predict
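# Classify request (for comparison; not routed through ProcessPredictRequest, and the body uses "examples")
curl -d '{"examples": [{"x": [1.0, 2.0]}]}' \
http://localhost:8501/v1/models/my_model:classify
# Regress request (for comparison; not routed through ProcessPredictRequest, and the body uses "examples")
curl -d '{"examples": [{"x": [1.0, 2.0]}]}' \
http://localhost:8501/v1/models/my_model:regress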
Related Pages
Implements Principle