Implementation:Tensorflow Serving Remote Predict Ops
| Knowledge Sources | |
|---|---|
| Domains | Python API, Remote Inference |
| Last Updated | 2026-02-13 00:00 GMT |
Overview
A Python wrapper module that provides high-level functions for invoking the RemotePredict TensorFlow op, enabling remote model inference from within a TensorFlow graph.
Description
This module provides two primary functions, run() and run_returning_status(). Both wrap the generated tf_serving_remote_predict op (loaded from a shared library) with a user-friendly API, and they differ only in how RPC errors are surfaced:
- run() sets fail_op_on_rpc_error=True and returns only the output tensors (index [2] of the op's outputs), so an RPC error fails the op instead of being returned to the caller.
- run_returning_status() sets fail_op_on_rpc_error=False and returns the full tuple (status_code, status_error_message, output_tensors), allowing the caller to handle errors gracefully.
Both functions accept the same arguments: input_tensor_alias, input_tensors, output_tensor_alias, target_address, model_name, model_version (default -1, meaning the latest version), max_rpc_deadline_millis (default 3000), output_types, name, and signature_name (default "serving_default"). A ValueError is raised if model_name is not provided. The module loads the native op library via tf.load_op_library and re-exports the generated op module's symbols.
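The relationship between the two functions can be sketched without loading the native library. The following is a minimal illustration, not the module's actual source: `_fake_remote_predict_op` and `_call_op` are hypothetical stand-ins, and the fake op performs no RPC. It only returns placeholder values in the (status_code, status_error_message, output_tensors) order described above.

```python
def _fake_remote_predict_op(output_tensor_alias, model_name,
                            fail_op_on_rpc_error, **unused_kwargs):
    """Hypothetical stand-in for the generated op; performs no RPC."""
    status_code = 0
    status_error_message = ""
    output_tensors = ["<%s from %s>" % (alias, model_name)
                      for alias in output_tensor_alias]
    return status_code, status_error_message, output_tensors


def _call_op(fail_op_on_rpc_error, input_tensor_alias, input_tensors,
             output_tensor_alias, target_address, model_name,
             model_version=-1, max_rpc_deadline_millis=3000,
             output_types=None, name=None, signature_name='serving_default'):
    """Shared helper (an assumption made for this sketch)."""
    if not model_name:
        raise ValueError('model_name must be specified.')
    return _fake_remote_predict_op(
        input_tensor_alias=input_tensor_alias,
        input_tensors=input_tensors,
        output_tensor_alias=output_tensor_alias,
        target_address=target_address,
        model_name=model_name,
        model_version=model_version,
        fail_op_on_rpc_error=fail_op_on_rpc_error,
        max_rpc_deadline_millis=max_rpc_deadline_millis,
        output_types=output_types,
        name=name,
        signature_name=signature_name)


def run(*args, **kwargs):
    # Errors fail the op, so only the output tensors (index 2) are returned.
    return _call_op(True, *args, **kwargs)[2]


def run_returning_status(*args, **kwargs):
    # Errors are reported through the returned status fields instead.
    return _call_op(False, *args, **kwargs)
```

In this sketch the only difference between the two entry points is the value passed for fail_op_on_rpc_error and whether the status fields are discarded.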
Usage
Use these functions within a TensorFlow 1.x graph to perform remote inference against a TensorFlow Serving model server, enabling model composition, ensemble models, or cascaded inference pipelines.
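A cascaded pipeline of the kind mentioned above could be sketched as follows. This is an illustration under stated assumptions: `cascaded_predict`, the model names ("encoder_model", "ranker_model"), and the tensor aliases are all hypothetical. The `predict_fn` argument is expected to accept the same keyword arguments as run(), so in a real graph one would pass `remote_predict_ops.run` and a TensorFlow dtype such as `tf.float32`.

```python
def cascaded_predict(features, predict_fn, output_dtype,
                     target_address="localhost:8500"):
    """Feeds the outputs of one remote model into a second remote model.

    predict_fn must accept the keyword arguments of run(). Model names
    and aliases below are hypothetical and must match the served models.
    """
    # Stage 1: hypothetical encoder model.
    embeddings = predict_fn(
        input_tensor_alias=["input"],
        input_tensors=[features],
        output_tensor_alias=["embedding"],
        target_address=target_address,
        model_name="encoder_model",
        output_types=[output_dtype])

    # Stage 2: hypothetical ranker model that consumes the stage 1 outputs.
    scores = predict_fn(
        input_tensor_alias=["embedding"],
        input_tensors=embeddings,
        output_tensor_alias=["score"],
        target_address=target_address,
        model_name="ranker_model",
        output_types=[output_dtype])
    return scores
```

Injecting `predict_fn` keeps the sketch independent of the native op library, which also makes the composition logic easy to exercise with a fake predictor.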
Code Reference
Source Location
- Repository: Tensorflow_Serving
- File: tensorflow_serving/experimental/tensorflow/ops/remote_predict/python/ops/remote_predict_ops.py
- Lines: 1-124
Signature
def run(input_tensor_alias, input_tensors, output_tensor_alias,
        target_address, model_name, model_version=-1,
        max_rpc_deadline_millis=3000, output_types=None, name=None,
        signature_name='serving_default'):
    ...

def run_returning_status(input_tensor_alias, input_tensors, output_tensor_alias,
                         target_address, model_name, model_version=-1,
                         max_rpc_deadline_millis=3000, output_types=None,
                         name=None, signature_name='serving_default'):
    ...
Import
from tensorflow_serving.experimental.tensorflow.ops.remote_predict.python.ops import remote_predict_ops
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| input_tensor_alias | Tensor (string) | Yes | Aliases for the input tensors |
| input_tensors | list of Tensor | Yes | The input tensors to send |
| output_tensor_alias | Tensor (string) | Yes | Aliases for the desired outputs |
| target_address | str | Yes | Address of the remote serving instance |
| model_name | str | Yes | Name of the model to call |
| model_version | int | No | Model version; -1 (default) uses the latest available |
| max_rpc_deadline_millis | int | No | RPC timeout in milliseconds (default 3000) |
| output_types | list of DType | No | Expected output tensor types |
| signature_name | str | No | Signature name (default "serving_default") |
Outputs
| Name | Type | Description |
|---|---|---|
| run() | list of Tensor | Output tensors from the remote prediction (raises on error) |
| run_returning_status() | tuple | (status_code, status_error_message, output_tensors) |
Usage Examples
Simple Remote Predict
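import tensorflow.compat.v1 as tf
from tensorflow_serving.experimental.tensorflow.ops.remote_predict.python.ops import remote_predict_ops

# Hypothetical input; shape and dtype must match the served model's signature.
my_input_tensor = tf.constant([[1.0, 2.0, 3.0]], dtype=tf.float32)

# Returns only the output tensors; an RPC error fails the op.
output_tensors = remote_predict_ops.run(
    input_tensor_alias=["input"],
    input_tensors=[my_input_tensor],
    output_tensor_alias=["output"],
    target_address="localhost:8500",
    model_name="my_model",
    output_types=[tf.float32])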
Remote Predict with Status Handling
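# Assumes tf, remote_predict_ops, and my_input_tensor are defined as in the
# simple example. RPC errors are reported through status_code and status_msg
# rather than failing the op.
status_code, status_msg, outputs = remote_predict_ops.run_returning_status(
    input_tensor_alias=["input"],
    input_tensors=[my_input_tensor],
    output_tensor_alias=["output"],
    target_address="localhost:8500",
    model_name="my_model",
    output_types=[tf.float32])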
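Once the three results of run_returning_status() have been evaluated (for example with a session run), the caller still has to decide what to do with a failed call. The helper below is a hypothetical sketch of one such policy: `outputs_or_fallback` is not part of the module, and it assumes that a status code of 0 means the RPC succeeded, as in the gRPC status code convention.

```python
import logging


def outputs_or_fallback(status_code, status_msg, outputs, fallback):
    """Returns outputs if the RPC succeeded, otherwise the fallback value.

    All arguments are evaluated values (not graph tensors). Assumes that
    status code 0 means OK; this helper is hypothetical.
    """
    code = int(status_code)
    if code == 0:
        return outputs
    # String tensors typically evaluate to bytes; decode for logging.
    if isinstance(status_msg, bytes):
        status_msg = status_msg.decode("utf-8", errors="replace")
    logging.warning("Remote predict failed with code %d: %s", code, status_msg)
    return fallback
```

A caller might pass a default prediction as `fallback`, or `None` to signal that the downstream stage should be skipped.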