Overview
A concrete gRPC client tool, provided by the KServe sample code, for interacting with the TorchServe inference and management APIs: predictions, health checks, model registration, and model lifecycle management.
Description
This module implements a gRPC client for TorchServe with the following functions:
- get_inference_stub() -- Creates a gRPC channel to the inference API and returns an InferenceAPIsServiceStub for making prediction and health check calls.
- get_management_stub() -- Creates a gRPC channel to the management API and returns a ManagementAPIsServiceStub for model registration and lifecycle operations.
- infer() -- Reads a binary input file and sends a PredictionsRequest to the inference stub, printing the decoded prediction result.
- ping() -- Sends a health check via the inference stub and prints the status from the returned TorchServeHealthResponse.
- register() -- Registers a model with TorchServe by submitting a RegisterModelRequest with the MAR file URL, initial workers, and model name. Checks a provided set of available MAR files and falls back to the TorchServe S3 URL if the model's MAR file is not found locally.
- unregister() -- Unregisters a model from TorchServe by name.
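The MAR lookup described for register() can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the helper name `resolve_mar_location` is hypothetical, and `FALLBACK_URL_TEMPLATE` is a placeholder because this page does not state the real S3 URL.

```python
# Hypothetical placeholder; the real script falls back to a TorchServe S3 URL
# that is not given on this page.
FALLBACK_URL_TEMPLATE = "https://example.com/mar_files/{name}.mar"


def resolve_mar_location(model_name, mar_set_str, fallback_template=FALLBACK_URL_TEMPLATE):
    """Return the local MAR filename if it is listed, otherwise a fallback URL."""
    # An empty or missing string means no MAR files are known locally.
    mar_set = set()
    if mar_set_str:
        mar_set = {name.strip() for name in mar_set_str.split(",") if name.strip()}

    marfile = f"{model_name}.mar"
    if marfile in mar_set:
        return marfile
    # Not available locally: build the remote location from the template.
    return fallback_template.format(name=model_name)
```

Passing `None` for `mar_set_str` always takes the fallback branch, which matches calling register() without any known local MAR files.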
The __main__ block provides CLI argument parsing for host, port, hostname, model name, API name (infer or ping), and input path.
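As a rough sketch of what such argument parsing might look like, the snippet below uses the flag names that appear in the CLI usage on this page (`--host`, `--port`, `--hostname`, `--model`, `--api_name`, `--input_path`). The defaults, the `required` settings, and the `validate` helper are assumptions made for illustration, not taken from the script.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the CLI usage (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="TorchServe gRPC client (sketch)")
    parser.add_argument("--host", required=True, help="Ingress host name or IP address")
    parser.add_argument("--port", type=int, default=80, help="Ingress port number")
    parser.add_argument("--hostname", required=True, help="Service host name")
    parser.add_argument("--model", help="Model name (needed for infer)")
    parser.add_argument("--api_name", choices=["infer", "ping"], default="ping")
    parser.add_argument("--input_path", help="Path to the input file (needed for infer)")
    return parser


def validate(args):
    """Hypothetical check: infer needs a model and an input file, ping needs neither."""
    if args.api_name == "infer" and not (args.model and args.input_path):
        raise ValueError("infer requires --model and --input_path")
    return args
```

Restricting `--api_name` with `choices` makes argparse reject unsupported API names before any gRPC channel is opened.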
Usage
Use this script as a gRPC client to interact with TorchServe models deployed on KServe, supporting inference requests, health checks, model registration, and model unregistration.
Code Reference
Source Location
Signature
def get_inference_stub(host, port, hostname):
...
def get_management_stub(host, port, hostname):
...
def infer(stub, model_name, model_input):
...
def ping(stub):
...
def register(stub, model_name, mar_set_str):
...
def unregister(stub, model_name):
...
Import
from torchserve_grpc_client import get_inference_stub, get_management_stub, infer, ping, register, unregister
I/O Contract
Inputs
get_inference_stub()
| Name | Type | Required | Description |
| host | str | Yes | Ingress host name or IP address |
| port | int | Yes | Ingress port number |
| hostname | str | Yes | Service host name for gRPC SSL target name override |
get_management_stub()
| Name | Type | Required | Description |
| host | str | Yes | Ingress host name or IP address |
| port | int | Yes | Ingress port number |
| hostname | str | Yes | Service host name for gRPC SSL target name override |
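To illustrate how the three stub inputs fit together, here is a minimal sketch that builds the channel target and the channel options. The helper name `channel_args` is hypothetical. `grpc.ssl_target_name_override` is a standard gRPC channel argument, but whether the script opens an insecure or a secure channel is not stated on this page.

```python
def channel_args(host, port, hostname):
    """Return the (target, options) pair a gRPC channel could be created with."""
    # The ingress address is what the client actually connects to.
    target = f"{host}:{port}"
    # The service host name is supplied as the SSL target name override.
    options = (("grpc.ssl_target_name_override", hostname),)
    return target, options


# With grpcio installed, the values could be used like this (assumption):
#   import grpc
#   target, options = channel_args("localhost", 80, "torchserve.default.example.com")
#   channel = grpc.insecure_channel(target, options=options)
```

Keeping `host` and `hostname` separate lets the client connect through an ingress IP while still naming the service it is addressing.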
infer()
| Name | Type | Required | Description |
| stub | InferenceAPIsServiceStub | Yes | The gRPC inference stub |
| model_name | str | Yes | Name of the TorchServe model to query |
| model_input | str | Yes | File path to the binary input data |
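Since `model_input` is a file path rather than the data itself, the first step of infer() is to read that file as raw bytes. The sketch below shows one way to do this. The helper name `load_model_input` and the `"data"` key of the returned mapping are assumptions for illustration, not confirmed by this page.

```python
from pathlib import Path


def load_model_input(model_input):
    """Read the file at model_input and wrap its raw bytes for a request."""
    # Read in binary mode so images and JSON files are handled the same way.
    data = Path(model_input).read_bytes()
    # The "data" key is an assumed name for the request's input entry.
    return {"data": data}
```

Reading bytes without decoding means the client does not need to know the payload format; interpreting it is left to the model's handler on the server.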
ping()
| Name | Type | Required | Description |
| stub | InferenceAPIsServiceStub | Yes | The gRPC inference stub for the health check |
register()
| Name | Type | Required | Description |
| stub | ManagementAPIsServiceStub | Yes | The gRPC management stub |
| model_name | str | Yes | Name of the model to register |
| mar_set_str | str | No (may be None) | Comma-separated string of available MAR filenames |
unregister()
| Name | Type | Required | Description |
| stub | ManagementAPIsServiceStub | Yes | The gRPC management stub |
| model_name | str | Yes | Name of the model to unregister |
Outputs
get_inference_stub()
| Name | Type | Description |
| stub | InferenceAPIsServiceStub | gRPC stub for making inference calls |
get_management_stub()
| Name | Type | Description |
| stub | ManagementAPIsServiceStub | gRPC stub for making management calls |
infer()
| Name | Type | Description |
| (none) | None | Prints the prediction result to stdout; exits with code 1 on gRPC error |
ping()
| Name | Type | Description |
| (none) | None | Prints the health response to stdout; exits with code 1 on gRPC error |
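The "print on success, exit with code 1 on gRPC error" behavior listed for infer() and ping() can be sketched as a small wrapper. This is an illustration only: `call_or_exit` is a hypothetical helper, and `RpcErrorStandIn` stands in for `grpc.RpcError` so the snippet runs without grpcio installed.

```python
import sys


class RpcErrorStandIn(Exception):
    """Hypothetical stand-in for grpc.RpcError (assumption for this sketch)."""


def call_or_exit(rpc_call, error_type=RpcErrorStandIn):
    """Run rpc_call and print its result; exit with code 1 if error_type is raised."""
    try:
        result = rpc_call()
    except error_type as exc:
        # Report the failure on stderr and stop with a non-zero status.
        print(f"gRPC call failed: {exc}", file=sys.stderr)
        sys.exit(1)
    # On success the result is only printed; nothing is returned to the caller.
    print(result)
```

A caller that needs the prediction as a value, rather than printed output, would have to invoke the stub directly, because this pattern returns nothing.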
Usage Examples
Basic Usage
from torchserve_grpc_client import get_inference_stub, get_management_stub, infer, ping, register
# Create stubs
inference_stub = get_inference_stub("localhost", 80, "torchserve.default.example.com")
management_stub = get_management_stub("localhost", 80, "torchserve.default.example.com")
# Health check
ping(inference_stub)
# Register a model
register(management_stub, "mnist", None)
# Run inference
infer(inference_stub, "mnist", "test_data/mnist_input.json")
CLI Usage
# Run inference via command line:
# python torchserve_grpc_client.py \
# --host localhost \
# --port 80 \
# --hostname torchserve.default.example.com \
# --model mnist \
# --api_name infer \
# --input_path mnist.json
# Run health check:
# python torchserve_grpc_client.py \
# --host localhost \
# --port 80 \
# --hostname torchserve.default.example.com \
# --api_name ping
Related Pages