
Implementation:Kserve Kserve TorchServe gRPC Client

From Leeroopedia
Knowledge Sources
Domains gRPC, TorchServe
Last Updated 2026-02-13 00:00 GMT

Overview

A concrete tool, provided in the KServe sample code, for calling the TorchServe inference and management gRPC APIs. It covers predictions, health checks, model registration, and model unregistration.

Description

This module implements a gRPC client for TorchServe, built from the following functions:

  • get_inference_stub() -- Creates a gRPC channel to the inference API and returns an InferenceAPIsServiceStub for making prediction and health check calls.
  • get_management_stub() -- Creates a gRPC channel to the management API and returns a ManagementAPIsServiceStub for model registration and lifecycle operations.
  • infer() -- Reads a binary input file and sends a PredictionsRequest to the inference stub, printing the decoded prediction result.
  • ping() -- Sends a health check (TorchServeHealthResponse) via the inference stub and prints the status.
  • register() -- Registers a model with TorchServe by submitting a RegisterModelRequest with the MAR file URL, initial workers, and model name. Checks a provided set of available MAR files and falls back to the TorchServe S3 URL if not found locally.
  • unregister() -- Unregisters a model from TorchServe by name.

The __main__ block provides CLI argument parsing for host, port, hostname, model name, API name (infer or ping), and input path.
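The argument parsing described above can be sketched with `argparse`. The flag names are taken from the CLI examples on this page; the `build_parser` helper, the defaults, the `choices` restriction, and the help strings are assumptions made for illustration and may differ from the actual script.

```python
import argparse


def build_parser():
    """Build a parser with the flags shown in this page's CLI examples."""
    parser = argparse.ArgumentParser(description="TorchServe gRPC client (sketch)")
    parser.add_argument("--host", required=True, help="Ingress host name or IP address")
    parser.add_argument("--port", type=int, required=True, help="Ingress port number")
    parser.add_argument("--hostname", required=True, help="Service host name")
    # Assumption: --model and --input_path are only needed for the infer API.
    parser.add_argument("--model", help="Name of the model to query")
    parser.add_argument("--api_name", choices=["infer", "ping"], default="ping")
    parser.add_argument("--input_path", help="Path to the input file")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args([
    "--host", "localhost",
    "--port", "80",
    "--hostname", "torchserve.default.example.com",
    "--model", "mnist",
    "--api_name", "infer",
    "--input_path", "mnist.json",
])
print(args.api_name, args.model, args.port)
```

Restricting `--api_name` with `choices` makes an unsupported API name fail at parse time rather than later in the dispatch logic.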

Usage

Use this script as a gRPC client to interact with TorchServe models deployed on KServe, supporting inference requests, health checks, model registration, and model unregistration.

Code Reference

Source Location

Signature

def get_inference_stub(host, port, hostname):
    ...

def get_management_stub(host, port, hostname):
    ...

def infer(stub, model_name, model_input):
    ...

def ping(stub):
    ...

def register(stub, model_name, mar_set_str):
    ...

def unregister(stub, model_name):
    ...

Import

from torchserve_grpc_client import get_inference_stub, get_management_stub, infer, ping, register, unregister

I/O Contract

Inputs

get_inference_stub()

Name      Type  Required  Description
host      str   Yes       Ingress host name or IP address
port      int   Yes       Ingress port number
hostname  str   Yes       Service host name for gRPC SSL target name override

get_management_stub()

Name      Type  Required  Description
host      str   Yes       Ingress host name or IP address
port      int   Yes       Ingress port number
hostname  str   Yes       Service host name for gRPC SSL target name override
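How the three stub inputs might combine into a channel can be sketched without any gRPC dependency. `channel_args` is a hypothetical helper, not part of the script. It assumes the channel target is `host:port` and that `hostname` is passed through the standard `grpc.ssl_target_name_override` channel argument, as the table descriptions suggest.

```python
def channel_args(host, port, hostname):
    """Return the (target, options) pair a gRPC channel could be built from."""
    # The target is the ingress address the client actually connects to.
    target = f"{host}:{port}"
    # The override tells gRPC which service host name to present,
    # independent of the ingress address in the target.
    options = (("grpc.ssl_target_name_override", hostname),)
    return target, options


target, options = channel_args("localhost", 80, "torchserve.default.example.com")
print(target)
print(options)

# With grpcio and the generated TorchServe modules available, these values
# would typically be used along these lines (module name is an assumption):
#   channel = grpc.insecure_channel(target, options=options)
#   stub = inference_pb2_grpc.InferenceAPIsServiceStub(channel)
```

Keeping `host` and `hostname` separate is what lets the client reach a shared ingress address while still addressing one specific service behind it.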

infer()

Name         Type                      Required  Description
stub         InferenceAPIsServiceStub  Yes       The gRPC inference stub
model_name   str                       Yes       Name of the TorchServe model to query
model_input  str                       Yes       File path to the binary input data
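Because `model_input` is a file path rather than the data itself, the file is read as raw bytes before the request is built. The sketch below shows only that step. `build_prediction_input` is a hypothetical helper, and the `"data"` key is an assumption about how the bytes are placed in the request, not something this page confirms.

```python
import os
import tempfile


def build_prediction_input(model_input):
    """Read the file at model_input as raw bytes and wrap it in a dict."""
    with open(model_input, "rb") as f:
        data = f.read()
    # Assumption: the request carries the bytes under a "data" key.
    return {"data": data}


# Write a throwaway file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(suffix=".json", delete=False) as tmp:
    tmp.write(b'{"instances": [1, 2, 3]}')
    sample_path = tmp.name

payload = build_prediction_input(sample_path)
os.remove(sample_path)
print(type(payload["data"]).__name__)
```

Opening the file in `"rb"` mode means a JSON file and an image are handled the same way: the client forwards bytes and leaves decoding to the model handler.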

ping()

Name  Type                      Required  Description
stub  InferenceAPIsServiceStub  Yes       The gRPC inference stub for the health check

register()

Name         Type                       Required  Description
stub         ManagementAPIsServiceStub  Yes       The gRPC management stub
model_name   str                        Yes       Name of the model to register
mar_set_str  str                        No        Comma-separated string of available MAR filenames
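The local-or-remote choice that `register()` makes from `mar_set_str` can be sketched as a pure function. `resolve_mar_url` is a hypothetical helper, and `REMOTE_MAR_URL` is a placeholder standing in for the TorchServe S3 location, which this page does not spell out. The `<model_name>.mar` naming convention is also an assumption.

```python
# Placeholder only: stands in for the TorchServe S3 URL used as the fallback.
REMOTE_MAR_URL = "https://example.com/mar_files/{}.mar"


def resolve_mar_url(model_name, mar_set_str):
    """Pick the local MAR file if it is listed, else a remote URL."""
    # An empty string or None means no local MAR files were provided.
    mar_set = set(mar_set_str.split(",")) if mar_set_str else set()
    marfile = f"{model_name}.mar"  # assumed naming convention
    if marfile in mar_set:
        return marfile
    return REMOTE_MAR_URL.format(model_name)


print(resolve_mar_url("mnist", "mnist.mar,resnet.mar"))
print(resolve_mar_url("mnist", None))
```

Passing `None` for `mar_set_str`, as the Basic Usage example does, therefore always selects the remote fallback.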

unregister()

Name        Type                       Required  Description
stub        ManagementAPIsServiceStub  Yes       The gRPC management stub
model_name  str                        Yes       Name of the model to unregister

Outputs

get_inference_stub()

Name  Type                      Description
stub  InferenceAPIsServiceStub  gRPC stub for making inference calls

get_management_stub()

Name  Type                       Description
stub  ManagementAPIsServiceStub  gRPC stub for making management calls

infer()

Name    Type  Description
(none)  None  Prints the prediction result to stdout; exits with code 1 on gRPC error

ping()

Name    Type  Description
(none)  None  Prints the health response to stdout; exits with code 1 on gRPC error
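Since `infer()` and `ping()` exit the process instead of raising, a caller that imports them into a larger program may want to intercept the exit. The sketch below shows one way to do that in plain Python. `failing_call` is a hypothetical stand-in for a call that hits a gRPC error, and `run_guarded` is an illustrative helper, not part of the script.

```python
import sys


def failing_call():
    # Hypothetical stand-in for infer() or ping() hitting a gRPC error.
    sys.exit(1)


def run_guarded(func, *args):
    """Call func and return its exit status instead of ending the process."""
    try:
        func(*args)
        return 0
    except SystemExit as exc:
        # sys.exit(n) raises SystemExit with the status stored in exc.code.
        return exc.code


status = run_guarded(failing_call)
print("exit status:", status)
```

This leaves the client functions unchanged while letting the caller decide whether a failed request should stop the whole program.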

Usage Examples

Basic Usage

from torchserve_grpc_client import get_inference_stub, get_management_stub, infer, ping, register

# Create stubs
inference_stub = get_inference_stub("localhost", 80, "torchserve.default.example.com")
management_stub = get_management_stub("localhost", 80, "torchserve.default.example.com")

# Health check
ping(inference_stub)

# Register a model
register(management_stub, "mnist", None)

# Run inference
infer(inference_stub, "mnist", "test_data/mnist_input.json")

CLI Usage

# Run inference via command line:
python torchserve_grpc_client.py \
    --host localhost \
    --port 80 \
    --hostname torchserve.default.example.com \
    --model mnist \
    --api_name infer \
    --input_path mnist.json

# Run health check:
python torchserve_grpc_client.py \
    --host localhost \
    --port 80 \
    --hostname torchserve.default.example.com \
    --api_name ping
