

Implementation:Triton inference server Server HTTP Health Endpoint

From Leeroopedia
Revision as of 13:58, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/Triton_inference_server_Server_HTTP_Health_Endpoint.md)
Knowledge Sources
Domains MLOps, Observability, HTTP_API
Last Updated 2026-02-13 17:00 GMT

Overview

Concrete HTTP endpoint handler for server liveness and readiness checks in Triton Inference Server.

Description

The HandleServerHealth method in HTTPAPIServer processes GET requests to the /v2/health/live and /v2/health/ready endpoints. It delegates to the TRITONSERVER C API functions TRITONSERVER_ServerIsLive and TRITONSERVER_ServerIsReady to determine the server's state, and returns HTTP 200 when the server reports live or ready, respectively, and HTTP 400 otherwise.

Usage

Call these endpoints with curl or any HTTP client after starting Triton to confirm that the server is live and ready to accept requests. In production deployments, use them as Kubernetes liveness and readiness probes.
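
A startup check like the one described above can be scripted instead of run by hand. The following is a minimal Python sketch, not part of Triton itself: the helper names check_health and wait_until_ready, the default timeouts, and the base URL are illustrative assumptions. It relies only on the contract stated on this page (HTTP 200 means live/ready, anything else does not).

```python
import time
import urllib.error
import urllib.request


def check_health(base_url: str, kind: str = "ready", timeout_s: float = 2.0) -> bool:
    """Return True only if GET <base_url>/v2/health/<kind> answers HTTP 200.

    check_health is a hypothetical helper name, not a Triton API.
    """
    if kind not in ("live", "ready"):
        raise ValueError("kind must be 'live' or 'ready'")
    url = f"{base_url.rstrip('/')}/v2/health/{kind}"
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status such as 400.
        return False
    except OSError:
        # Connection refused, timeout, or similar: server not reachable yet.
        return False


def wait_until_ready(base_url: str, deadline_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll the readiness endpoint until it returns 200 or the deadline passes."""
    deadline = time.monotonic() + deadline_s
    while True:
        if check_health(base_url, "ready"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call (assumes Triton's HTTP port is 8000, as in the curl examples):
#   wait_until_ready("http://localhost:8000", deadline_s=120)
```

Connection failures and HTTP 400 are both treated as "not ready yet" here, which suits a startup wait loop; a monitoring script may prefer to distinguish the two cases.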

Code Reference

Source Location

Signature

// src/http_server.cc:L1355
void
HTTPAPIServer::HandleServerHealth(evhtp_request_t* req, const std::string& kind)
{
    // kind == "live"  → TRITONSERVER_ServerIsLive(server_.get(), &ready)
    // kind == "ready" → TRITONSERVER_ServerIsReady(server_.get(), &ready)
    // Returns: HTTP 200 (ready=true) or HTTP 400 (ready=false)
}

# HTTP API
GET /v2/health/live     # Server liveness
GET /v2/health/ready    # Server readiness
GET /v2/models/<model_name>/ready  # Model-specific readiness

Import

# Client-side usage (no import needed, standard HTTP):
curl -v localhost:8000/v2/health/ready

I/O Contract

Inputs

Name        Type               Required  Description
kind        string (URL path)  Yes       "live" or "ready"
model_name  string (URL path)  No        Model name; used only by the model-specific readiness endpoint /v2/models/<model_name>/ready

Outputs

Name           Type   Description
HTTP status    int    200 (healthy/ready) or 400 (not ready)
response body  empty  No body content for health endpoints
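
To make the output contract concrete, the sketch below reads the raw status code and body length of a health response and maps the status to a label. It is a hedged illustration: probe and interpret are hypothetical helper names, and the "unreachable" and "unexpected" labels are assumptions added for completeness rather than states defined by Triton.

```python
import http.client
from typing import Optional, Tuple


def probe(host: str, port: int, path: str, timeout_s: float = 2.0) -> Tuple[Optional[int], int]:
    """Return (status_code, body_length); status_code is None if unreachable."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout_s)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()  # expected to be empty for the health endpoints
        return resp.status, len(body)
    except (OSError, http.client.HTTPException):
        return None, 0
    finally:
        conn.close()


def interpret(status: Optional[int]) -> str:
    """Map a status code to a label, following the table above."""
    if status is None:
        return "unreachable"   # assumption: no HTTP answer at all
    if status == 200:
        return "healthy"
    if status == 400:
        return "not ready"
    return "unexpected"        # assumption: any other code is outside the contract


# Example call (assumes the default HTTP port 8000 used elsewhere on this page):
#   status, body_len = probe("localhost", 8000, "/v2/health/ready")
#   print(interpret(status), body_len)
```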

Usage Examples

Basic Health Check

# Check server readiness
curl -v localhost:8000/v2/health/ready
# HTTP/1.1 200 OK  →  Server is ready

# Check server liveness
curl -v localhost:8000/v2/health/live
# HTTP/1.1 200 OK  →  Server process is running

Model-Specific Readiness

# Check if a specific model is loaded and ready
curl -v localhost:8000/v2/models/densenet_onnx/ready
# HTTP/1.1 200 OK  →  Model is ready for inference
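
When several models must be loaded before traffic is accepted, the same request can be issued per model. This is a minimal Python sketch under stated assumptions: model_ready_url and models_ready are hypothetical helper names, the model names are placeholders, and any response other than HTTP 200 (including connection failures) is treated as "not ready".

```python
import urllib.parse
import urllib.request
from typing import Dict, Iterable


def model_ready_url(base_url: str, model_name: str) -> str:
    """Build /v2/models/<model_name>/ready, percent-encoding the model name."""
    encoded = urllib.parse.quote(model_name, safe="")
    return f"{base_url.rstrip('/')}/v2/models/{encoded}/ready"


def models_ready(base_url: str, model_names: Iterable[str], timeout_s: float = 2.0) -> Dict[str, bool]:
    """Return {model_name: True/False} for each requested model."""
    result: Dict[str, bool] = {}
    for name in model_names:
        try:
            with urllib.request.urlopen(model_ready_url(base_url, name), timeout=timeout_s) as resp:
                result[name] = resp.status == 200
        except OSError:
            # Covers HTTP error statuses (urllib raises HTTPError, an OSError
            # subclass) as well as connection failures and timeouts.
            result[name] = False
    return result


# Example call (model name taken from the curl example above):
#   models_ready("http://localhost:8000", ["densenet_onnx"])
```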

Kubernetes Probe Configuration

# In Kubernetes Deployment spec
livenessProbe:
  httpGet:
    path: /v2/health/live
    port: 8000
  initialDelaySeconds: 10
  periodSeconds: 5
readinessProbe:
  httpGet:
    path: /v2/health/ready
    port: 8000
  initialDelaySeconds: 30
  periodSeconds: 10

Related Pages

Implements Principle

Requires Environment
