Implementation:Triton inference server Server Container Health Check

Field | Value
Page Type | Implementation
Title | Container_Health_Check
Namespace | Triton_inference_server_Server
Workflow | Custom_Container_Build
Domains | Quality_Assurance, Container_Build
Last Updated | 2026-02-13 17:00 GMT

Overview

A concrete procedure for verifying a custom-built Triton container using docker run and curl.

Description

To verify a custom-built Triton container, launch it with Docker, expose the standard service ports, and send HTTP requests to confirm that the server is healthy and responsive. A single successful run exercises the server binary, its shared-library linking, backend loading, and endpoint binding.

The verification process follows these steps:

  1. Launch the container using docker run with port mappings for HTTP (8000), gRPC (8001), and metrics (8002)
  2. Wait for server startup by monitoring container logs or polling the health endpoint
  3. Query the health endpoint at /v2/health/ready to confirm the server is ready to accept inference requests
  4. Inspect server logs to verify all requested backends loaded without errors
  5. Optionally query the metrics endpoint at port 8002 to confirm Prometheus metrics are available
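
Step 4 (log inspection) can be partly automated. The following is a minimal sketch, assuming the server uses its default log format, in which error-level lines begin with "E" followed by a four-digit month/day stamp. The function name count_error_lines is hypothetical and not part of Triton or Docker.

```shell
# Count error-level lines in Triton log text read from stdin.
# Assumption: default log format, where error lines start with "E" plus a
# four-digit month/day stamp (for example "E0213 17:00:01.000000 ...").
count_error_lines() {
  # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
  # keeps the function usable under "set -e".
  grep -c '^E[0-9][0-9][0-9][0-9] ' || true
}

# Hypothetical usage against the container name used on this page:
#   errors=$(docker logs triton-verify 2>&1 | count_error_lines)
#   [ "$errors" -eq 0 ] || echo "error lines found in server log: $errors"
```

A count of zero does not prove that every backend loaded; it only shows that no error-level lines were logged, so reading the startup log by eye is still worthwhile.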

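The health endpoint follows the KServe V2 inference protocol specification and returns HTTP 200 when the server is ready. Treat any other status code as not ready. While the server is still initializing, the HTTP endpoint may not be listening yet, in which case the connection is refused rather than answered.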
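
The status handling described here can be sketched as a small helper. This is an illustrative sketch, not part of Triton: the function name classify_ready and its three verdict strings are hypothetical. It relies only on curl reporting "000" through %{http_code} when no HTTP response was received.

```shell
# Map the HTTP status code returned by /v2/health/ready to a verdict.
# "000" is what curl's %{http_code} reports when no response was received
# (for example, connection refused while the server is still starting).
classify_ready() {
  case "$1" in
    200) echo "ready" ;;
    000) echo "unreachable" ;;
    *)   echo "not-ready" ;;
  esac
}

# Hypothetical usage:
#   code=$(curl -s -o /dev/null -w '%{http_code}' localhost:8000/v2/health/ready)
#   classify_ready "$code"
```

Separating "unreachable" from "not-ready" helps when debugging: the first usually points at port mappings or a container that has exited, the second at a server that is running but has not finished loading.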

Usage

Basic Health Check

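# Launch the container (--rm removes it automatically when it stops).
# /models must exist inside the container: bake it into the image or
# add a bind mount such as -v /path/to/models:/models.
docker run --rm -d \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  --name triton-verify \
  tritonserver \
  tritonserver --model-repository=/models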

# Wait for the server to start (poll health endpoint)
for i in $(seq 1 30); do
  if curl -s -o /dev/null -w "%{http_code}" localhost:8000/v2/health/ready | grep -q "200"; then
    echo "Server is ready"
    break
  fi
  echo "Waiting for server... ($i/30)"
  sleep 2
done

# Verify health endpoint
curl -v localhost:8000/v2/health/ready

# Check server metadata
curl -s localhost:8000/v2 | python3 -m json.tool

# Stop the verification container
docker stop triton-verify

GPU-Enabled Health Check

# Launch with GPU access (requires the NVIDIA Container Toolkit on the host)
docker run --rm -d \
  --gpus all \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  --name triton-verify \
  tritonserver \
  tritonserver --model-repository=/models

# Verify health once startup has finished (poll as in the basic check above)
curl -v localhost:8000/v2/health/ready

Code Reference

Source Location

File | Lines | Description
src/main.cc | L439-511 | main() -- Server entry point that initializes the server and starts all endpoints
src/main.cc | L224-300 | StartEndpoints() -- Initializes and starts HTTP, gRPC, and metrics endpoints
src/http_server.cc | L1355-1371 | HandleServerHealth() -- HTTP handler for the /v2/health/ready and /v2/health/live endpoints

Signature

# Launch container
docker run --rm \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  <image> \
  tritonserver --model-repository=<path>

# Verify health
curl -v localhost:8000/v2/health/ready

Import

No code imports required. This procedure uses:

  • docker CLI for container management
  • curl for HTTP health checks
  • The tritonserver binary inside the container
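
Before running the procedure, it can help to confirm that the host-side tools are installed. The sketch below is illustrative; the function name missing_tools is hypothetical. It checks only the host PATH and says nothing about the tritonserver binary inside the image.

```shell
# Print the name of each required tool that is not found on PATH,
# one per line. Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Hypothetical usage:
#   missing=$(missing_tools docker curl)
#   [ -z "$missing" ] || { echo "missing tools: $missing"; exit 1; }
```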

I/O Contract

Inputs

Input | Type | Description
Docker image | Container image | The custom-built Triton container to verify (e.g., tritonserver or a custom name)
Model repository | Directory/Path | Path to a model repository. It may be empty for a basic health check: with --model-control-mode=explicit no models are loaded at startup, so the server reports ready once it has initialized
Port mappings | Network config | Standard Triton ports: 8000 (HTTP), 8001 (gRPC), 8002 (metrics)
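
The three inputs can be combined into one launch helper. This is a sketch under stated assumptions: the function name run_triton and the DOCKER override variable are hypothetical, and the host directory is bind-mounted at /models to match the --model-repository path used on this page. Setting DOCKER=echo prints the command instead of running it, which is useful as a dry run.

```shell
# Launch a verification container from an image and a host model directory.
# Arguments: $1 = image name, $2 = host path of the model repository.
# DOCKER is a hypothetical override for the docker executable (default: docker).
# The body runs in a subshell so its variables do not leak to the caller.
run_triton() (
  image=$1
  model_dir=$2
  "${DOCKER:-docker}" run --rm -d \
    -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    -v "$model_dir:/models" \
    "$image" \
    tritonserver --model-repository=/models --model-control-mode=explicit
)

# Hypothetical usage with an empty repository:
#   mkdir -p "$PWD/empty-models"
#   run_triton tritonserver "$PWD/empty-models"
```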

Outputs

Output | Type | Description
HTTP 200 from health endpoint | HTTP response | Confirms the server started successfully and is ready to accept requests
gRPC endpoint active | Network service | gRPC server listening on port 8001
Metrics endpoint active | Network service | Prometheus metrics available on port 8002 at /metrics
Server startup logs | stdout/stderr | Log output showing backend loading status and endpoint initialization

Verification Endpoints

Endpoint | Port | Path | Expected Response
HTTP Health (ready) | 8000 | /v2/health/ready | HTTP 200 when server is ready
HTTP Health (live) | 8000 | /v2/health/live | HTTP 200 when server process is alive
HTTP Server Metadata | 8000 | /v2 | JSON with server name, version, and extensions
Prometheus Metrics | 8002 | /metrics | Prometheus text format with server metrics
gRPC Health | 8001 | gRPC health check | gRPC SERVING status
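
For the metrics row, a quick sanity check is to confirm that the response looks like Prometheus text exposition output. The sketch below is a heuristic, not a full parser; the function name looks_like_prometheus is hypothetical. It assumes the endpoint emits at least one "# TYPE" metadata line, which may not hold if no metrics are registered.

```shell
# Succeed if stdin looks like Prometheus text exposition output, judged by
# the presence of at least one "# TYPE <metric_name> <type>" line.
looks_like_prometheus() {
  grep -q '^# TYPE [A-Za-z_:][A-Za-z0-9_:]* '
}

# Hypothetical usage:
#   curl -s localhost:8002/metrics | looks_like_prometheus \
#     && echo "metrics endpoint OK" \
#     || echo "metrics endpoint returned unexpected content"
```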

Usage Examples

Example 1: Quick smoke test after compose build

# After compose.py completes
docker run --rm -d \
  --gpus all \
  -p 8000:8000 \
  --name triton-test \
  tritonserver \
  tritonserver --model-repository=/models --model-control-mode=explicit

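# Wait, then print only the HTTP status code
# (plain "curl -s" prints the response body, which does not show the status)
sleep 10
curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/health/ready
# Expected output: 200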

docker stop triton-test

Example 2: Detailed verification with log inspection

# Launch in foreground to see logs
docker run --rm \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  tritonserver \
  tritonserver --model-repository=/models --log-verbose=1

# In another terminal:
curl -v localhost:8000/v2/health/ready
curl -s localhost:8000/v2 | python3 -m json.tool
curl -s localhost:8002/metrics | head -20

Example 3: Automated CI verification script

#!/bin/bash
set -e

IMAGE="${1:-tritonserver}"
CONTAINER_NAME="triton-ci-verify"
TIMEOUT=60

# Always remove the container on exit. --rm is omitted from docker run so
# that logs remain available if the server exits during startup.
cleanup() { docker rm -f "$CONTAINER_NAME" >/dev/null 2>&1 || true; }
trap cleanup EXIT

# Launch container
docker run -d \
  -p 8000:8000 \
  --name "$CONTAINER_NAME" \
  "$IMAGE" \
  tritonserver --model-repository=/models --model-control-mode=explicit

# Poll health endpoint with timeout
ELAPSED=0
while [ "$ELAPSED" -lt "$TIMEOUT" ]; do
  # On connection failure curl already prints 000; "|| true" stops set -e from aborting
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" localhost:8000/v2/health/ready 2>/dev/null || true)
  if [ "$STATUS" = "200" ]; then
    echo "PASS: Server is healthy"
    exit 0
  fi
  sleep 2
  ELAPSED=$((ELAPSED + 2))
done

echo "FAIL: Server did not become healthy within ${TIMEOUT}s"
docker logs "$CONTAINER_NAME" || true
exit 1
