Implementation:Triton inference server Server Container Health Check

Field	Value
Page Type	Implementation
Title	Container_Health_Check
Namespace	Triton_inference_server_Server
Workflow	Custom_Container_Build
Domains	Quality_Assurance, Container_Build
Last Updated	2026-02-13 17:00 GMT

Overview

A concrete verification procedure for custom Triton containers, based on docker run and curl.

Description

Container health check verification involves launching the custom-built Triton container with Docker, exposing the standard service ports, and using HTTP requests to confirm the server is healthy and responsive. The procedure validates binary integrity, library linking, backend loading, and endpoint binding in a single operational test.

The verification process follows these steps:

1. Launch the container using docker run with port mappings for HTTP (8000), gRPC (8001), and metrics (8002)
2. Wait for server startup by monitoring container logs or polling the health endpoint
3. Query the health endpoint at /v2/health/ready to confirm the server is ready to accept inference requests
4. Inspect server logs to verify all requested backends loaded without errors
5. Optionally query the metrics endpoint at port 8002 to confirm Prometheus metrics are available

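The health endpoint follows the KServe V2 inference protocol specification: it returns HTTP 200 when the server is ready and a non-200 error status when it is not. While the server is still initializing, the HTTP endpoint may not be bound yet, in which case the request fails with a connection error rather than an HTTP status.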
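The status-code behavior described above can be captured in a small helper. This is a minimal sketch and not part of Triton: the function name `classify_health_status` and its three labels are hypothetical. It relies on curl reporting `000` for `%{http_code}` when no HTTP response was received.

```shell
# Hypothetical helper: map the status code returned for /v2/health/ready
# to a human-readable state. The labels are illustrative only.
classify_health_status() {
  case "$1" in
    200) echo "ready" ;;
    000) echo "unreachable" ;;  # curl prints 000 when no HTTP response arrived
    *)   echo "not-ready" ;;    # endpoint answered, but the server is not ready
  esac
}

# Usage (assumes the container publishes port 8000 on localhost):
#   code=$(curl -s -o /dev/null -w "%{http_code}" localhost:8000/v2/health/ready || true)
#   classify_health_status "$code"
```

Separating "unreachable" from "not-ready" makes it easier to tell a container that never bound its port from one that is still loading.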

Usage

Basic Health Check

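# Launch the container in the background (--rm removes it automatically on stop).
# Assumes /models exists inside the container; otherwise mount a host
# repository by adding: -v <host_path>:/models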
docker run --rm -d \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  --name triton-verify \
  tritonserver \
  tritonserver --model-repository=/models

# Wait for the server to start (poll health endpoint)
for i in $(seq 1 30); do
  if curl -s -o /dev/null -w "%{http_code}" localhost:8000/v2/health/ready | grep -q "200"; then
    echo "Server is ready"
    break
  fi
  echo "Waiting for server... ($i/30)"
  sleep 2
done

# Verify health endpoint
curl -v localhost:8000/v2/health/ready

# Check server metadata
curl -s localhost:8000/v2 | python3 -m json.tool

# Stop the verification container
docker stop triton-verify

GPU-Enabled Health Check

# Launch with GPU access
docker run --rm -d \
  --gpus all \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  --name triton-verify \
  tritonserver \
  tritonserver --model-repository=/models

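# Verify health once startup has finished (reuse the polling loop from the
# basic check; the endpoint refuses connections until the server is up)
curl -v localhost:8000/v2/health/ready

# Stop the verification container
docker stop triton-verify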

Code Reference

Source Location

File	Lines	Description
`src/main.cc`	L439-511	`main()` -- Server entry point that initializes the server and starts all endpoints
`src/main.cc`	L224-300	`StartEndpoints()` -- Initializes and starts HTTP, gRPC, and metrics endpoints
`src/http_server.cc`	L1355-1371	`HandleServerHealth()` -- HTTP handler for the `/v2/health/ready` and `/v2/health/live` endpoints

Signature

# Launch container
docker run --rm \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  <image> \
  tritonserver --model-repository=<path>

# Verify health
curl -v localhost:8000/v2/health/ready

Import

No code imports required. This procedure uses:

docker CLI for container management
curl for HTTP health checks
The tritonserver binary inside the container
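Before running the procedure, it can help to confirm that the host-side tools listed above are installed. The following is a minimal sketch; the helper name `missing_tools` is hypothetical.

```shell
# Hypothetical helper: print the name of every required tool that is not
# found on PATH. Prints nothing when all tools are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage:
#   missing=$(missing_tools docker curl)
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```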

I/O Contract

Inputs

Input	Type	Description
Docker image	Container image	The custom-built Triton container to verify (e.g., `tritonserver` or custom name)
Model repository	Directory/Path	Path to a model repository inside the container. It can be empty for a basic health check: with `--model-control-mode=explicit`, no models are loaded at startup and the server still reaches the READY state
Port mappings	Network config	Standard Triton ports: 8000 (HTTP), 8001 (gRPC), 8002 (metrics)
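For a basic health check with an empty model repository, one option is to create a temporary directory on the host and mount it at the path passed to `--model-repository`. This is a sketch under assumptions: the helper name `make_empty_model_repo` and the directory prefix are hypothetical, and the `/models` mount point simply mirrors the examples on this page.

```shell
# Hypothetical helper: create an empty temporary directory to serve as the
# model repository and print its path.
make_empty_model_repo() {
  mktemp -d "${TMPDIR:-/tmp}/triton-models.XXXXXX"
}

# Usage (image name and flags follow the examples on this page):
#   repo=$(make_empty_model_repo)
#   docker run --rm -d -p 8000:8000 -v "$repo":/models --name triton-verify \
#     tritonserver tritonserver --model-repository=/models --model-control-mode=explicit
```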

Outputs

Output	Type	Description
HTTP 200 from health endpoint	HTTP response	Confirms the server started successfully and is ready to accept requests
gRPC endpoint active	Network service	gRPC server listening on port 8001
Metrics endpoint active	Network service	Prometheus metrics available on port 8002 at `/metrics`
Server startup logs	stdout/stderr	Log output showing backend loading status and endpoint initialization
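Log inspection can be partly automated by counting error-level lines in the startup logs. The sketch below assumes the server's default log format, in which each line begins with a severity letter (I, W, or E) followed by a four-digit month and day; the helper name `count_log_errors` is hypothetical, and the pattern needs adjusting if the server is configured with a different log format.

```shell
# Hypothetical helper: count error-level lines read from stdin.
# Assumption: error lines start with "E" followed by MMDD and a space.
count_log_errors() {
  # grep -c exits non-zero when the count is 0, so mask that for set -e callers
  grep -c -E '^E[0-9]{4} ' || true
}

# Usage:
#   errors=$(docker logs triton-verify 2>&1 | count_log_errors)
#   [ "$errors" -eq 0 ] || echo "Found $errors error line(s) in the server log"
```

A zero count does not prove every backend loaded; it only flags obvious failures for closer reading.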

Verification Endpoints

Endpoint	Port	Path	Expected Response
HTTP Health (ready)	8000	`/v2/health/ready`	HTTP 200 when server is ready
HTTP Health (live)	8000	`/v2/health/live`	HTTP 200 when server process is alive
HTTP Server Metadata	8000	`/v2`	JSON with server name, version, and extensions
Prometheus Metrics	8002	`/metrics`	Prometheus text format with server metrics
gRPC Health	8001	gRPC health check	gRPC SERVING status
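The HTTP rows of the table above can be checked in one pass. This is a minimal sketch: the function name `check_http_endpoints` is hypothetical, and the gRPC row is left out because it requires a gRPC client rather than curl.

```shell
# Hypothetical helper: print "<status> <url>" for each HTTP endpoint in the
# verification table. The host defaults to localhost.
check_http_endpoints() {
  local host="${1:-localhost}"
  local url code
  for url in \
    "$host:8000/v2/health/live" \
    "$host:8000/v2/health/ready" \
    "$host:8000/v2" \
    "$host:8002/metrics"; do
    # curl prints 000 when it cannot connect; "|| true" keeps set -e callers alive
    code=$(curl -s -o /dev/null -w "%{http_code}" "$url" || true)
    echo "$code $url"
  done
}

# Usage (requires ports 8000 and 8002 to be published):
#   check_http_endpoints
```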

Usage Examples

Example 1: Quick smoke test after compose build

# After compose.py completes
docker run --rm -d \
  --gpus all \
  -p 8000:8000 \
  --name triton-test \
  tritonserver \
  tritonserver --model-repository=/models --model-control-mode=explicit

# Wait, then print only the HTTP status code (the response body is empty)
sleep 10
curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/health/ready
# Expected output: 200

docker stop triton-test

Example 2: Detailed verification with log inspection

# Launch in foreground to see logs
docker run --rm \
  -p 8000:8000 \
  -p 8001:8001 \
  -p 8002:8002 \
  tritonserver \
  tritonserver --model-repository=/models --log-verbose=1

# In another terminal:
curl -v localhost:8000/v2/health/ready
curl -s localhost:8000/v2 | python3 -m json.tool
curl -s localhost:8002/metrics | head -20
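
Rather than reading the metrics output by eye, a script can check that the response looks like Prometheus text format. The sketch below tests only for the standard `# HELP` and `# TYPE` metadata lines and makes no assumption about which metric names the server exposes; the helper name `looks_like_prometheus` is hypothetical.

```shell
# Hypothetical helper: succeed if stdin contains at least one Prometheus
# exposition-format metadata line ("# HELP ..." or "# TYPE ...").
looks_like_prometheus() {
  grep -q -E '^# (HELP|TYPE) '
}

# Usage:
#   curl -s localhost:8002/metrics | looks_like_prometheus && echo "metrics OK"
```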

Example 3: Automated CI verification script

#!/bin/bash
set -e

IMAGE="${1:-tritonserver}"
CONTAINER_NAME="triton-ci-verify"

# Launch container
docker run --rm -d \
  -p 8000:8000 \
  --name "$CONTAINER_NAME" \
  "$IMAGE" \
  tritonserver --model-repository=/models --model-control-mode=explicit

# Poll health endpoint with timeout
TIMEOUT=60
ELAPSED=0
while [ $ELAPSED -lt $TIMEOUT ]; do
  # curl already prints 000 when it cannot connect, so only mask its exit code
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" localhost:8000/v2/health/ready 2>/dev/null || true)
  if [ "$STATUS" = "200" ]; then
    echo "PASS: Server is healthy"
    docker stop "$CONTAINER_NAME"
    exit 0
  fi
  sleep 2
  ELAPSED=$((ELAPSED + 2))
done

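echo "FAIL: Server did not become healthy within ${TIMEOUT}s"
# With --rm, a container that has already exited is removed, so these
# commands may fail; do not let set -e abort before the final exit code
docker logs "$CONTAINER_NAME" || true
docker stop "$CONTAINER_NAME" || true
exit 1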
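
The script above calls docker stop on both the pass and fail paths. One way to avoid that duplication is a wrapper that runs a check and then always stops the container. This is a sketch only; the function name `with_container_cleanup` is hypothetical.

```shell
# Hypothetical wrapper: run a check command, then stop the named container
# regardless of the outcome, and return the check's exit status.
with_container_cleanup() {
  local name="$1"
  shift
  local status=0
  "$@" || status=$?                          # capture failure without tripping set -e
  docker stop "$name" >/dev/null 2>&1 || true
  return "$status"
}

# Usage (wait_for_ready stands for any polling function defined by the caller):
#   with_container_cleanup triton-ci-verify wait_for_ready
```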
