Implementation:Sgl project Sglang Health And Metrics Endpoints
| Knowledge Sources | |
|---|---|
| Domains | LLM_Serving, Monitoring, Operations |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
HTTP endpoints for checking server health and collecting operational metrics from a running SGLang server.
Description
SGLang's HTTP server exposes multiple monitoring endpoints: /health returns a simple status code (200 or 503), /server_info returns a JSON object with model information and internal state, and /metrics (when --enable-metrics is set) returns Prometheus-format metrics text. These are standard FastAPI GET endpoints.
Usage
Use /health for load balancer health checks. Use /server_info for debugging model configuration. Use /metrics with Prometheus + Grafana for production monitoring dashboards.
Code Reference
Source Location
- Repository: sglang
- File: python/sglang/srt/entrypoints/http_server.py
- Lines: L453-454 (/health), L582-599 (/server_info)
Signature
# Health check endpoint
@app.get("/health")
async def health() -> Response:
    """Returns 200 if healthy, 503 if starting/shutting down."""

# Server info endpoint
@app.get("/server_info")
async def server_info() -> Dict:
    """Returns JSON with model info, version, and internal state."""

# Prometheus metrics (requires --enable-metrics)
@app.get("/metrics")
async def metrics() -> Response:
    """Returns Prometheus-format metrics text."""
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| (none) | HTTP GET | Yes | Simple GET requests to the respective endpoints |
Outputs
| Name | Type | Description |
|---|---|---|
| /health | HTTP 200 or 503 | 200 when healthy; 503 while starting up or shutting down |
| /server_info | JSON Dict | Model info, version, configuration, internal states |
| /metrics | Prometheus text | Time-series metrics (request count, latency, tokens/sec) |
Usage Examples
Health Check
# Simple health check
curl http://localhost:30000/health
# Returns empty 200 OK if healthy
# Use in health check scripts (-s: silent, -f: non-zero exit on HTTP errors such as 503)
if curl -sf http://localhost:30000/health; then
    echo "Server is healthy"
else
    echo "Server is not ready"
fi
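A deployment script often needs to wait for the server to become ready rather than probe it once. The following is a minimal sketch of such a readiness poll, using only the Python standard library. The helper names `probe_health` and `wait_until_healthy`, the default timeouts, and the convention of returning 0 for an unreachable server are illustrative assumptions, not part of SGLang.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_health(base_url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status of GET /health, or 0 if the server is unreachable.

    Hypothetical helper: the 0 sentinel is a convention of this sketch only.
    """
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises on non-2xx responses; 503 arrives here.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def wait_until_healthy(
    base_url: str,
    deadline_s: float = 60.0,
    interval_s: float = 1.0,
    probe: Callable[[str], int] = probe_health,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll /health until it returns 200 or the deadline passes."""
    end = clock() + deadline_s
    while True:
        if probe(base_url) == 200:
            return True
        if clock() >= end:
            return False
        sleep(interval_s)


# Example call (assumes the server from the examples above on port 30000):
#   ready = wait_until_healthy("http://localhost:30000", deadline_s=120)
```

The `probe`, `sleep`, and `clock` parameters are injectable so the loop can be exercised without a running server.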
Server Info
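import requests

# Fetch server metadata; fail fast on a hung connection or a non-2xx response
resp = requests.get("http://localhost:30000/server_info", timeout=5)
resp.raise_for_status()
info = resp.json()
print(f"Model: {info.get('model_path')}")
print(f"Version: {info.get('version')}")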
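Where the `requests` package is not available, the same lookup can be sketched with the standard library alone. The helpers `fetch_server_info` and `pick_fields` are hypothetical names introduced for illustration; the only response keys assumed are `model_path` and `version`, as used in the example above.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Iterable


def fetch_server_info(
    base_url: str,
    timeout: float = 5.0,
    opener: Callable[..., Any] = urllib.request.urlopen,
) -> Dict[str, Any]:
    """GET /server_info and decode the JSON body (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/server_info"
    with opener(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def pick_fields(info: Dict[str, Any], fields: Iterable[str]) -> Dict[str, Any]:
    """Keep only the requested keys; absent keys map to None."""
    return {name: info.get(name) for name in fields}


# Example call (assumes a server listening on localhost:30000):
#   info = fetch_server_info("http://localhost:30000")
#   print(pick_fields(info, ["model_path", "version"]))
```

Selecting a few fields keeps debugging output short, since the full response also carries internal state.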
Prometheus Metrics
# Launch server with metrics enabled
python -m sglang.launch_server --model-path ... --enable-metrics
# Fetch Prometheus metrics
curl http://localhost:30000/metrics
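For ad hoc inspection without a Prometheus server, the text returned by /metrics can be read into a dictionary. The sketch below is a deliberately small parser for the Prometheus text exposition format: it skips comment lines, ignores optional timestamps, and keeps the label set as part of the key. The function name `parse_prometheus_text` and the metric names in the sample are made up for illustration and are not actual SGLang metric names.

```python
from typing import Dict


def parse_prometheus_text(text: str) -> Dict[str, float]:
    """Map each series (name plus label set, verbatim) to its sample value.

    Simplified sketch: not a full implementation of the exposition format.
    """
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a "# HELP" / "# TYPE" comment
        if "}" in line:
            # Labelled series: everything up to the last "}" is the key.
            close = line.rfind("}")
            series = line[: close + 1]
            rest = line[close + 1 :].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if not rest:
            continue  # malformed line without a value
        # rest[0] is the value; an optional timestamp in rest[1] is ignored.
        samples[series] = float(rest[0])
    return samples


# Hypothetical sample in exposition format (metric names are placeholders).
SAMPLE = """\
# HELP demo_requests_total Placeholder counter.
# TYPE demo_requests_total counter
demo_requests_total{model="m"} 3
demo_latency_seconds 0.25 1700000000
"""
```

In production, a maintained client library or Prometheus itself is the more robust choice; this sketch only suits quick checks.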