Implementation:Sgl project Sglang Health And Metrics Endpoints

Knowledge Sources	SGLang
Domains	LLM_Serving, Monitoring, Operations
Last Updated	2026-02-10 00:00 GMT

Overview

HTTP endpoints for checking server health and collecting operational metrics from a running SGLang server.

Description

SGLang's HTTP server exposes three monitoring endpoints. /health returns only an HTTP status code: 200 when the server is healthy, 503 while it is starting or shutting down. /server_info returns a JSON object with model information and internal state. /metrics returns Prometheus-format metrics text and is available only when the server is launched with --enable-metrics. All three are standard FastAPI GET endpoints.

Usage

Use /health for load balancer health checks. Use /server_info for debugging model configuration. Use /metrics with Prometheus + Grafana for production monitoring dashboards.
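A deployment script often needs to block until the server is ready before sending traffic. The following is a minimal sketch of such a readiness poll, using only the Python standard library. The helper names (probe_health, wait_until_healthy) and the default deadline and interval are illustrative assumptions, not part of SGLang.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_health(base_url: str, timeout_s: float = 2.0) -> int:
    """Return the HTTP status of GET {base_url}/health, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # Non-2xx answer, e.g. 503 while the server starts or shuts down
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar
        return 0


def wait_until_healthy(
    base_url: str,
    deadline_s: float = 300.0,   # hypothetical default, tune per model size
    interval_s: float = 2.0,     # hypothetical default
    probe: Callable[[str], int] = probe_health,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll /health until it returns 200 or the deadline passes."""
    stop_at = time.monotonic() + deadline_s
    while True:
        if probe(base_url) == 200:
            return True
        if time.monotonic() >= stop_at:
            return False
        sleep(interval_s)
```

A caller would typically run wait_until_healthy("http://localhost:30000") and abort the rollout if it returns False. The probe and sleep parameters are injectable so the loop can be exercised without a live server.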

Code Reference

Source Location

Repository: sglang
File: python/sglang/srt/entrypoints/http_server.py
Lines: L453-454 (/health), L582-599 (/server_info)

Signature

# Health check endpoint
@app.get("/health")
async def health() -> Response:
    """Returns 200 if healthy, 503 if starting/shutting down."""

# Server info endpoint
@app.get("/server_info")
async def server_info() -> Dict:
    """Returns JSON with model info, version, and internal state."""

# Prometheus metrics (requires --enable-metrics)
@app.get("/metrics")
async def metrics() -> Response:
    """Returns Prometheus-format metrics text."""

I/O Contract

Inputs

Name	Type	Required	Description
(none)	HTTP GET	Yes	Simple GET requests to the respective endpoints

Outputs

Name	Type	Description
/health	HTTP 200 or 503	Health status
/server_info	JSON Dict	Model info, version, configuration, internal states
/metrics	Prometheus text	Time-series metrics (request count, latency, tokens/sec)
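Prometheus counters are cumulative, so a throughput figure such as tokens/sec is usually derived from two scrapes rather than read directly. The sketch below shows that arithmetic under the assumption that the server exports a cumulative token counter; the source does not list the metric names, so CounterSample and per_second_rate are hypothetical helpers for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterSample:
    timestamp_s: float  # when the scrape happened, in seconds
    value: float        # cumulative counter value at that time


def per_second_rate(earlier: CounterSample, later: CounterSample) -> float:
    """Average per-second increase of a cumulative counter between two scrapes."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.value - earlier.value
    if delta < 0:
        # The counter went backwards, which suggests a restart reset it to
        # zero; count only what accumulated since the reset.
        delta = later.value
    return delta / elapsed
```

In a Prometheus and Grafana setup this calculation is normally done by the query layer, not by hand; the sketch only makes the underlying arithmetic explicit.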

Usage Examples

Health Check

# Simple health check
curl http://localhost:30000/health
# Returns empty 200 OK if healthy

# Use in health check scripts
if curl -sf http://localhost:30000/health; then
    echo "Server is healthy"
else
    echo "Server is not ready"
fi

Server Info

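import requests

# Fetch server info; fail fast on network stalls or non-2xx responses
resp = requests.get("http://localhost:30000/server_info", timeout=5)
resp.raise_for_status()
info = resp.json()
print(f"Model: {info.get('model_path')}")
print(f"Version: {info.get('version')}")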

Prometheus Metrics

# Launch server with metrics enabled
python -m sglang.launch_server --model-path ... --enable-metrics

# Fetch Prometheus metrics
curl http://localhost:30000/metrics
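For ad hoc inspection outside Prometheus, the text returned by /metrics can be reduced to a name-to-value mapping. This is a simplified sketch of parsing the Prometheus text exposition format, not a complete parser; parse_prometheus_text is a hypothetical helper, and the metric names in the usage note are made up for illustration. For production use, the prometheus_client package ships a full parser (prometheus_client.parser.text_string_to_metric_families).

```python
from typing import Dict


def parse_prometheus_text(text: str) -> Dict[str, float]:
    """Map each sample's 'name{labels}' string to its float value."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" comments
        if "}" in line:
            # Labelled series: everything up to the last '}' is the series id
            head, _, rest = line.rpartition("}")
            series = head + "}"
        else:
            # Unlabelled series: the name ends at the first space
            series, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue  # malformed line without a value
        # First field is the value; an optional timestamp may follow
        samples[series.strip()] = float(fields[0])
    return samples
```

Given the text 'demo_requests_total{model="m"} 42', the function yields a single entry keyed by 'demo_requests_total{model="m"}' with value 42.0. The body of a real /metrics response can be passed in the same way after fetching it with curl or requests.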

Related Pages

Implements Principle

Principle:Sgl_project_Sglang_Server_Health_Monitoring
