Implementation:Sgl project Sglang Health And Metrics Endpoints

From Leeroopedia


Knowledge Sources
Domains: LLM_Serving, Monitoring, Operations
Last Updated: 2026-02-10 00:00 GMT

Overview

Concrete tool for checking server health and collecting operational metrics from a running SGLang server.

Description

SGLang's HTTP server exposes three monitoring endpoints. /health returns only a status code: 200 if the server is healthy, 503 if it is starting or shutting down. /server_info returns a JSON object with model information, version, and internal state. /metrics returns Prometheus-format metrics text and is available only when the server is launched with --enable-metrics. All three are standard FastAPI GET endpoints.

Usage

Use /health for load balancer health checks. Use /server_info for debugging model configuration. Use /metrics with Prometheus + Grafana for production monitoring dashboards.

Code Reference

Source Location

  • Repository: sglang
  • File: python/sglang/srt/entrypoints/http_server.py
  • Lines: L453-454 (/health), L582-599 (/server_info)

Signature

# Health check endpoint
@app.get("/health")
async def health() -> Response:
    """Returns 200 if healthy, 503 if starting/shutting down."""

# Server info endpoint
@app.get("/server_info")
async def server_info() -> Dict:
    """Returns JSON with model info, version, and internal state."""

# Prometheus metrics (requires --enable-metrics)
@app.get("/metrics")
async def metrics() -> Response:
    """Returns Prometheus-format metrics text."""

I/O Contract

Inputs

Name | Type | Required | Description
(none) | HTTP GET | Yes | Simple GET requests to the respective endpoints

Outputs

Name | Type | Description
/health | HTTP 200 or 503 | Health status
/server_info | JSON Dict | Model info, version, configuration, internal states
/metrics | Prometheus text | Time-series metrics (request count, latency, tokens/sec)

Usage Examples

Health Check

# Simple health check
curl http://localhost:30000/health
# Returns empty 200 OK if healthy

# Use in health check scripts
if curl -sf http://localhost:30000/health; then
    echo "Server is healthy"
else
    echo "Server is not ready"
fi
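
The same check can be run from Python when a deployment script needs to wait for the server to finish starting. The following is a minimal sketch using only the standard library. The helper names `is_healthy` and `wait_until_healthy`, the retry count, and the delay are illustrative assumptions, not part of SGLang. It relies only on the documented contract that /health answers 200 when healthy and 503 otherwise.

```python
import time
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True only if GET {base_url}/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Non-2xx responses such as 503 (starting or shutting down) land here.
        return False
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: server not reachable yet.
        return False


def wait_until_healthy(base_url: str, attempts: int = 30, delay: float = 1.0) -> bool:
    """Poll /health up to `attempts` times (hypothetical helper, not an SGLang API)."""
    for attempt in range(attempts):
        if is_healthy(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # back off before the next probe
    return False
```

Calling `wait_until_healthy("http://localhost:30000")` returns True as soon as one probe sees a 200, and False if every attempt fails. Treating connection errors the same as a 503 keeps the caller's logic simple, at the cost of not distinguishing "still loading" from "not running at all".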

Server Info

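import requests

# Fetch server metadata; fail fast on connection problems or non-2xx responses.
resp = requests.get("http://localhost:30000/server_info", timeout=5)
resp.raise_for_status()
info = resp.json()
# .get() returns None if a key is absent from the response.
print(f"Model: {info.get('model_path')}")
print(f"Version: {info.get('version')}")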

Prometheus Metrics

# Launch server with metrics enabled
python -m sglang.launch_server --model-path ... --enable-metrics

# Fetch Prometheus metrics
curl http://localhost:30000/metrics
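
For ad-hoc inspection without a Prometheus server, the text returned by /metrics can be parsed directly. The sketch below handles the common case of the Prometheus text exposition format: it skips comment lines and reads `name{labels} value` samples. The function name `parse_prometheus_text` is a hypothetical helper, and the metric names in the usage note are made up for illustration, not actual SGLang metric names. For production use, an established parser such as the one in the `prometheus_client` package is likely a better choice.

```python
import re
from typing import Dict, List, Tuple

# One sample line: metric name, optional {labels}, then a numeric value.
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value" (escapes are kept as-is).
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')

Sample = Tuple[str, Dict[str, str], float]


def parse_prometheus_text(text: str) -> List[Sample]:
    """Parse Prometheus exposition text into (name, labels, value) tuples."""
    samples: List[Sample] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # not a recognizable sample line
        name, raw_labels, raw_value = match.groups()
        try:
            value = float(raw_value)  # also accepts NaN, +Inf, -Inf
        except ValueError:
            continue
        labels = dict(_LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, value))
    return samples
```

Given the illustrative input line `demo_requests_total{model="demo"} 42`, the function yields one tuple with the name `demo_requests_total`, the label dict `{"model": "demo"}`, and the value `42.0`. In practice the input would be the response body from the /metrics request shown above.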

Related Pages

Implements Principle
