Implementation:Sgl project Sglang Init Distributed Environment

From Leeroopedia


Knowledge Sources
Domains Distributed_Computing, GPU_Parallelism
Last Updated 2026-02-10 00:00 GMT

Overview

Concrete tool for initializing PyTorch distributed process groups for multi-GPU model serving and quantization.

Description

The init_distributed_environment function initializes torch.distributed with the specified backend, world size, and rank. It creates the global process group and sets the local rank for GPU assignment. It supports NCCL (GPU), Gloo (CPU), and Mooncake backends. If torch.distributed is already initialized, it validates the existing configuration.

Usage

Call init_distributed_environment at the beginning of standalone multi-GPU scripts (e.g., ModelOpt quantization). For standard Engine/Server usage, this is called automatically.

Code Reference

Source Location

  • Repository: sglang
  • File: python/sglang/srt/distributed/parallel_state.py
  • Lines: L1491-1555

Signature

def init_distributed_environment(
    world_size: int = -1,
    rank: int = -1,
    distributed_init_method: str = "env://",
    local_rank: int = -1,
    backend: str = "nccl",
    timeout: Optional[int] = None,
) -> None:
    """Initialize distributed environment for multi-GPU execution."""

Import

from sglang.srt.distributed.parallel_state import (
    init_distributed_environment,
    initialize_model_parallel,
)

I/O Contract

Inputs

Name Type Required Description
world_size int No Total process count (-1 = auto from env)
rank int No Process rank (-1 = auto from env)
distributed_init_method str No Init method (default: "env://")
local_rank int No Local GPU rank (-1 = auto)
backend str No Communication backend (default: "nccl")
timeout Optional[int] No Timeout in seconds
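The "-1 = auto from env" defaults in the table follow the usual torchrun convention: unset arguments are filled in from the `WORLD_SIZE`, `RANK`, and `LOCAL_RANK` environment variables. The sketch below shows that resolution pattern; it is illustrative, not sglang's actual implementation.

```python
import os

# Illustrative resolution of "-1 = auto" arguments from torchrun-style
# environment variables (not sglang's actual code).
def resolve_dist_args(world_size: int = -1, rank: int = -1, local_rank: int = -1):
    if world_size == -1:
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
    if rank == -1:
        rank = int(os.environ.get("RANK", "0"))
    if local_rank == -1:
        local_rank = int(os.environ.get("LOCAL_RANK", "0"))
    return world_size, rank, local_rank
```

Explicit arguments win; only `-1` values fall back to the environment.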

Outputs

Name Type Description
(none) None Initialized torch.distributed process group; global _WORLD set

Usage Examples

ModelOpt Quantization Script

import torch
from sglang.srt.distributed.parallel_state import (
    init_distributed_environment,
    initialize_model_parallel,
)

# Initialize for single-GPU quantization
init_distributed_environment(
    world_size=1,
    rank=0,
    distributed_init_method="tcp://127.0.0.1:12345",
    local_rank=0,
    backend="nccl",
)
initialize_model_parallel(tensor_model_parallel_size=1)

# Now proceed with model loading and quantization...
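The hard-coded `tcp://127.0.0.1:12345` rendezvous address above can collide with another process already bound to that port. A common workaround (a general pattern, not part of sglang) is to ask the OS for a free port and build the init method string from it:

```python
import socket

# Build a tcp:// init method on a port the OS reports as free.
# Note: the port could in principle be taken again between this check
# and the actual bind, so this is a best-effort helper.
def free_tcp_init_method(host: str = "127.0.0.1") -> str:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))           # port 0 asks the OS for any free port
        port = s.getsockname()[1]
    return f"tcp://{host}:{port}"
```

The returned string can then be passed as `distributed_init_method`.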

Multi-GPU Setup

# Typically launched via torchrun:
# torchrun --nproc_per_node=4 script.py

import os

init_distributed_environment(
    world_size=4,
    rank=int(os.environ["RANK"]),
    local_rank=int(os.environ["LOCAL_RANK"]),
    backend="nccl",
)
initialize_model_parallel(tensor_model_parallel_size=4)

Related Pages

Implements Principle

Requires Environment
