Implementation: SGLang get_model_loader
| Knowledge Sources | |
|---|---|
| Domains | Quantization, Model_Loading, Model_Optimization |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Factory function that selects and instantiates the appropriate model loader for standard or ModelOpt-quantized model loading in SGLang.
Description
The get_model_loader factory function inspects LoadConfig and ModelConfig to determine the appropriate loader class. For ModelOpt quantization (modelopt_fp8, modelopt_fp4), it returns a ModelOptModelLoader that handles the complete quantize-export pipeline. The ModelOptModelLoader.load_model method loads the base model, applies NVIDIA ModelOpt quantization, and optionally exports the quantized result.
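The dispatch described above can be sketched as a plain factory function. The stub config and loader classes below are illustrative stand-ins for SGLang's real classes, not the actual implementation; only the `modelopt_fp8`/`modelopt_fp4` routing rule comes from the description:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative stand-ins for SGLang's real config and loader classes.
@dataclass
class LoadConfig:
    modelopt_export_path: Optional[str] = None

@dataclass
class ModelConfig:
    quantization: Optional[str] = None

class BaseModelLoader:
    def __init__(self, load_config: LoadConfig):
        self.load_config = load_config

class DefaultModelLoader(BaseModelLoader):
    pass

class ModelOptModelLoader(DefaultModelLoader):
    pass

def get_model_loader(
    load_config: LoadConfig,
    model_config: Optional[ModelConfig] = None,
) -> BaseModelLoader:
    # ModelOpt quantization routes to the loader that runs the
    # quantize-export pipeline; everything else gets the default loader.
    if model_config is not None and model_config.quantization in (
        "modelopt_fp8",
        "modelopt_fp4",
    ):
        return ModelOptModelLoader(load_config)
    return DefaultModelLoader(load_config)
```

The key design point is that the factory inspects configuration only; the expensive work (loading, quantizing, exporting) is deferred to the returned loader's `load_model` call.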
Usage
Call get_model_loader to obtain the correct loader, then call loader.load_model to execute loading and optional quantization. This is primarily used in standalone quantization scripts.
Code Reference
Source Location
- Repository: sglang
- File: python/sglang/srt/model_loader/loader.py
- Lines: L2713-2742 (get_model_loader), L2614-2637 (ModelOptModelLoader.load_model)
Signature
```python
def get_model_loader(
    load_config: LoadConfig,
    model_config: Optional[ModelConfig] = None,
) -> BaseModelLoader:
    """Get a model loader based on the load format and quantization config."""

class ModelOptModelLoader(DefaultModelLoader):
    def load_model(
        self,
        *,
        model_config: ModelConfig,
        device_config: DeviceConfig,
    ) -> nn.Module:
        """Load and optionally quantize model using NVIDIA ModelOpt."""
```
Import
```python
from sglang.srt.model_loader.loader import get_model_loader
from sglang.srt.configs.load_config import LoadConfig
from sglang.srt.configs.model_config import ModelConfig
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| load_config | LoadConfig | Yes | Loading format and export path configuration |
| model_config | Optional[ModelConfig] | No | Model configuration with quantization settings |
| device_config | DeviceConfig | Yes (for load_model) | Target device for model placement |
Outputs
| Name | Type | Description |
|---|---|---|
| model_loader | BaseModelLoader | Selected model loader instance |
| model | nn.Module | Loaded (and optionally quantized) model (from load_model) |
Usage Examples
Quantize and Export
```python
from sglang.srt.model_loader.loader import get_model_loader
from sglang.srt.configs.model_config import ModelConfig
from sglang.srt.configs.load_config import LoadConfig
# DeviceConfig import path assumed; the original example used
# device_config without defining it.
from sglang.srt.configs.device_config import DeviceConfig

model_config = ModelConfig(
    model_path="meta-llama/Llama-3.1-8B-Instruct",
    quantization="modelopt_fp8",
)
load_config = LoadConfig(
    modelopt_export_path="/tmp/quantized_model",
)
device_config = DeviceConfig(device="cuda")

# Factory selects ModelOptModelLoader
loader = get_model_loader(load_config, model_config)

# Load, quantize, and export
model = loader.load_model(
    model_config=model_config,
    device_config=device_config,
)
```