
Implementation:Sgl project Sglang Get Model Loader

From Leeroopedia


Knowledge Sources
Domains Quantization, Model_Loading, Model_Optimization
Last Updated 2026-02-10 00:00 GMT

Overview

A factory utility that selects and executes the appropriate model loader for standard or quantized model loading in SGLang.

Description

The get_model_loader factory function inspects LoadConfig and ModelConfig to determine the appropriate loader class. For ModelOpt quantization (modelopt_fp8, modelopt_fp4), it returns a ModelOptModelLoader that handles the complete quantize-export pipeline. The ModelOptModelLoader.load_model method loads the base model, applies NVIDIA ModelOpt quantization, and optionally exports the quantized result.
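The dispatch described above can be pictured with a minimal, self-contained sketch. The class and field names below are illustrative stand-ins, not SGLang's exact API; the real loader.py handles many more load formats and configuration fields.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for SGLang's LoadConfig / ModelConfig.
# Field names here are illustrative only.
@dataclass
class LoadConfig:
    load_format: str = "auto"

@dataclass
class ModelConfig:
    quantization: Optional[str] = None

class DefaultModelLoader:
    """Baseline loader used when no special handling is required."""
    def __init__(self, load_config: LoadConfig):
        self.load_config = load_config

class ModelOptModelLoader(DefaultModelLoader):
    """Specialized loader running the load -> quantize -> export pipeline."""

def get_model_loader(
    load_config: LoadConfig,
    model_config: Optional[ModelConfig] = None,
) -> DefaultModelLoader:
    # ModelOpt quantization values route to the specialized loader;
    # everything else falls back to the default loader.
    if model_config is not None and model_config.quantization in (
        "modelopt_fp8",
        "modelopt_fp4",
    ):
        return ModelOptModelLoader(load_config)
    return DefaultModelLoader(load_config)

# ModelOpt quantization selects the specialized loader.
loader = get_model_loader(LoadConfig(), ModelConfig(quantization="modelopt_fp8"))
print(type(loader).__name__)
```

The point of the factory pattern here is that callers never branch on quantization settings themselves: they hand over the configs and receive a loader whose `load_model` already knows whether to run the quantize-export pipeline.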

Usage

Call get_model_loader to obtain the correct loader, then call loader.load_model to execute loading and optional quantization. This is primarily used in standalone quantization scripts.

Code Reference

Source Location

  • Repository: sglang
  • File: python/sglang/srt/model_loader/loader.py
  • Lines: L2713-2742 (get_model_loader), L2614-2637 (ModelOptModelLoader.load_model)

Signature

def get_model_loader(
    load_config: LoadConfig,
    model_config: Optional[ModelConfig] = None,
) -> BaseModelLoader:
    """Get a model loader based on the load format and quantization config."""

class ModelOptModelLoader(DefaultModelLoader):
    def load_model(
        self,
        *,
        model_config: ModelConfig,
        device_config: DeviceConfig,
    ) -> nn.Module:
        """Load and optionally quantize model using NVIDIA ModelOpt."""

Import

from sglang.srt.model_loader.loader import get_model_loader
from sglang.srt.configs.load_config import LoadConfig
from sglang.srt.configs.model_config import ModelConfig

I/O Contract

Inputs

Name           Type                   Required              Description
load_config    LoadConfig             Yes                   Loading format and export path configuration
model_config   Optional[ModelConfig]  No                    Model configuration with quantization settings
device_config  DeviceConfig           Yes (for load_model)  Target device for model placement

Outputs

Name          Type             Description
model_loader  BaseModelLoader  Selected model loader instance (from get_model_loader)
model         nn.Module        Loaded and optionally quantized model (from load_model)

Usage Examples

Quantize and Export

from sglang.srt.model_loader.loader import get_model_loader
from sglang.srt.configs.device_config import DeviceConfig
from sglang.srt.configs.load_config import LoadConfig
from sglang.srt.configs.model_config import ModelConfig

model_config = ModelConfig(
    model_path="meta-llama/Llama-3.1-8B-Instruct",
    quantization="modelopt_fp8",
)
load_config = LoadConfig(
    modelopt_export_path="/tmp/quantized_model",
)
# DeviceConfig import path and constructor are assumed here;
# adjust to your SGLang version if it differs.
device_config = DeviceConfig(device="cuda")

# Factory selects ModelOptModelLoader for modelopt_* quantization
loader = get_model_loader(load_config, model_config)

# Load the base model, quantize it, and export to modelopt_export_path
model = loader.load_model(
    model_config=model_config,
    device_config=device_config,
)

Related Pages

Implements Principle

Requires Environment
