
Implementation:Predibase Lorax LoRA Weights Load

From Leeroopedia


Knowledge Sources
Domains Parameter_Efficient_Finetuning, Model_Serving
Last Updated 2026-02-08 02:00 GMT

Overview

A concrete tool for loading LoRA adapter weights into GPU memory, provided by the LoraConfig and LoraWeights classes.

Description

LoraConfig loads adapter configuration from HuggingFace (via peft.LoraConfig) and maps weight names to module positions. LoraWeights.load() iterates over model layers, loads the LoRA A and B matrices from safetensors, applies scaling, transposes for kernel compatibility, and pads ranks for SGMV/BGMV kernel requirements.
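The per-layer steps above (scale, transpose, pad) can be sketched as follows. This is an illustrative stand-in, not the actual LoRAX code: the function name, the choice to fold scaling into A, and the pad_multiple of 8 are assumptions for the example.

```python
import math
import torch

def preprocess_lora_pair(
    a: torch.Tensor,        # LoRA A matrix, shape (r, hidden_size) as stored by PEFT
    b: torch.Tensor,        # LoRA B matrix, shape (hidden_size, r)
    lora_alpha: int,
    r: int,
    use_rslora: bool = False,
    pad_multiple: int = 8,  # hypothetical rank granularity required by SGMV/BGMV kernels
):
    """Scale, transpose, and rank-pad one layer's LoRA pair (illustrative sketch)."""
    # Standard LoRA scales the update BA by alpha / r; rsLoRA uses alpha / sqrt(r).
    scale = lora_alpha / math.sqrt(r) if use_rslora else lora_alpha / r
    a = a * scale  # fold the scaling into A once, so kernels skip it at decode time

    # Transpose for kernel-friendly layout.
    a_t = a.T.contiguous()  # (hidden_size, r)
    b_t = b.T.contiguous()  # (r, hidden_size)

    # Round the rank up to the kernel's granularity, zero-padding so the extra
    # rows/columns contribute nothing to the output.
    padded_r = ((r + pad_multiple - 1) // pad_multiple) * pad_multiple
    if padded_r != r:
        a_t = torch.nn.functional.pad(a_t, (0, padded_r - r))        # pad rank (last dim)
        b_t = torch.nn.functional.pad(b_t, (0, 0, 0, padded_r - r))  # pad rank (first dim)
    return a_t, b_t
```

Folding the scale into A means the serving kernels compute a plain x @ A @ B without a per-token multiply; the zero padding keeps the padded rank mathematically inert.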

Usage

Called internally when a new adapter is requested; end users do not call it directly. The router's adapter-loading pipeline triggers it via gRPC.

Code Reference

Source Location

  • Repository: LoRAX
  • File: server/lorax_server/adapters/lora.py
  • Lines: 29-210

Signature

@dataclass
class LoraConfig(AdapterConfig):
    r: int
    target_modules: Optional[Union[List[str], str]]
    fan_in_fan_out: bool
    lora_alpha: int
    use_rslora: bool

    @classmethod
    def load(cls, adapter_id: str, api_token: str) -> "LoraConfig":
        """Load LoRA config from HuggingFace Hub."""

class LoraWeights(AdapterWeights):
    def __init__(
        self,
        weights_a: List[torch.Tensor],
        weights_b: List[torch.Tensor],
        adapter_config: LoraConfig,
    ):

    @classmethod
    def load(
        cls,
        config: LoraConfig,
        model: "Model",
        module_map: Dict[str, Dict],
        layer_type: str,
        unused_weight_names: Set[str],
    ) -> Optional[AdapterWeights]:
        """Load LoRA weights for all layers of a given type."""

Import

from lorax_server.adapters.lora import LoraConfig, LoraWeights

I/O Contract

Inputs

  • adapter_id (str, required): HuggingFace adapter ID or local path
  • api_token (str, optional): Auth token for private adapters
  • config (LoraConfig, required): Adapter config (rank, target_modules, alpha)
  • model (Model, required): Base model instance for layer mapping
  • module_map (Dict[str, Dict], required): Weight-name-to-tensor mapping
  • layer_type (str, required): Layer type to load (e.g., "q_proj")

Outputs

  • weights (Optional[LoraWeights]): Stacked LoRA A/B tensors, or None if the adapter has no weights for the requested layer type
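The "stacked" output can be sketched as a simple batching step: per-layer A and B matrices are concatenated along a new leading layer dimension so kernels can index them by layer. This is a hypothetical helper; the real LoraWeights class carries additional metadata.

```python
import torch

def stack_layer_weights(weights_a, weights_b):
    """Stack per-layer LoRA tensors into single batched tensors (illustrative sketch).

    weights_a: list of (hidden_size, r) tensors, one per transformer layer
    weights_b: list of (r, hidden_size) tensors, one per transformer layer
    """
    stacked_a = torch.stack(weights_a)  # (num_layers, hidden_size, r)
    stacked_b = torch.stack(weights_b)  # (num_layers, r, hidden_size)
    return stacked_a, stacked_b
```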

Usage Examples

Internal Adapter Loading

# Called internally during adapter resolution
config = LoraConfig.load("my-org/my-adapter", api_token="hf_xxx")
# config.r = 16, config.lora_alpha = 32, config.target_modules = ["q_proj", "v_proj"]

# Load weights for each layer type
weights = LoraWeights.load(
    config=config,
    model=model,
    module_map=module_map,
    layer_type="q_proj",
    unused_weight_names=unused,
)
# weights.weights_a shape: [num_layers, hidden_size, r]
# weights.weights_b shape: [num_layers, r, hidden_size]
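Because load() returns None for layer types the adapter does not target (e.g. target_modules = ["q_proj", "v_proj"] leaves k_proj and o_proj unloaded), a caller collects only the layer types that resolve. A minimal sketch with a stand-in loader; the real call is LoraWeights.load as shown above:

```python
def collect_adapter_weights(layer_types, load_fn):
    """Keep only the layer types for which the loader returned weights."""
    loaded = {}
    for layer_type in layer_types:
        weights = load_fn(layer_type)  # real code: LoraWeights.load(..., layer_type=layer_type)
        if weights is not None:        # None means the adapter does not target this module
            loaded[layer_type] = weights
    return loaded

# Stand-in loader: pretend the adapter only targets q_proj and v_proj.
fake_load = lambda lt: {"name": lt} if lt in {"q_proj", "v_proj"} else None
result = collect_adapter_weights(["q_proj", "k_proj", "v_proj", "o_proj"], fake_load)
```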

Related Pages

Implements Principle

Requires Environment

Uses Heuristic
