Implementation: Predibase LoRAX LoRA Weights Load
| Knowledge Sources | |
|---|---|
| Domains | Parameter_Efficient_Finetuning, Model_Serving |
| Last Updated | 2026-02-08 02:00 GMT |
Overview
A concrete tool for loading LoRA adapter weights into GPU memory, provided by the LoraConfig and LoraWeights classes.
Description
LoraConfig loads adapter configuration from HuggingFace (via peft.LoraConfig) and maps weight names to module positions. LoraWeights.load() iterates over model layers, loads the LoRA A and B matrices from safetensors, applies scaling, transposes for kernel compatibility, and pads ranks for SGMV/BGMV kernel requirements.
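The scaling, transpose, and rank-padding steps described above can be sketched as follows. This is a minimal illustration, not the actual LoRAX code: the helper names, the pad multiple of 8, and the tensor layouts are assumptions made for the example.

```python
import math
import torch

def lora_scaling(lora_alpha: int, r: int, use_rslora: bool) -> float:
    # Standard LoRA scales the update by alpha / r; rsLoRA uses alpha / sqrt(r).
    return lora_alpha / math.sqrt(r) if use_rslora else lora_alpha / r

def prepare_layer_weights(lora_a: torch.Tensor, lora_b: torch.Tensor,
                          lora_alpha: int, r: int, use_rslora: bool,
                          pad_multiple: int = 8):
    # Assumed safetensors layouts: lora_a is [r, in_size], lora_b is [out_size, r].
    scaling = lora_scaling(lora_alpha, r, use_rslora)
    # Fold the scaling into B so the forward pass is a plain x @ A_t @ B_t.
    lora_b = lora_b * scaling
    # Transpose to the [in_size, r] / [r, out_size] layout a batched kernel expects.
    a_t = lora_a.T.contiguous()
    b_t = lora_b.T.contiguous()
    # Pad the rank dimension up to a kernel-friendly multiple (8 is illustrative;
    # the real SGMV/BGMV requirements may differ).
    padded_r = math.ceil(r / pad_multiple) * pad_multiple
    if padded_r != r:
        a_t = torch.nn.functional.pad(a_t, (0, padded_r - r))          # pad last dim
        b_t = torch.nn.functional.pad(b_t, (0, 0, 0, padded_r - r))    # pad first dim
    return a_t, b_t
```

For example, with r=12 and pad_multiple=8 the rank dimension is padded to 16; the zero-padded rows and columns contribute nothing to the matmul, so results are unchanged.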
Usage
Called internally when a new adapter is requested. Not called directly by end users. The adapter loading pipeline in the router triggers this via gRPC.
Code Reference
Source Location
- Repository: LoRAX
- File: server/lorax_server/adapters/lora.py
- Lines: 29-210
Signature
@dataclass
class LoraConfig(AdapterConfig):
    r: int
    target_modules: Optional[Union[List[str], str]]
    fan_in_fan_out: bool
    lora_alpha: int
    use_rslora: bool

    @classmethod
    def load(cls, adapter_id: str, api_token: str) -> "LoraConfig":
        """Load LoRA config from HuggingFace Hub."""

class LoraWeights(AdapterWeights):
    def __init__(
        self,
        weights_a: List[torch.Tensor],
        weights_b: List[torch.Tensor],
        adapter_config: LoraConfig,
    ):

    @classmethod
    def load(
        cls,
        config: LoraConfig,
        model: "Model",
        module_map: Dict[str, Dict],
        layer_type: str,
        unused_weight_names: Set[str],
    ) -> Optional[AdapterWeights]:
        """Load LoRA weights for all layers of a given type."""
Import
from lorax_server.adapters.lora import LoraConfig, LoraWeights
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| adapter_id | str | Yes | HuggingFace adapter ID or local path |
| api_token | str | No | Auth token for private adapters |
| config | LoraConfig | Yes | Adapter config (rank, target_modules, alpha) |
| model | Model | Yes | Base model instance for layer mapping |
| module_map | Dict[str, Dict] | Yes | Weight name to tensor mapping |
| layer_type | str | Yes | Layer type to load (e.g., "q_proj") |
| unused_weight_names | Set[str] | Yes | Weight names not yet consumed; names loaded here are removed from it |
Outputs
| Name | Type | Description |
|---|---|---|
| weights | Optional[LoraWeights] | Stacked LoRA A/B tensors, or None if no weights for layer type |
Usage Examples
Internal Adapter Loading
# Called internally during adapter resolution
config = LoraConfig.load("my-org/my-adapter", api_token="hf_xxx")
# config.r = 16, config.lora_alpha = 32, config.target_modules = ["q_proj", "v_proj"]
# Load weights for each layer type
weights = LoraWeights.load(
    config=config,
    model=model,
    module_map=module_map,
    layer_type="q_proj",
    unused_weight_names=unused,
)
# weights.weights_a shape: [num_layers, hidden_size, r]
# weights.weights_b shape: [num_layers, r, hidden_size]
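Given those stacked shapes, the adapter contribution for layer i is x @ weights_a[i] @ weights_b[i], added to the base layer's output. A minimal sketch of that contraction with plain matmuls (the real serving path uses the fused SGMV/BGMV kernels instead):

```python
import torch

num_layers, hidden_size, r = 2, 8, 4
weights_a = torch.randn(num_layers, hidden_size, r)   # stacked LoRA A tensors
weights_b = torch.randn(num_layers, r, hidden_size)   # stacked LoRA B (scaling folded in at load time)
x = torch.randn(3, hidden_size)                       # a batch of hidden states

# Per-layer adapter delta: x @ A_i @ B_i, shape [batch, hidden_size].
delta = x @ weights_a[0] @ weights_b[0]
assert delta.shape == (3, hidden_size)
```

Because matmul is associative, this equals x @ (A_i @ B_i): the low-rank factors stand in for a dense hidden_size x hidden_size update without ever materializing it.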