
Principle:AUTOMATIC1111 Stable diffusion webui Network loading

From Leeroopedia


Knowledge Sources
Domains Stable Diffusion, LoRA, LyCORIS, Model Loading, Low-Rank Adaptation
Last Updated 2026-02-08 00:00 GMT

Overview

Network loading is the process of reading LoRA and LyCORIS network adapter files from disk, matching their weight keys to layers in the active Stable Diffusion model, and constructing typed network module instances that can compute weight modifications at inference time.

Description

LoRA and LyCORIS adapters store learned weight deltas in safetensors (or checkpoint) files. These files contain tensors keyed by a naming convention that encodes the target model layer and the role of each tensor within the low-rank decomposition. The network loading system must:

Read the state dict: Load all tensors from the file into memory using the standard state dict reader.

Resolve key mappings: Network files may use diffusers-style keys (e.g., lora_unet_down_blocks_0_attentions_0_...) or CompVis-style keys (e.g., diffusion_model_input_blocks_1_1_...). A key conversion function translates diffusers names to CompVis names so they can be matched against the model's network_layer_mapping.

Match to model layers: Each converted key is looked up in the model's layer mapping to find the corresponding torch.nn.Module. Special handling exists for SDXL models, MultiheadAttention q/k/v projections, and OFT-style keys.

Create typed modules: The matched weights are passed through a chain of module type factories. Each factory inspects the weight keys to determine if they match its expected pattern (e.g., lora_up/lora_down for standard LoRA, hada_w1_a/hada_w1_b/hada_w2_a/hada_w2_b for LoHa). The first factory that claims the weights creates the corresponding NetworkModule instance.

Handle bundled embeddings: Some network files bundle textual inversion embeddings (keyed under bundle_emb), which are extracted and registered with the embedding database.
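The match-and-create steps above can be sketched as a chain of module factories, each claiming a weight group only when its expected key pattern is present. This is an illustrative sketch; the class names, the `REQUIRED` attribute, and `create_module` are hypothetical, not the webui's actual identifiers.

```python
# Hypothetical sketch of a module-factory chain; names are illustrative,
# not the actual AUTOMATIC1111 webui identifiers.

class NetworkModule:
    def __init__(self, net_key, weights):
        self.net_key = net_key
        self.weights = weights

class LoraModule(NetworkModule):
    # Standard LoRA stores an up- and a down-projection per layer.
    REQUIRED = {"lora_up.weight", "lora_down.weight"}

class LohaModule(NetworkModule):
    # LoHa stores two low-rank factor pairs combined by a Hadamard product.
    REQUIRED = {"hada_w1_a", "hada_w1_b", "hada_w2_a", "hada_w2_b"}

MODULE_FACTORIES = [LoraModule, LohaModule]

def create_module(net_key, weights):
    # The first factory whose required keys are all present claims the weights.
    for cls in MODULE_FACTORIES:
        if cls.REQUIRED <= set(weights):
            return cls(net_key, weights)
    raise ValueError(f"unrecognized weight keys for {net_key}: {sorted(weights)}")
```

The first-match design lets new decomposition types be added by appending a factory, without touching the loader itself.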

Usage

Network loading is triggered whenever the set of active LoRA networks changes, which happens on every generation run if the prompt's <lora:...> tags differ from the previously loaded set. Networks are cached in memory (subject to a configurable limit) to avoid redundant file reads for frequently used networks.
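The reload trigger can be sketched by parsing the prompt's `<lora:...>` tags and comparing the requested set against what is loaded. The regex, `requested_networks`, and `needs_reload` are illustrative assumptions, not the webui's actual parsing code.

```python
import re

# Hypothetical sketch: extract the set of requested LoRA networks from a
# prompt's <lora:name:weight> tags to decide whether a reload is needed.
LORA_TAG = re.compile(r"<lora:([^:>]+)(?::([\d.]+))?>")

def requested_networks(prompt):
    # A tag without an explicit weight defaults to 1.0.
    return {name: float(weight) if weight else 1.0
            for name, weight in LORA_TAG.findall(prompt)}

def needs_reload(prompt, currently_loaded):
    # Reload only when the requested set differs from what is loaded.
    return requested_networks(prompt) != currently_loaded
```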

Theoretical Basis

Low-Rank Adaptation Mathematics

The fundamental idea behind LoRA is to represent a weight update as a low-rank product:

W' = W + DeltaW = W + B * A

where:

  • W is the original pretrained weight matrix of shape (d, k)
  • A is a learned down-projection matrix of shape (r, k)
  • B is a learned up-projection matrix of shape (d, r)
  • r is the rank, typically much smaller than d and k (e.g., r=4 to r=128)

The weight delta is scaled by alpha/r before application, where alpha is a scaling hyperparameter chosen at training time and stored in the network file (when absent, it typically defaults to r, making the scale 1).
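The update above can be shown as a minimal numeric sketch, assuming NumPy; the dimensions and alpha value are arbitrary illustrations.

```python
import numpy as np

# Minimal numeric sketch of the LoRA update W' = W + (alpha / r) * B @ A.
d, k, r = 4, 3, 2                 # full dimensions and low rank
alpha = 1.0                       # scaling factor stored with the weights
rng = np.random.default_rng(0)

W = rng.standard_normal((d, k))   # frozen pretrained weight, shape (d, k)
A = rng.standard_normal((r, k))   # learned down-projection, shape (r, k)
B = rng.standard_normal((d, r))   # learned up-projection, shape (d, r)

# The delta has the full (d, k) shape but only r * (d + k) trained values.
delta = (alpha / r) * (B @ A)
W_adapted = W + delta
```

The parameter saving is the point: for d = k = 4096 and r = 8, B and A together hold roughly 65k values versus 16.7M in W.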

Network Module Types

The system supports multiple low-rank decomposition architectures:

| Type | Key Pattern | Mathematics | Description |
|------|-------------|-------------|-------------|
| LoRA | lora_up, lora_down | DeltaW = up * down * scale | Standard low-rank adaptation |
| LoHa | hada_w1_a, hada_w1_b, hada_w2_a, hada_w2_b | DeltaW = (w1_a * w1_b) ⊙ (w2_a * w2_b) * scale | Hadamard (element-wise) product of two low-rank matrices |
| LoKr | lokr_w1, lokr_w2 (optional _a, _b) | DeltaW = kron(w1, w2) * scale | Kronecker product decomposition |
| IA3 | weight | DeltaW = diag(weight) | Learned element-wise rescaling |
| GLoRA | a1, a2, b1, b2, alpha, dora_scale | DeltaW = (a1 * weight + a2) * (b1 + b2) | Generalized LoRA |
| OFT | oft_blocks, oft_diag | Applies orthogonal finetuning | Orthogonal finetuning transformation |
| Full | diff, diff_b | DeltaW = diff | Full-rank weight difference |
| Norm | w_norm, b_norm | Modifies normalization layers | Normalization parameter adaptation |
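The LoHa row of the table above can be illustrated numerically. This is a sketch assuming NumPy, with illustrative names; `loha_delta` is not a webui function.

```python
import numpy as np

# Sketch of the LoHa delta: the Hadamard (element-wise) product of two
# low-rank factorizations, as in the table above.
def loha_delta(w1_a, w1_b, w2_a, w2_b, scale=1.0):
    # Each factor pair forms a (d, k) matrix of rank at most r;
    # "*" between the two products is element-wise.
    return (w1_a @ w1_b) * (w2_a @ w2_b) * scale

d, k, r = 4, 3, 2
rng = np.random.default_rng(1)
delta = loha_delta(rng.standard_normal((d, r)), rng.standard_normal((r, k)),
                   rng.standard_normal((d, r)), rng.standard_normal((r, k)))
```

A relevant property: the Hadamard product of two rank-r matrices can have rank up to r², which is why LoHa can express higher-effective-rank updates than plain LoRA at a comparable parameter count.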

Key Conversion

The diffusers-to-CompVis key conversion handles the structural difference between Hugging Face diffusers naming (using down_blocks, up_blocks, mid_block, attentions, resnets) and CompVis naming (using input_blocks, output_blocks, middle_block with numeric indices). The conversion computes block indices from the diffusers naming components:

diffusers: lora_unet_down_blocks_{block}_{type}_{layer}_{suffix}
compvis:   diffusion_model_input_blocks_{1 + block*3 + layer}_{1 if attention else 0}_{suffix}
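The index arithmetic shown above can be sketched for the down-blocks case; this toy function handles only that one block family and its names are illustrative, not the webui's full conversion table.

```python
import re

# Sketch of the diffusers-to-CompVis index arithmetic for down blocks only.
DOWN_BLOCK = re.compile(
    r"lora_unet_down_blocks_(\d+)_(attentions|resnets)_(\d+)_(.+)")

def diffusers_to_compvis(key):
    m = DOWN_BLOCK.match(key)
    if m is None:
        return key  # mid/up block families omitted in this sketch
    block, kind, layer, suffix = m.groups()
    # input_blocks index per the formula: 1 + block*3 + layer
    index = 1 + int(block) * 3 + int(layer)
    sub = 1 if kind == "attentions" else 0
    return f"diffusion_model_input_blocks_{index}_{sub}_{suffix}"
```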

Caching Strategy

Networks are cached in an ordered dictionary (networks_in_memory) with a configurable size limit. When the cache exceeds its limit, the oldest entries are evicted. File modification time is checked against cached network mtime to detect changes.
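The caching behavior described above can be sketched with an ordered dictionary; the `NetworkCache` class, its limit, and the `loader` callback are illustrative assumptions, not the webui's actual implementation.

```python
import os
from collections import OrderedDict

# Hypothetical sketch of an mtime-checked, size-limited network cache.
class NetworkCache:
    def __init__(self, limit=4):
        self.limit = limit
        self.entries = OrderedDict()  # name -> (mtime, network)

    def get(self, name, path, loader):
        mtime = os.path.getmtime(path)
        cached = self.entries.get(name)
        if cached is not None and cached[0] == mtime:
            self.entries.move_to_end(name)    # mark as most recently used
            return cached[1]
        network = loader(path)                # missing or stale: reload
        self.entries[name] = (mtime, network)
        self.entries.move_to_end(name)
        while len(self.entries) > self.limit:
            self.entries.popitem(last=False)  # evict the oldest entry
        return network
```

Comparing the file's current mtime against the cached one means an adapter retrained in place is picked up automatically, without restarting the application.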
