Implementation:AUTOMATIC1111 Stable diffusion webui Hypernetwork and HypernetworkModule
| Knowledge Sources | |
|---|---|
| Domains | Deep Learning, Stable Diffusion, Cross-Attention Modification |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete implementation of hypernetwork architecture for cross-attention modification in Stable Diffusion, provided by the AUTOMATIC1111 stable-diffusion-webui repository. The Hypernetwork class manages paired MLP modules per attention dimension, while HypernetworkModule implements the individual residual MLP that transforms K or V context tensors.
Description
The Hypernetwork class is the top-level container. It holds a dictionary of paired HypernetworkModule instances keyed by attention dimension (e.g., 320, 640, 768, 1280). In each pair, one module transforms the context tensor used for the K projection and the other transforms the context tensor used for the V projection. The class also provides methods for saving and loading state, switching between train and eval modes, and setting the multiplier strength.
The HypernetworkModule class (a torch.nn.Module subclass) builds a sequential MLP from a configurable layer structure, applies the selected activation function between layers, and computes the residual forward pass: x + linear(x) * multiplier.
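The residual computation described above can be illustrated without PyTorch. The following is a minimal, dependency-free sketch in plain Python, not the repository's code: the helper names (`make_linear`, `apply_linear`, `residual_forward`), the small-normal weight initialization, and the fixed ReLU activation are assumptions chosen for illustration.

```python
import random
from typing import List, Tuple

Vector = List[float]
Layer = Tuple[List[List[float]], List[float]]  # (weight rows, bias)


def make_linear(n_in: int, n_out: int, rng: random.Random, std: float = 0.01) -> Layer:
    """Create a dense layer with small normal weights and zero bias (assumed init)."""
    weight = [[rng.gauss(0.0, std) for _ in range(n_in)] for _ in range(n_out)]
    return weight, [0.0] * n_out


def apply_linear(layer: Layer, x: Vector) -> Vector:
    """Compute weight @ x + bias for a single input vector."""
    weight, bias = layer
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def residual_forward(layers: List[Layer], x: Vector, multiplier: float = 1.0) -> Vector:
    """Compute x + mlp(x) * multiplier, with ReLU between (not after) the layers."""
    h = x
    for i, layer in enumerate(layers):
        h = apply_linear(layer, h)
        if i < len(layers) - 1:  # no activation on the output layer
            h = [max(0.0, v) for v in h]
    return [xi + hi * multiplier for xi, hi in zip(x, h)]


rng = random.Random(0)
dim = 4
# A layer_structure of [1, 2, 1] gives widths 4 -> 8 -> 4 for dim = 4.
layers = [make_linear(dim, 2 * dim, rng), make_linear(2 * dim, dim, rng)]
x = [1.0, -2.0, 0.5, 3.0]

unchanged = residual_forward(layers, x, multiplier=0.0)  # residual branch switched off
full = residual_forward(layers, x, multiplier=1.0)
half = residual_forward(layers, x, multiplier=0.5)
```

Because the MLP output is only added to the input, a multiplier of 0 returns the input unchanged, and intermediate values scale the learned offset linearly.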
Usage
Import and instantiate these classes when creating, loading, or training hypernetworks for Stable Diffusion. The Hypernetwork class is the primary interface for the training pipeline and inference hooks.
Code Reference
Source Location
- Repository: stable-diffusion-webui
- File: modules/hypernetworks/hypernetwork.py
- Lines: L25-126 (HypernetworkModule), L144-309 (Hypernetwork)
Signature
class HypernetworkModule(torch.nn.Module):
    def __init__(self, dim, state_dict=None, layer_structure=None, activation_func=None,
                 weight_init='Normal', add_layer_norm=False, activate_output=False,
                 dropout_structure=None):

class Hypernetwork:
    def __init__(self, name=None, enable_sizes=None, layer_structure=None,
                 activation_func=None, weight_init=None, add_layer_norm=False,
                 use_dropout=False, activate_output=False, **kwargs):
Import
from modules.hypernetworks.hypernetwork import Hypernetwork, HypernetworkModule
I/O Contract
Inputs (HypernetworkModule.__init__)
| Name | Type | Required | Description |
|---|---|---|---|
| dim | int | Yes | Input/output dimension of the attention context (e.g., 768) |
| state_dict | dict or None | No | Pre-trained state dict to load; if None, weights are initialized from scratch |
| layer_structure | list[float] | Yes | Multiplier sequence for layer widths (e.g., [1, 2, 1]); must start and end with 1 |
| activation_func | str or None | No | Activation function name: "linear", "relu", "leakyrelu", "elu", "swish", "tanh", "sigmoid", or any torch.nn activation |
| weight_init | str | No | Weight initialization strategy: "Normal", "XavierUniform", "XavierNormal", "KaimingUniform", "KaimingNormal" (default: "Normal") |
| add_layer_norm | bool | No | Whether to add LayerNorm after each linear layer (default: False) |
| activate_output | bool | No | Whether to apply activation function to the last layer (default: False) |
| dropout_structure | list[float] or None | No | Per-layer dropout probabilities; values must be in [0, 1) |
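
To make the interaction between these parameters concrete, here is a hedged sketch that turns them into an ordered list of layer descriptors instead of real torch.nn layers. The function name `plan_layers`, the descriptor tuples, the ordering of activation, LayerNorm, and dropout after each linear layer, and the indexing of `dropout_structure` (one entry per width, with entry i + 1 applied after the i-th linear layer) are all assumptions for illustration, not confirmed details of the repository.

```python
from typing import List, Optional, Sequence, Tuple

LayerSpec = Tuple  # e.g. ("Linear", 768, 1536) or ("Dropout", 0.3)


def plan_layers(
    dim: int,
    layer_structure: Sequence[float],
    activation_func: Optional[str] = None,
    add_layer_norm: bool = False,
    activate_output: bool = False,
    dropout_structure: Optional[Sequence[float]] = None,
) -> List[LayerSpec]:
    """Return an ordered list of layer descriptors for one module (illustrative only)."""
    if layer_structure[0] != 1 or layer_structure[-1] != 1:
        raise ValueError("layer_structure must start and end with 1")

    # Each multiplier scales the base dimension to give a layer width.
    widths = [int(dim * m) for m in layer_structure]
    n_linear = len(widths) - 1
    plan: List[LayerSpec] = []
    for i in range(n_linear):
        plan.append(("Linear", widths[i], widths[i + 1]))

        # The activation sits between layers; the last layer only gets one
        # when activate_output is set.
        is_last = i == n_linear - 1
        if activation_func not in (None, "linear") and (not is_last or activate_output):
            plan.append(("Activation", activation_func))

        if add_layer_norm:
            plan.append(("LayerNorm", widths[i + 1]))

        # Assumption: entry i + 1 of dropout_structure follows the i-th linear layer.
        if dropout_structure is not None:
            p = dropout_structure[i + 1]
            if not 0 <= p < 1:
                raise ValueError("dropout probabilities must be in [0, 1)")
            if p > 0:
                plan.append(("Dropout", p))
    return plan


basic_plan = plan_layers(768, [1, 2, 1], activation_func="relu")
full_plan = plan_layers(
    768, [1, 2, 1], activation_func="relu",
    add_layer_norm=True, dropout_structure=[0, 0.3, 0],
)
```

With dim=768 and layer_structure=[1, 2, 1], the widths are 768, 1536, and 768, so the module expands the context to twice its size and projects it back.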
Inputs (Hypernetwork.__init__)
| Name | Type | Required | Description |
|---|---|---|---|
| name | str or None | No | Name identifier for the hypernetwork |
| enable_sizes | list[int] or None | No | List of attention dimensions to create modules for (e.g., [320, 640, 768, 1280]) |
| layer_structure | list[float] or None | No | Multiplier sequence passed to each HypernetworkModule |
| activation_func | str or None | No | Activation function name passed to each HypernetworkModule |
| weight_init | str or None | No | Weight initialization strategy passed to each HypernetworkModule |
| add_layer_norm | bool | No | Whether to add LayerNorm (default: False) |
| use_dropout | bool | No | Whether to enable dropout (default: False) |
| activate_output | bool | No | Whether to activate the last layer (default: False) |
| **kwargs | dict | No | Additional options: last_layer_dropout (bool), dropout_structure (list) |
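
The table does not say how the boolean use_dropout and last_layer_dropout options relate to a per-layer dropout_structure list. One plausible mapping is sketched below. The helper `expand_dropout` is hypothetical, and both the probability `p=0.3` and the rule that the input and output widths never receive dropout are assumptions, not values taken from this page.

```python
from typing import List, Sequence


def expand_dropout(
    layer_structure: Sequence[float],
    use_dropout: bool,
    last_layer_dropout: bool = False,
    p: float = 0.3,  # assumed probability, not taken from the source
) -> List[float]:
    """Expand boolean dropout flags into one probability per layer width."""
    n = len(layer_structure)
    if not use_dropout or n < 3:
        return [0.0] * n
    # First entry (input width) never gets dropout in this sketch.
    values = [0.0] + [p] * (n - 3)
    # The entry before the output is controlled by last_layer_dropout.
    values.append(p if last_layer_dropout else 0.0)
    # Final entry (output width) never gets dropout either.
    values.append(0.0)
    return values


no_dropout = expand_dropout([1, 2, 1], use_dropout=False)
hidden_only = expand_dropout([1, 2, 2, 1], use_dropout=True)
with_last = expand_dropout([1, 2, 2, 1], use_dropout=True, last_layer_dropout=True)
```

Under these assumptions, a [1, 2, 2, 1] structure gets dropout after the first hidden layer only, unless last_layer_dropout is enabled.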
Outputs
| Name | Type | Description |
|---|---|---|
| HypernetworkModule.forward(x) | Tensor | Transformed tensor: x + self.linear(x) * multiplier, same shape as input |
| Hypernetwork.weights() | list[Parameter] | All trainable parameters across all paired modules |
| Hypernetwork.layers | dict[int, tuple[HypernetworkModule, HypernetworkModule]] | Dictionary mapping dimension to (K_module, V_module) pair |
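
The layers dictionary can be pictured as a lookup from context width to a (K, V) pair of transforms. The sketch below uses plain Python callables in place of HypernetworkModule instances. The function `apply_pair`, the stand-in factory `make_residual`, and the pass-through behaviour when no pair matches the context width are assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Transform = Callable[[Vector], Vector]


def apply_pair(
    layers: Dict[int, Tuple[Transform, Transform]], context: Vector
) -> Tuple[Vector, Vector]:
    """Return (context_k, context_v) for one token vector."""
    pair = layers.get(len(context))
    if pair is None:
        # Assumption: with no matching dimension the context passes through.
        return context, context
    k_module, v_module = pair
    return k_module(context), v_module(context)


def make_residual(scale: float) -> Transform:
    """Hypothetical stand-in for a trained module: adds a scaled copy of x."""
    return lambda x: [xi + scale * xi for xi in x]


# One (K, V) pair registered for width 4, mirroring dict[int, tuple[...]].
layers = {4: (make_residual(0.5), make_residual(-0.5))}

context_k, context_v = apply_pair(layers, [1.0, 2.0, 3.0, 4.0])
passthrough_k, passthrough_v = apply_pair(layers, [1.0, 2.0])
```

Keeping separate modules for K and V lets the two contexts diverge, which a single shared transform could not express.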
Usage Examples
Creating a New Hypernetwork
from modules.hypernetworks.hypernetwork import Hypernetwork
# Create a hypernetwork with modules for standard SD attention dimensions
hypernet = Hypernetwork(
    name="my_style",
    enable_sizes=[320, 640, 768, 1280],
    layer_structure=[1, 2, 1],
    activation_func="relu",
    weight_init="Normal",
    add_layer_norm=False,
    use_dropout=False,
    activate_output=False,
)
# Access the paired modules for dimension 768
k_module, v_module = hypernet.layers[768]
HypernetworkModule Forward Pass
import torch
from modules.hypernetworks.hypernetwork import HypernetworkModule
# Create a single module for dimension 768
module = HypernetworkModule(
    dim=768,
    layer_structure=[1, 2, 1],
    activation_func="relu",
    weight_init="Normal",
)
# Forward pass applies residual: x + linear(x) * multiplier
context = torch.randn(1, 77, 768)
transformed = module(context) # shape: (1, 77, 768)
Loading an Existing Hypernetwork
from modules.hypernetworks.hypernetwork import Hypernetwork
hypernet = Hypernetwork()
hypernet.load("/path/to/my_style.pt")
# hypernet.layers now contains the loaded modules
# hypernet.name, hypernet.step, etc. are restored from the checkpoint