Implementation:AUTOMATIC1111 Stable diffusion webui Hypernetwork and HypernetworkModule

Knowledge Sources	stable-diffusion-webui
Domains	Deep Learning, Stable Diffusion, Cross-Attention Modification
Last Updated	2026-02-08 00:00 GMT

Overview

This page documents the concrete implementation of the hypernetwork architecture for cross-attention modification in Stable Diffusion, as provided by the AUTOMATIC1111 stable-diffusion-webui repository. The Hypernetwork class manages one pair of MLP modules per attention dimension, while HypernetworkModule implements the individual residual MLP that transforms the context tensor used for K or V.

Description

The Hypernetwork class is the top-level container. It holds a dictionary of paired HypernetworkModule instances keyed by attention dimension (e.g., 320, 640, 768, 1280). Each pair consists of one module that transforms the context tensor used for K and one that transforms the context tensor used for V. The class provides methods for saving and loading state, switching between train and eval modes, and setting the multiplier strength.
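The container layout described above can be illustrated with a minimal, dependency-free sketch. StubModule, build_layers, and find_pair are hypothetical names introduced only for this illustration; they are not part of the webui code, and the real class stores HypernetworkModule instances rather than stubs.

```python
from typing import Dict, Optional, Tuple


class StubModule:
    """Hypothetical stand-in for HypernetworkModule; it only records its dim."""

    def __init__(self, dim: int) -> None:
        self.dim = dim


def build_layers(enable_sizes) -> Dict[int, Tuple[StubModule, StubModule]]:
    # One (K, V) pair per attention dimension, keyed by that dimension.
    return {size: (StubModule(size), StubModule(size)) for size in enable_sizes or []}


def find_pair(layers, context_dim: int) -> Optional[Tuple[StubModule, StubModule]]:
    # Assumption: a context width with no entry simply has no pair to apply.
    return layers.get(context_dim)


layers = build_layers([320, 640, 768, 1280])
k_module, v_module = find_pair(layers, 768)
```

Keeping K and V as two separate objects in each pair means the two context paths can be trained independently, which matches the "paired modules" wording above.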

The HypernetworkModule class (a torch.nn.Module subclass) builds a sequential MLP from a configurable layer structure, applies the selected activation function between layers, and computes the residual forward pass: x + linear(x) * multiplier.
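As a rough illustration of how a multiplier sequence could translate into linear layer shapes, the sketch below scales dim by each entry of layer_structure and pairs consecutive widths. The helper name linear_shapes is hypothetical, and truncating with int() is an assumption made for this example rather than a statement about the repository code.

```python
def linear_shapes(dim, layer_structure):
    """Return (in_features, out_features) for each linear layer in the MLP."""
    assert layer_structure is not None, "layer_structure must not be None"
    assert layer_structure[0] == 1, "multiplier sequence must start with 1"
    assert layer_structure[-1] == 1, "multiplier sequence must end with 1"
    # Assumption: each width is dim scaled by the multiplier, truncated to int.
    widths = [int(dim * m) for m in layer_structure]
    # Consecutive widths form the input/output sizes of each linear layer.
    return list(zip(widths[:-1], widths[1:]))


shapes = linear_shapes(768, [1, 2, 1])
deeper = linear_shapes(320, [1, 1.5, 1.5, 1])
```

Because the sequence starts and ends with 1, the first layer always takes dim inputs and the last layer always produces dim outputs, which is what makes the residual addition x + linear(x) shape-compatible.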

Usage

Import and instantiate these classes when creating, loading, or training hypernetworks for Stable Diffusion. The Hypernetwork class is the primary interface for the training pipeline and inference hooks.

Code Reference

Source Location

Repository: stable-diffusion-webui
File: modules/hypernetworks/hypernetwork.py
Lines: L25-126 (HypernetworkModule), L144-309 (Hypernetwork)

Signature

class HypernetworkModule(torch.nn.Module):
    def __init__(self, dim, state_dict=None, layer_structure=None, activation_func=None,
                 weight_init='Normal', add_layer_norm=False, activate_output=False,
                 dropout_structure=None):

class Hypernetwork:
    def __init__(self, name=None, enable_sizes=None, layer_structure=None,
                 activation_func=None, weight_init=None, add_layer_norm=False,
                 use_dropout=False, activate_output=False, **kwargs):

Import

from modules.hypernetworks.hypernetwork import Hypernetwork, HypernetworkModule

I/O Contract

Inputs (HypernetworkModule.__init__)

Name	Type	Required	Description
dim	int	Yes	Input/output dimension of the attention context (e.g., 768)
state_dict	dict or None	No	Pre-trained state dict to load; if None, weights are initialized from scratch
layer_structure	list[float]	Yes	Multiplier sequence for layer widths (e.g., [1, 2, 1]); must start and end with 1. The signature default of None is not a usable value, so a list must be passed
activation_func	str or None	No	Activation function name: "linear", "relu", "leakyrelu", "elu", "swish", "tanh", "sigmoid", or the name of another torch.nn activation
weight_init	str	No	Weight initialization strategy: "Normal", "XavierUniform", "XavierNormal", "KaimingUniform", "KaimingNormal" (default: "Normal")
add_layer_norm	bool	No	Whether to add LayerNorm after each linear layer (default: False)
activate_output	bool	No	Whether to apply activation function to the last layer (default: False)
dropout_structure	list[float] or None	No	Per-layer dropout probabilities; values must be in [0, 1)

Inputs (Hypernetwork.__init__)

Name	Type	Required	Description
name	str or None	No	Name identifier for the hypernetwork
enable_sizes	list[int] or None	No	List of attention dimensions to create modules for (e.g., [320, 640, 768, 1280])
layer_structure	list[float] or None	No	Multiplier sequence passed to each HypernetworkModule
activation_func	str or None	No	Activation function name passed to each HypernetworkModule
weight_init	str or None	No	Weight initialization strategy passed to each HypernetworkModule
add_layer_norm	bool	No	Whether to add LayerNorm (default: False)
use_dropout	bool	No	Whether to enable dropout (default: False)
activate_output	bool	No	Whether to activate the last layer (default: False)
**kwargs	dict	No	Additional options: last_layer_dropout (bool), dropout_structure (list)

Outputs

Name	Type	Description
HypernetworkModule.forward(x)	Tensor	Transformed tensor: x + linear(x) * multiplier, same shape as input
Hypernetwork.weights()	list[Parameter]	All trainable parameters across all paired modules
Hypernetwork.layers	dict[int, tuple[HypernetworkModule, HypernetworkModule]]	Dictionary mapping dimension to (K_module, V_module) pair
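The effect of the multiplier in the forward output can be seen with plain Python lists standing in for tensors. This is only a sketch of the arithmetic x + linear(x) * multiplier: the residual helper is hypothetical, and delta plays the role of the MLP output linear(x), which is supplied by hand here.

```python
def residual(x, delta, multiplier=1.0):
    """Elementwise x + delta * multiplier, with delta standing in for linear(x)."""
    return [xi + di * multiplier for xi, di in zip(x, delta)]


x = [1.0, -2.0, 0.5]
delta = [0.5, 0.5, -1.0]  # hand-picked stand-in for the MLP output

unchanged = residual(x, delta, multiplier=0.0)  # hypernetwork has no effect
half = residual(x, delta, multiplier=0.5)       # half-strength modification
full = residual(x, delta, multiplier=1.0)       # full-strength modification
```

A multiplier of 0 returns the input unchanged, so the residual form lets the strength of the hypernetwork be dialed down smoothly to the unmodified model.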

Usage Examples

Creating a New Hypernetwork

from modules.hypernetworks.hypernetwork import Hypernetwork

# Create a hypernetwork with modules for standard SD attention dimensions
hypernet = Hypernetwork(
    name="my_style",
    enable_sizes=[320, 640, 768, 1280],
    layer_structure=[1, 2, 1],
    activation_func="relu",
    weight_init="Normal",
    add_layer_norm=False,
    use_dropout=False,
    activate_output=False,
)

# Access the paired modules for dimension 768
k_module, v_module = hypernet.layers[768]

HypernetworkModule Forward Pass

import torch
from modules.hypernetworks.hypernetwork import HypernetworkModule

# Create a single module for dimension 768
module = HypernetworkModule(
    dim=768,
    layer_structure=[1, 2, 1],
    activation_func="relu",
    weight_init="Normal",
)

# Forward pass applies residual: x + linear(x) * multiplier
context = torch.randn(1, 77, 768)
transformed = module(context)  # shape: (1, 77, 768)

Loading an Existing Hypernetwork

from modules.hypernetworks.hypernetwork import Hypernetwork

hypernet = Hypernetwork()
hypernet.load("/path/to/my_style.pt")
# hypernet.layers now contains the loaded modules
# hypernet.name, hypernet.step, etc. are restored from the checkpoint

Related Pages

Implements Principle

Principle:AUTOMATIC1111_Stable_diffusion_webui_Hypernetwork_architecture
