Implementation:AUTOMATIC1111 Stable diffusion webui Hypernetwork and HypernetworkModule

From Leeroopedia


Knowledge Sources
Domains: Deep Learning, Stable Diffusion, Cross-Attention Modification
Last Updated: 2026-02-08 00:00 GMT

Overview

A concrete implementation of the hypernetwork architecture for cross-attention modification in Stable Diffusion, provided by the AUTOMATIC1111 stable-diffusion-webui repository. The Hypernetwork class manages one pair of MLP modules per attention dimension, while HypernetworkModule implements the individual residual MLP that transforms the K or V context tensor.

Description

The Hypernetwork class is the top-level container. It holds a dictionary of paired HypernetworkModule instances keyed by attention dimension (e.g., 320, 640, 768, 1280). Each pair consists of one module that transforms the context tensor used for K and one that transforms the context tensor used for V. The class provides methods for saving and loading state, switching between train and eval modes, and setting the multiplier strength.
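The dimension-keyed lookup described above can be sketched in plain Python. This is an illustrative sketch only, not the repository's code: the `apply_pair` helper and the `scale_by` stand-ins for the two modules are hypothetical, contexts are reduced to flat lists whose length plays the role of the attention dimension, and passing unmatched contexts through unchanged is an assumption of the sketch.

```python
# Hypothetical stand-in for a module: scales every element of the context.
# The real dictionary entries are HypernetworkModule instances.
def scale_by(factor):
    return lambda ctx: [v * factor for v in ctx]

# Mirrors the shape of Hypernetwork.layers: dimension -> (K_module, V_module)
layers = {
    4: (scale_by(2.0), scale_by(3.0)),
}

def apply_pair(layers, context_k, context_v):
    """Transform the K and V contexts with the pair registered for their width.

    Contexts whose width has no registered pair are returned unchanged
    (an assumption of this sketch).
    """
    pair = layers.get(len(context_k))
    if pair is None:
        return context_k, context_v
    k_module, v_module = pair
    return k_module(context_k), v_module(context_v)

# Width 4 has a registered pair, so both contexts are transformed.
k_out, v_out = apply_pair(layers, [1.0] * 4, [1.0] * 4)

# Width 2 has no pair, so the contexts pass through untouched.
k_same, v_same = apply_pair(layers, [1.0, 1.0], [1.0, 1.0])
```

Keeping K and V in separate modules lets each side learn its own adjustment while sharing one lookup key.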

The HypernetworkModule class (a torch.nn.Module subclass) builds a sequential MLP from a configurable layer structure, inserts the selected activation function between layers, and computes a residual forward pass: x + self.linear(x) * multiplier, where self.linear is the sequential MLP.
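To make the layer structure and the residual formula concrete, here is a minimal pure-Python sketch. The helper names `layer_widths` and `residual` are hypothetical and not part of the repository, and the MLP output is a made-up list rather than the result of real linear layers.

```python
def layer_widths(dim, layer_structure):
    """Return (in_features, out_features) for each linear layer.

    Each entry of layer_structure is a multiplier applied to dim.
    """
    assert layer_structure[0] == 1, "multiplier sequence should start with 1"
    assert layer_structure[-1] == 1, "multiplier sequence should end with 1"
    sizes = [int(dim * m) for m in layer_structure]
    return list(zip(sizes[:-1], sizes[1:]))

def residual(x, mlp_out, multiplier=1.0):
    """Element-wise x + mlp_out * multiplier."""
    return [a + b * multiplier for a, b in zip(x, mlp_out)]

# [1, 2, 1] on dim 768 expands to 1536 and projects back to 768.
widths = layer_widths(768, [1, 2, 1])

x = [1.0, 2.0]
mlp_out = [0.5, -0.5]  # pretend output of the MLP for x

full = residual(x, mlp_out, multiplier=1.0)  # full-strength adjustment
off = residual(x, mlp_out, multiplier=0.0)   # multiplier 0 leaves x unchanged
```

Because the sequence starts and ends with 1, the output has the same width as the input, which is what makes the residual addition well defined.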

Usage

Import and instantiate these classes when creating, loading, or training hypernetworks for Stable Diffusion. The Hypernetwork class is the primary interface for the training pipeline and inference hooks.

Code Reference

Source Location

  • Repository: stable-diffusion-webui
  • File: modules/hypernetworks/hypernetwork.py
  • Lines: L25-126 (HypernetworkModule), L144-309 (Hypernetwork)

Signature

class HypernetworkModule(torch.nn.Module):
    def __init__(self, dim, state_dict=None, layer_structure=None, activation_func=None,
                 weight_init='Normal', add_layer_norm=False, activate_output=False,
                 dropout_structure=None):

class Hypernetwork:
    def __init__(self, name=None, enable_sizes=None, layer_structure=None,
                 activation_func=None, weight_init=None, add_layer_norm=False,
                 use_dropout=False, activate_output=False, **kwargs):

Import

from modules.hypernetworks.hypernetwork import Hypernetwork, HypernetworkModule

I/O Contract

Inputs (HypernetworkModule.__init__)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| dim | int | Yes | Input/output dimension of the attention context (e.g., 768) |
| state_dict | dict or None | No | Pre-trained state dict to load; if None, weights are initialized from scratch |
| layer_structure | list[float] | Yes (the signature defaults to None, but a value must be supplied) | Multiplier sequence for layer widths (e.g., [1, 2, 1]); must start and end with 1 |
| activation_func | str or None | No | Activation function name: "linear", "relu", "leakyrelu", "elu", "swish", "tanh", "sigmoid", or any torch.nn activation |
| weight_init | str | No | Weight initialization strategy: "Normal", "XavierUniform", "XavierNormal", "KaimingUniform", "KaimingNormal" (default: "Normal") |
| add_layer_norm | bool | No | Whether to add LayerNorm after each linear layer (default: False) |
| activate_output | bool | No | Whether to apply the activation function after the last layer (default: False) |
| dropout_structure | list[float] or None | No | Per-layer dropout probabilities; values must be in [0, 1) |
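The constraints listed in this table can be checked before a module is constructed. The sketch below is a hypothetical pre-flight validator, not a function from the repository. It assumes that dropout_structure has one entry per entry of layer_structure, which the table does not state explicitly.

```python
def validate_structures(layer_structure, dropout_structure=None):
    """Raise ValueError if the arguments break the documented constraints."""
    if not layer_structure:
        raise ValueError("layer_structure must be provided")
    if layer_structure[0] != 1 or layer_structure[-1] != 1:
        raise ValueError("layer_structure must start and end with 1")
    if dropout_structure is not None:
        # Assumption of this sketch: one dropout value per structure entry.
        if len(dropout_structure) != len(layer_structure):
            raise ValueError("dropout_structure length must match layer_structure")
        for p in dropout_structure:
            if not 0 <= p < 1:
                raise ValueError(f"dropout probability {p} is outside [0, 1)")
    return True

def is_valid(layer_structure, dropout_structure=None):
    """Boolean wrapper around validate_structures."""
    try:
        return validate_structures(layer_structure, dropout_structure)
    except ValueError:
        return False

ok = is_valid([1, 2, 1], [0, 0.3, 0])
bad_ends = is_valid([2, 1])                  # does not start with 1
bad_dropout = is_valid([1, 2, 1], [0, 1.0, 0])  # 1.0 is outside [0, 1)
```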

Inputs (Hypernetwork.__init__)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| name | str or None | No | Name identifier for the hypernetwork |
| enable_sizes | list[int] or None | No | List of attention dimensions to create modules for (e.g., [320, 640, 768, 1280]) |
| layer_structure | list[float] or None | No | Multiplier sequence passed to each HypernetworkModule |
| activation_func | str or None | No | Activation function name passed to each HypernetworkModule |
| weight_init | str or None | No | Weight initialization strategy passed to each HypernetworkModule |
| add_layer_norm | bool | No | Whether to add LayerNorm (default: False) |
| use_dropout | bool | No | Whether to enable dropout (default: False) |
| activate_output | bool | No | Whether to activate the last layer (default: False) |
| **kwargs | dict | No | Additional options: last_layer_dropout (bool), dropout_structure (list) |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| HypernetworkModule.forward(x) | Tensor | Transformed tensor: x + self.linear(x) * multiplier, same shape as input |
| Hypernetwork.weights() | list[Parameter] | All trainable parameters across all paired modules |
| Hypernetwork.layers | dict[int, tuple[HypernetworkModule, HypernetworkModule]] | Dictionary mapping dimension to (K_module, V_module) pair |
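For a rough sense of how many scalar values the paired modules hold in total, the following sketch estimates the count from the configuration alone. The helper names are hypothetical. It assumes every linear layer carries a weight matrix and a bias vector and that add_layer_norm is False, so it is an estimate rather than a reading of the actual parameter list.

```python
def module_parameter_count(dim, layer_structure):
    """Weights plus biases of the linear layers in one module."""
    sizes = [int(dim * m) for m in layer_structure]
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

def hypernetwork_parameter_count(enable_sizes, layer_structure):
    """Two modules (K and V) per enabled attention dimension."""
    return sum(2 * module_parameter_count(d, layer_structure) for d in enable_sizes)

per_module = module_parameter_count(768, [1, 2, 1])
total = hypernetwork_parameter_count([320, 640, 768, 1280], [1, 2, 1])
```

The count grows quadratically with the attention dimension, so the largest enabled size dominates the total.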

Usage Examples

Creating a New Hypernetwork

from modules.hypernetworks.hypernetwork import Hypernetwork

# Create a hypernetwork with modules for standard SD attention dimensions
hypernet = Hypernetwork(
    name="my_style",
    enable_sizes=[320, 640, 768, 1280],
    layer_structure=[1, 2, 1],
    activation_func="relu",
    weight_init="Normal",
    add_layer_norm=False,
    use_dropout=False,
    activate_output=False,
)

# Access the paired modules for dimension 768
k_module, v_module = hypernet.layers[768]

HypernetworkModule Forward Pass

import torch
from modules.hypernetworks.hypernetwork import HypernetworkModule

# Create a single module for dimension 768
module = HypernetworkModule(
    dim=768,
    layer_structure=[1, 2, 1],
    activation_func="relu",
    weight_init="Normal",
)

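# Forward pass applies the residual: x + linear(x) * multiplier
# Create the input on the module's device, since the weights may live on a GPU
device = next(module.parameters()).device
context = torch.randn(1, 77, 768, device=device)
transformed = module(context)  # shape: (1, 77, 768)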

Loading an Existing Hypernetwork

from modules.hypernetworks.hypernetwork import Hypernetwork

hypernet = Hypernetwork()
hypernet.load("/path/to/my_style.pt")
# hypernet.layers now contains the loaded modules
# hypernet.name, hypernet.step, etc. are restored from the checkpoint

Related Pages

Implements Principle
