

Implementation:AUTOMATIC1111 Stable diffusion webui Hypernetwork save and apply

From Leeroopedia
Revision as of 14:03, 16 February 2026 by Admin (talk | contribs) (Auto-imported from implementations/AUTOMATIC1111_Stable_diffusion_webui_Hypernetwork_save_and_apply.md)


Knowledge Sources
Domains: Deep Learning, Stable Diffusion, Model Deployment
Last Updated: 2026-02-08 00:00 GMT

Overview

Concrete implementation of hypernetwork saving, loading, and inference-time application for Stable Diffusion, provided by the AUTOMATIC1111 stable-diffusion-webui repository. This covers the Hypernetwork.save() and Hypernetwork.load() methods for checkpoint persistence, and the apply_hypernetworks() function that intercepts cross-attention to apply learned transformations during image generation.

Description

Hypernetwork.save(filename) serializes the full hypernetwork state to a .pt file. It stores paired module state dictionaries keyed by attention dimension, along with all configuration metadata (layer structure, activation function, weight init, dropout, etc.) and training progress (step count, checkpoint info). If shared.opts.save_optimizer_state is enabled and an optimizer state exists, the optimizer state is saved separately to filename with .optim appended (for example, my_style.pt.optim), together with a short hash of the .pt file that load() later uses for verification.
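The on-disk layout described above can be sketched without the webui itself. This is a minimal illustration, not the repository's code: pickle stands in for torch.save, plain dicts stand in for module state dictionaries, and save_sketch, short_hash, the metadata key names, and the sidecar keys are assumptions chosen for the example.

```python
import hashlib
import os
import pickle
import tempfile


def short_hash(path, length=10):
    """First `length` hex characters of the file's SHA-256 digest."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()[:length]


def save_sketch(filename, layers, metadata, optimizer_state=None, save_optimizer_state=False):
    """Write the main checkpoint and, optionally, an optimizer sidecar file."""
    # One (K state, V state) pair per attention dimension, keyed by the integer size.
    state = {size: (k_state, v_state) for size, (k_state, v_state) in layers.items()}
    # Configuration and training progress live in the same dict under string keys.
    state.update(metadata)

    with open(filename, "wb") as f:
        pickle.dump(state, f)  # stand-in for torch.save(state, filename)

    # The sidecar is written only when the option is on AND there is state to save.
    if save_optimizer_state and optimizer_state:
        sidecar = {"hash": short_hash(filename), "optimizer_state_dict": optimizer_state}
        with open(filename + ".optim", "wb") as f:
            pickle.dump(sidecar, f)


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "my_style.pt")
save_sketch(
    path,
    layers={768: ({"w": [0.1]}, {"w": [0.2]})},
    metadata={"name": "my_style", "step": 1000, "layer_structure": [1, 2, 1]},
    optimizer_state={"lr": 1e-5},
    save_optimizer_state=True,
)
print(sorted(os.listdir(workdir)))  # ['my_style.pt', 'my_style.pt.optim']
```

Mixing integer keys (module pairs) with string keys (metadata) in one dictionary lets a loader tell the two apart by key type alone.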

Hypernetwork.load(filename) deserializes a .pt file, reconstructing the layer structure, activation, dropout, and other configuration from stored metadata. It creates new HypernetworkModule pairs for each attention dimension using the saved state dictionaries. It also attempts to load the optimizer state from filename.optim if the hash matches.
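The hash check on the optimizer sidecar can be illustrated in isolation. In this sketch, load_optimizer_state_sketch is a hypothetical helper, pickle stands in for torch.load, and the sidecar keys (hash, optimizer_state_dict) are assumed for illustration. The point is only that a stale .optim file is ignored rather than loaded.

```python
import hashlib
import os
import pickle
import tempfile


def short_hash(path, length=10):
    """First `length` hex characters of the file's SHA-256 digest."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()[:length]


def load_optimizer_state_sketch(filename):
    """Return the saved optimizer state only if it belongs to this checkpoint."""
    sidecar_path = filename + ".optim"
    if not os.path.exists(sidecar_path):
        return None
    with open(sidecar_path, "rb") as f:
        sidecar = pickle.load(f)
    if sidecar.get("hash") != short_hash(filename):
        return None  # the checkpoint changed after the sidecar was written
    return sidecar.get("optimizer_state_dict")


workdir = tempfile.mkdtemp()
ckpt = os.path.join(workdir, "my_style.pt")

# Write a checkpoint and a sidecar that records its hash.
with open(ckpt, "wb") as f:
    f.write(b"checkpoint v1")
with open(ckpt + ".optim", "wb") as f:
    pickle.dump({"hash": short_hash(ckpt), "optimizer_state_dict": {"step": 10}}, f)

fresh = load_optimizer_state_sketch(ckpt)  # hash matches, state is returned

# Overwrite only the checkpoint, leaving the old sidecar in place.
with open(ckpt, "wb") as f:
    f.write(b"checkpoint v2")

stale = load_optimizer_state_sketch(ckpt)  # hash mismatch, state is discarded
```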

apply_hypernetworks(hypernetworks, context, layer) is the inference-time hook that iterates over all loaded hypernetworks and applies each one's K and V modules to the context tensor. It is called from the hijacked attention_CrossAttention_forward() function, which replaces the original cross-attention implementation to inject hypernetwork transformations.
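The sequential application can be sketched with plain Python stand-ins. This is an illustration under assumptions, not the repository's implementation: nested lists replace tensors, FakeHypernetwork, apply_one, and apply_all are hypothetical names, and passing the context through unchanged when a hypernetwork has no modules for the context's last dimension is an assumption of this sketch.

```python
class FakeHypernetwork:
    """Holds one (K module, V module) pair per context dimension."""

    def __init__(self, layers):
        self.layers = layers  # {dim: (k_fn, v_fn)}


def map_vectors(fn, context):
    """Apply fn to every feature vector in a (batch, seq_len, dim) nested list."""
    return [[fn(vec) for vec in seq] for seq in context]


def apply_one(hypernetwork, context_k, context_v):
    dim = len(context_k[0][0])
    pair = hypernetwork.layers.get(dim)
    if pair is None:
        return context_k, context_v  # no modules for this dimension
    k_fn, v_fn = pair
    return map_vectors(k_fn, context_k), map_vectors(v_fn, context_v)


def apply_all(hypernetworks, context):
    # K and V start from the same context and diverge as modules are applied.
    context_k, context_v = context, context
    for hypernetwork in hypernetworks:
        context_k, context_v = apply_one(hypernetwork, context_k, context_v)
    return context_k, context_v


def double(vec):
    return [2 * x for x in vec]


def plus_one(vec):
    return [x + 1 for x in vec]


hn_a = FakeHypernetwork({2: (double, plus_one)})
hn_b = FakeHypernetwork({2: (plus_one, double)})
hn_other_dim = FakeHypernetwork({4: (double, double)})  # skipped: context dim is 2

context = [[[1.0, 2.0], [3.0, 4.0]]]  # batch=1, seq_len=2, dim=2
context_k, context_v = apply_all([hn_a, hn_other_dim, hn_b], context)
print(context_k)  # [[[3.0, 5.0], [7.0, 9.0]]]
print(context_v)  # [[[4.0, 6.0], [8.0, 10.0]]]
```

Because each hypernetwork receives the previous one's output, the order of the list matters: K above is doubled and then incremented, while V is incremented and then doubled.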

Usage

These functions are used throughout the hypernetwork lifecycle: after training to save checkpoints, at startup to load hypernetworks selected by the user, and on every cross-attention forward pass during image generation to apply the transformations.

Code Reference

Source Location

  • Repository: stable-diffusion-webui
  • File: modules/hypernetworks/hypernetwork.py
  • Lines: L213-237 (save), L243-304 (load), L306-309 (shorthash), L358-379 (apply_single_hypernetwork, apply_hypernetworks)

Signature

class Hypernetwork:
    def save(self, filename):
        """Serialize hypernetwork state to a .pt file, optionally saving optimizer state."""

    def load(self, filename):
        """Deserialize hypernetwork from a .pt file, reconstructing modules and loading optimizer."""

    def shorthash(self):
        """Return first 10 chars of SHA256 hash of the hypernetwork file."""

def apply_single_hypernetwork(hypernetwork, context_k, context_v, layer=None):
    """Apply one hypernetwork's K and V modules to context tensors."""

def apply_hypernetworks(hypernetworks, context, layer=None):
    """Apply all loaded hypernetworks sequentially to a context tensor, returning (context_k, context_v)."""

Import

from modules.hypernetworks.hypernetwork import Hypernetwork, apply_hypernetworks

I/O Contract

Inputs (Hypernetwork.save)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| filename | str | Yes | Absolute path for the output .pt file (e.g., /models/hypernetworks/my_style.pt) |

Inputs (Hypernetwork.load)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| filename | str | Yes | Absolute path to the .pt file to load |

Inputs (apply_hypernetworks)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| hypernetworks | list[Hypernetwork] | Yes | List of loaded hypernetwork instances to apply sequentially |
| context | Tensor | Yes | The cross-attention context tensor of shape (batch, seq_len, dim) |
| layer | object or None | No | The cross-attention layer object, used to store hyper_k/hyper_v references (default: None) |

Outputs (Hypernetwork.save)

| Name | Type | Description |
| --- | --- | --- |
| (None) | None | Writes .pt file to disk; optionally writes .pt.optim file |

Outputs (Hypernetwork.load)

| Name | Type | Description |
| --- | --- | --- |
| (None) | None | Populates self.layers, self.name, self.step, and all configuration attributes from the file |

Outputs (apply_hypernetworks)

| Name | Type | Description |
| --- | --- | --- |
| (context_k, context_v) | tuple[Tensor, Tensor] | Transformed K and V context tensors, same shape as input context |

Usage Examples

Saving a Trained Hypernetwork

from modules.hypernetworks.hypernetwork import Hypernetwork

# Create the hypernetwork before training (this must run inside the webui environment)
hypernetwork = Hypernetwork(
    name="my_style",
    enable_sizes=[320, 640, 768, 1280],
    layer_structure=[1, 2, 1],
    activation_func="relu",
)
# ... training happens ...

# Save to disk
hypernetwork.save("/models/hypernetworks/my_style.pt")
# Creates: /models/hypernetworks/my_style.pt
# Optionally: /models/hypernetworks/my_style.pt.optim
# (only if shared.opts.save_optimizer_state is enabled and an optimizer state exists)

Loading and Applying at Inference

import torch

from modules.hypernetworks.hypernetwork import Hypernetwork, apply_hypernetworks

# Load from disk
hypernet = Hypernetwork()
hypernet.load("/models/hypernetworks/my_style.pt")
hypernet.set_multiplier(0.8)  # 80% strength

# Apply to cross-attention context during inference.
# The random tensor is a placeholder; a real context must live on the same
# device as the hypernetwork's modules.
context = torch.randn(1, 77, 768)  # (batch, seq_len, dim)
context_k, context_v = apply_hypernetworks([hypernet], context)
# context_k and context_v are now transformed by the hypernetwork
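The effect of a strength multiplier such as 0.8 can be pictured as scaling a learned update before it is added back to the input. That residual form is an assumption of this sketch, since this page does not show the module's forward pass; apply_with_multiplier and toy_delta are hypothetical names.

```python
def apply_with_multiplier(vec, delta_fn, multiplier):
    """Return vec plus the update from delta_fn, scaled by multiplier."""
    delta = delta_fn(vec)
    return [x + multiplier * d for x, d in zip(vec, delta)]


def toy_delta(vec):
    # Placeholder for a learned transformation; here a fixed offset per element.
    return [0.5 for _ in vec]


vec = [1.0, 2.0]
off = apply_with_multiplier(vec, toy_delta, 0.0)      # multiplier 0 leaves the input unchanged
full = apply_with_multiplier(vec, toy_delta, 1.0)     # full-strength update
partial = apply_with_multiplier(vec, toy_delta, 0.8)  # lands between the two
```

Under this assumption, a multiplier of 0 disables the hypernetwork without unloading it, and intermediate values interpolate between the unmodified context and the fully transformed one.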

Stacking Multiple Hypernetworks

from modules.hypernetworks.hypernetwork import load_hypernetworks
from modules import shared

# Load multiple hypernetworks with individual multipliers
load_hypernetworks(
    names=["style_a", "style_b"],
    multipliers=[1.0, 0.5],
)
# shared.loaded_hypernetworks now contains both hypernetworks
# They will be applied sequentially during image generation

How the Hijacked Cross-Attention Uses apply_hypernetworks

# From attention_CrossAttention_forward (hypernetwork.py L382-407)
def attention_CrossAttention_forward(self, x, context=None, mask=None, **kwargs):
    h = self.heads
    q = self.to_q(x)
    context = default(context, x)

    # Hypernetwork injection point: transform context before K/V projection
    context_k, context_v = apply_hypernetworks(shared.loaded_hypernetworks, context, self)
    k = self.to_k(context_k)
    v = self.to_v(context_v)

    # Standard attention computation follows (this excerpt omits the mask handling)
    q, k, v = (rearrange(t, 'b n (h d) -> (b h) n d', h=h) for t in (q, k, v))
    sim = einsum('b i d, b j d -> b i j', q, k) * self.scale
    attn = sim.softmax(dim=-1)
    out = einsum('b i j, b j d -> b i d', attn, v)
    out = rearrange(out, '(b h) n d -> b n (h d)', h=h)
    return self.to_out(out)
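The hijack itself amounts to swapping the forward method on the attention class. The sketch below shows that pattern on a stand-in class. CrossAttention here is a toy, hijack and undo_hijack are hypothetical helper names, and how the webui actually installs and restores the replacement is not shown on this page.

```python
class CrossAttention:
    """Toy stand-in for the real attention layer."""

    def forward(self, x, context=None):
        return ("original", x, context)


def patched_forward(self, x, context=None):
    # Self-attention case: fall back to x when no context is given.
    context = x if context is None else context
    # A real replacement would transform the context here, before K/V projection.
    return ("patched", x, context)


# Keep a reference to the original so the patch can be reverted.
_original_forward = CrossAttention.forward


def hijack():
    CrossAttention.forward = patched_forward


def undo_hijack():
    CrossAttention.forward = _original_forward


layer = CrossAttention()
before = layer.forward("x")
hijack()
during = layer.forward("x")  # existing instances pick up the new method
undo_hijack()
after = layer.forward("x")
```

Patching the class rather than individual instances means every attention layer already constructed inside the model is affected at once.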

Related Pages

Implements Principle
