Implementation:AUTOMATIC1111 Stable diffusion webui Train hypernetwork

From Leeroopedia


Knowledge Sources
Domains Deep Learning, Stable Diffusion, Training
Last Updated 2026-02-08 00:00 GMT

Overview

Concrete implementation of the hypernetwork training loop for Stable Diffusion, provided by the AUTOMATIC1111 stable-diffusion-webui repository. The train_hypernetwork function orchestrates the full training pipeline: dataset creation, optimizer setup, the iterative training loop with mixed-precision support, periodic checkpoint saving, and preview image generation.

Description

The train_hypernetwork() function is the main entry point for hypernetwork training. It:

  1. Validates inputs and loads the specified hypernetwork from disk.
  2. Creates the dataset using PersonalizedBase with the hypernetwork template.
  3. Configures the optimizer (AdamW by default) and learning rate scheduler.
  4. Runs the training loop with GradScaler for mixed-precision training.
  5. Supports gradient accumulation across multiple mini-batches.
  6. Optionally applies gradient clipping (value or norm mode) with a scheduled threshold.
  7. Periodically saves hypernetwork checkpoints and generates preview images.
  8. Logs loss values and training progress to CSV and optional TensorBoard.
  9. Returns the trained hypernetwork and its saved filename.

The function integrates with the WebUI's shared state system (shared.state) for progress reporting, interruption support, and current image display.
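The interruption support mentioned above is cooperative: the training loop polls the shared state between steps rather than being killed externally. A minimal, self-contained sketch of that pattern (the `State` class, `train` function, and field names here are illustrative stand-ins, not the WebUI's actual `shared.state` API):

```python
# Cooperative interruption: the loop checks a flag once per step and exits
# cleanly, so a checkpoint can still be saved. Names are hypothetical.
class State:
    def __init__(self):
        self.interrupted = False
        self.job_no = 0

    def interrupt(self):
        self.interrupted = True


def train(state, total_steps):
    completed = 0
    for _ in range(total_steps):
        if state.interrupted:      # checked at the top of every step
            break
        # ... one optimizer step would happen here ...
        completed += 1
        state.job_no = completed   # progress reporting, as shared.state does
        if completed == 3:         # simulate the user pressing "Interrupt"
            state.interrupt()
    return completed


state = State()
print(train(state, 10))  # stops early after 3 steps
```

Because the flag is only polled between steps, an interrupt never lands mid-backward-pass, which is what lets the real function return a usable, partially trained hypernetwork.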

Usage

This function is called from the WebUI training tab when the user initiates hypernetwork training. It can also be invoked programmatically through the API.

Code Reference

Source Location

Signature

def train_hypernetwork(
    id_task,
    hypernetwork_name: str,
    learn_rate: str,
    batch_size: int,
    gradient_step: int,
    data_root: str,
    log_directory: str,
    training_width: int,
    training_height: int,
    varsize: bool,
    steps: int,
    clip_grad_mode: str,
    clip_grad_value: float,
    shuffle_tags: bool,
    tag_drop_out: bool,
    latent_sampling_method: str,
    use_weight: bool,
    create_image_every: int,
    save_hypernetwork_every: int,
    template_filename: str,
    preview_from_txt2img: bool,
    preview_prompt: str,
    preview_negative_prompt: str,
    preview_steps: int,
    preview_sampler_name: str,
    preview_cfg_scale: float,
    preview_seed: int,
    preview_width: int,
    preview_height: int,
) -> tuple[Hypernetwork, str]:

Import

from modules.hypernetworks.hypernetwork import train_hypernetwork

I/O Contract

Inputs

Name Type Required Description
id_task str Yes Task identifier for the WebUI progress tracking
hypernetwork_name str Yes Name of the hypernetwork to train (must exist in hypernetwork_dir)
learn_rate str Yes Learning rate schedule string (e.g., "0.005:100, 0.0005:1000")
batch_size int Yes Number of samples per mini-batch
gradient_step int Yes Number of mini-batches to accumulate before optimizer step
data_root str Yes Path to directory containing training images
log_directory str Yes Base directory for saving logs, checkpoints, and preview images
training_width int Yes Width to resize training images to (e.g., 512)
training_height int Yes Height to resize training images to (e.g., 512)
varsize bool Yes Whether to use variable-size image bucketing
steps int Yes Total number of training steps
clip_grad_mode str Yes Gradient clipping mode: "value", "norm", or empty/None for no clipping
clip_grad_value float Yes Gradient clipping threshold value
shuffle_tags bool Yes Whether to randomly shuffle comma-separated tags in prompts
tag_drop_out bool Yes Whether to randomly drop tags from prompts during training
latent_sampling_method str Yes Latent sampling: "once", "deterministic", or "random"
use_weight bool Yes Whether to use alpha-channel-based per-pixel loss weighting
create_image_every int Yes Generate a preview image every N steps (0 to disable)
save_hypernetwork_every int Yes Save a checkpoint every N steps (0 to disable)
template_filename str Yes Name of the prompt template file (e.g., "hypernetwork.txt")
preview_from_txt2img bool Yes Whether to use txt2img settings for preview generation
preview_prompt str Yes Prompt for preview image generation
preview_negative_prompt str Yes Negative prompt for preview image generation
preview_steps int Yes Number of sampling steps for preview generation
preview_sampler_name str Yes Sampler name for preview generation
preview_cfg_scale float Yes CFG scale for preview generation
preview_seed int Yes Seed for preview generation
preview_width int Yes Width of preview images
preview_height int Yes Height of preview images
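The learn_rate schedule string is a comma-separated list of rate:until_step pairs; a bare number means a constant rate. The sketch below is an independent illustration of that format with a hypothetical `parse_schedule` helper, not the WebUI's own parser (which lives in its LearnRateScheduler):

```python
def parse_schedule(schedule: str):
    """Parse a schedule like "0.005:100, 0.0005:1000" into (rate, end_step)
    pairs; a pair with end_step None applies for all remaining steps."""
    pairs = []
    for part in schedule.split(","):
        part = part.strip()
        if ":" in part:
            rate, end = part.split(":")
            pairs.append((float(rate), int(end)))
        else:
            pairs.append((float(part), None))  # constant rate, no end step
    return pairs


print(parse_schedule("0.005:100, 0.0005:1000"))
# [(0.005, 100), (0.0005, 1000)]  -> 0.005 until step 100, then 0.0005
```

Under this reading, the example schedule trains at 0.005 for the first 100 steps and then drops to 0.0005 until step 1000.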

Outputs

Name Type Description
hypernetwork Hypernetwork The trained Hypernetwork instance with updated weights and step count
filename str Path to the saved hypernetwork .pt file

Usage Examples

Basic Training Invocation

from modules.hypernetworks.hypernetwork import train_hypernetwork

hypernetwork, filename = train_hypernetwork(
    id_task="task-001",
    hypernetwork_name="my_style",
    learn_rate="0.00005:2000",
    batch_size=1,
    gradient_step=1,
    data_root="/path/to/training/images",
    log_directory="/path/to/logs",
    training_width=512,
    training_height=512,
    varsize=False,
    steps=2000,
    clip_grad_mode="norm",
    clip_grad_value=1.0,
    shuffle_tags=False,
    tag_drop_out=False,
    latent_sampling_method="once",
    use_weight=False,
    create_image_every=500,
    save_hypernetwork_every=500,
    template_filename="hypernetwork.txt",
    preview_from_txt2img=False,
    preview_prompt="",
    preview_negative_prompt="",
    preview_steps=20,
    preview_sampler_name="Euler a",
    preview_cfg_scale=7.0,
    preview_seed=-1,
    preview_width=512,
    preview_height=512,
)
print(f"Training complete. Saved to: {filename}")
print(f"Final step: {hypernetwork.step}")
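The clip_grad_mode="norm" / clip_grad_value=1.0 pair in the example selects global norm clipping; "value" would clip each gradient element independently (the repository dispatches to torch.nn.utils clipping functions for this). A pure-Python illustration of the two modes on scalar stand-in gradients, not the actual implementation:

```python
import math


def clip_value(grads, threshold):
    # "value" mode: clamp each gradient element into [-threshold, threshold]
    return [max(-threshold, min(threshold, g)) for g in grads]


def clip_norm(grads, threshold):
    # "norm" mode: rescale the whole vector only if its L2 norm exceeds
    # the threshold, preserving the gradient's direction
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= threshold:
        return grads
    scale = threshold / norm
    return [g * scale for g in grads]


print(clip_value([0.5, -2.0, 3.0], 1.0))  # [0.5, -1.0, 1.0]
print(clip_norm([3.0, 4.0], 2.5))         # norm 5.0 -> scaled to [1.5, 2.0]
```

Norm clipping preserves the relative magnitudes of the gradient components, which is why it is the common default for stabilizing training without distorting the update direction.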

Training Loop Core Logic (Internal)

# Simplified view of the inner training loop (hypernetwork.py L592-647)
scaler = torch.cuda.amp.GradScaler()

for _ in range((steps - initial_step) * gradient_step):
    for j, batch in enumerate(dl):
        scheduler.apply(optimizer, hypernetwork.step)

        with devices.autocast():
            x = batch.latent_sample.to(devices.device)
            c = stack_conds(batch.cond).to(devices.device)
            loss = shared.sd_model.forward(x, c)[0] / gradient_step
            del x, c

        scaler.scale(loss).backward()

        if (j + 1) % gradient_step != 0:
            continue

        if clip_grad:
            clip_grad(weights, clip_grad_sched.learn_rate)

        scaler.step(optimizer)
        scaler.update()
        hypernetwork.step += 1
        optimizer.zero_grad(set_to_none=True)
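Note the division by gradient_step inside the autocast block above: scaling each mini-batch loss before backward makes the accumulated gradient equal to the gradient of the mean loss over the effective batch (batch_size × gradient_step samples). A pure-Python sketch of that arithmetic with scalar stand-in gradients (`accumulate` is a hypothetical helper, not repository code):

```python
def accumulate(grads, gradient_step):
    """Accumulate scalar stand-in gradients the way the training loop does:
    each contribution is divided by gradient_step, and an optimizer step is
    taken only at every gradient_step-th mini-batch."""
    acc = 0.0
    steps_taken = 0
    for j, g in enumerate(grads):
        acc += g / gradient_step           # mirrors loss / gradient_step
        if (j + 1) % gradient_step == 0:   # optimizer step boundary
            steps_taken += 1
    return acc, steps_taken


grads = [1.0, 3.0, 2.0, 6.0]
acc, n = accumulate(grads, gradient_step=4)
print(acc, n)  # 3.0 1 -> equals mean(grads); one optimizer step taken
```

This is why gradient accumulation lets a small-VRAM setup emulate a larger batch: four mini-batches with gradient_step=4 produce the same averaged gradient, and the same single optimizer step, as one batch four times the size.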

Related Pages

Implements Principle

Requires Environment
