Implementation: AUTOMATIC1111 Stable Diffusion WebUI Train Hypernetwork
| Knowledge Sources | |
|---|---|
| Domains | Deep Learning, Stable Diffusion, Training |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete implementation of the hypernetwork training loop for Stable Diffusion, provided by the AUTOMATIC1111 stable-diffusion-webui repository. The train_hypernetwork function orchestrates the full training pipeline: dataset creation, optimizer setup, the iterative training loop with mixed-precision support, periodic checkpoint saving, and preview image generation.
Description
The train_hypernetwork() function is the main entry point for hypernetwork training. It:
- Validates inputs and loads the specified hypernetwork from disk.
- Creates the dataset using PersonalizedBase with the hypernetwork template.
- Configures the optimizer (AdamW by default) and learning rate scheduler.
- Runs the training loop with GradScaler for mixed-precision training.
- Supports gradient accumulation across multiple mini-batches.
- Optionally applies gradient clipping (value or norm mode) with a scheduled threshold.
- Periodically saves hypernetwork checkpoints and generates preview images.
- Logs loss values and training progress to CSV and, optionally, TensorBoard.
- Returns the trained hypernetwork and its saved filename.
The function integrates with the WebUI's shared state system (shared.state) for progress reporting, interruption support, and current image display.
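The cooperative-interruption pattern described above can be pictured with a minimal sketch. SharedState below is a hypothetical stand-in for the WebUI's shared.state object (the real class lives in modules/shared.py and carries many more fields); this is an illustration of the polling pattern, not the WebUI implementation.

```python
# Hypothetical stand-in for the WebUI's shared.state; illustration only.
class SharedState:
    def __init__(self):
        self.interrupted = False
        self.job_no = 0
        self.job_count = 0

    def interrupt(self):
        self.interrupted = True


def run_steps(state, total_steps):
    """Cooperative training loop: polls the interruption flag each step."""
    completed = 0
    state.job_count = total_steps
    for step in range(total_steps):
        if state.interrupted:      # set when the user presses "Interrupt"
            break
        # ... one training step would go here ...
        completed += 1
        state.job_no = completed   # progress reporting
        if completed == 3:         # simulate a user interruption mid-run
            state.interrupt()
    return completed


print(run_steps(SharedState(), 10))  # stops early: 3
```

Because the flag is only checked between steps, interruption is clean: the loop exits after a whole step, leaving the hypernetwork in a saveable state.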
Usage
This function is called from the WebUI training tab when the user initiates hypernetwork training. It can also be invoked programmatically through the API.
Code Reference
Source Location
- Repository: stable-diffusion-webui
- File: modules/hypernetworks/hypernetwork.py
- Lines: L472-768
Signature
```python
def train_hypernetwork(
    id_task,
    hypernetwork_name: str,
    learn_rate: str,
    batch_size: int,
    gradient_step: int,
    data_root: str,
    log_directory: str,
    training_width: int,
    training_height: int,
    varsize: bool,
    steps: int,
    clip_grad_mode: str,
    clip_grad_value: float,
    shuffle_tags: bool,
    tag_drop_out: bool,
    latent_sampling_method: str,
    use_weight: bool,
    create_image_every: int,
    save_hypernetwork_every: int,
    template_filename: str,
    preview_from_txt2img: bool,
    preview_prompt: str,
    preview_negative_prompt: str,
    preview_steps: int,
    preview_sampler_name: str,
    preview_cfg_scale: float,
    preview_seed: int,
    preview_width: int,
    preview_height: int,
) -> tuple[Hypernetwork, str]:
```
Import
```python
from modules.hypernetworks.hypernetwork import train_hypernetwork
```
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| id_task | str | Yes | Task identifier for the WebUI progress tracking |
| hypernetwork_name | str | Yes | Name of the hypernetwork to train (must exist in hypernetwork_dir) |
| learn_rate | str | Yes | Learning rate schedule string (e.g., "0.005:100, 0.0005:1000") |
| batch_size | int | Yes | Number of samples per mini-batch |
| gradient_step | int | Yes | Number of mini-batches to accumulate before optimizer step |
| data_root | str | Yes | Path to directory containing training images |
| log_directory | str | Yes | Base directory for saving logs, checkpoints, and preview images |
| training_width | int | Yes | Width to resize training images to (e.g., 512) |
| training_height | int | Yes | Height to resize training images to (e.g., 512) |
| varsize | bool | Yes | Whether to use variable-size image bucketing |
| steps | int | Yes | Total number of training steps |
| clip_grad_mode | str | Yes | Gradient clipping mode: "value", "norm", or empty/None for no clipping |
| clip_grad_value | float | Yes | Gradient clipping threshold value |
| shuffle_tags | bool | Yes | Whether to randomly shuffle comma-separated tags in prompts |
| tag_drop_out | bool | Yes | Whether to randomly drop individual tags from prompts |
| latent_sampling_method | str | Yes | Latent sampling: "once", "deterministic", or "random" |
| use_weight | bool | Yes | Whether to use alpha-channel-based per-pixel loss weighting |
| create_image_every | int | Yes | Generate a preview image every N steps (0 to disable) |
| save_hypernetwork_every | int | Yes | Save a checkpoint every N steps (0 to disable) |
| template_filename | str | Yes | Name of the prompt template file (e.g., "hypernetwork.txt") |
| preview_from_txt2img | bool | Yes | Whether to use txt2img settings for preview generation |
| preview_prompt | str | Yes | Prompt for preview image generation |
| preview_negative_prompt | str | Yes | Negative prompt for preview image generation |
| preview_steps | int | Yes | Number of sampling steps for preview generation |
| preview_sampler_name | str | Yes | Sampler name for preview generation |
| preview_cfg_scale | float | Yes | CFG scale for preview generation |
| preview_seed | int | Yes | Seed for preview generation |
| preview_width | int | Yes | Width of preview images |
| preview_height | int | Yes | Height of preview images |
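The learn_rate schedule string ("rate:end_step" segments separated by commas) can be illustrated with a small parser sketch. This is an assumption-level illustration of the format shown in the table, not the WebUI's LearnRateScheduler implementation.

```python
def parse_lr_schedule(schedule: str):
    """Parse a schedule like "0.005:100, 0.0005:1000" into (rate, end_step) pairs.

    A bare rate with no ":end_step" suffix is treated here as applying for
    the rest of training (end_step=None).
    """
    pairs = []
    for chunk in schedule.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        if ":" in chunk:
            rate, end_step = chunk.split(":")
            pairs.append((float(rate), int(end_step)))
        else:
            pairs.append((float(chunk), None))
    return pairs


print(parse_lr_schedule("0.005:100, 0.0005:1000"))
# [(0.005, 100), (0.0005, 1000)]
```

Each pair means "use this rate until this step", so the example trains at 0.005 for the first 100 steps, then drops to 0.0005 until step 1000.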
Outputs
| Name | Type | Description |
|---|---|---|
| hypernetwork | Hypernetwork | The trained Hypernetwork instance with updated weights and step count |
| filename | str | Path to the saved hypernetwork .pt file |
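The create_image_every and save_hypernetwork_every parameters both follow a simple modulo cadence, with 0 disabling the action. The helper below is a hypothetical illustration of that convention, not code from the repository.

```python
def is_due(step: int, every: int) -> bool:
    """True when a periodic action (checkpoint save, preview image) fires at `step`.

    `every` <= 0 disables the action entirely, matching the "0 to disable"
    convention in the parameter table.
    """
    return every > 0 and step % every == 0


# With save_hypernetwork_every=500 over 2000 steps, checkpoints fire at:
due = [s for s in range(1, 2001) if is_due(s, 500)]
print(due)  # [500, 1000, 1500, 2000]
```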
Usage Examples
Basic Training Invocation
```python
from modules.hypernetworks.hypernetwork import train_hypernetwork

hypernetwork, filename = train_hypernetwork(
    id_task="task-001",
    hypernetwork_name="my_style",
    learn_rate="0.00005:2000",
    batch_size=1,
    gradient_step=1,
    data_root="/path/to/training/images",
    log_directory="/path/to/logs",
    training_width=512,
    training_height=512,
    varsize=False,
    steps=2000,
    clip_grad_mode="norm",
    clip_grad_value=1.0,
    shuffle_tags=False,
    tag_drop_out=False,
    latent_sampling_method="once",
    use_weight=False,
    create_image_every=500,
    save_hypernetwork_every=500,
    template_filename="hypernetwork.txt",
    preview_from_txt2img=False,
    preview_prompt="",
    preview_negative_prompt="",
    preview_steps=20,
    preview_sampler_name="Euler a",
    preview_cfg_scale=7.0,
    preview_seed=-1,
    preview_width=512,
    preview_height=512,
)

print(f"Training complete. Saved to: {filename}")
print(f"Final step: {hypernetwork.step}")
```
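The two clip_grad_mode settings behave differently: "value" clamps each gradient component independently, while "norm" rescales the whole gradient vector so its L2 norm does not exceed the threshold (in the source these map to torch.nn.utils.clip_grad_value_ and clip_grad_norm_). The plain-list sketch below illustrates the math without torch; it is not the WebUI's code.

```python
import math


def clip_by_value(grads, threshold):
    """"value" mode: clamp each gradient component to [-threshold, threshold]."""
    return [max(-threshold, min(threshold, g)) for g in grads]


def clip_by_norm(grads, threshold):
    """"norm" mode: rescale the whole vector if its L2 norm exceeds threshold."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= threshold:
        return list(grads)
    scale = threshold / norm
    return [g * scale for g in grads]


grads = [3.0, -4.0]               # L2 norm = 5.0
print(clip_by_value(grads, 1.0))  # [1.0, -1.0]
print(clip_by_norm(grads, 1.0))   # approximately [0.6, -0.8], norm rescaled to 1.0
```

Note that "norm" mode preserves the gradient's direction, while "value" mode can change it when components are clipped unevenly.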
Training Loop Core Logic (Internal)
```python
# Simplified view of the inner training loop (hypernetwork.py L592-647)
scaler = torch.cuda.amp.GradScaler()

for _ in range((steps - initial_step) * gradient_step):
    for j, batch in enumerate(dl):
        scheduler.apply(optimizer, hypernetwork.step)

        with devices.autocast():
            x = batch.latent_sample.to(devices.device)
            c = stack_conds(batch.cond).to(devices.device)
            loss = shared.sd_model.forward(x, c)[0] / gradient_step
            del x, c

        scaler.scale(loss).backward()

        # Only step the optimizer once gradient_step mini-batches have accumulated
        if (j + 1) % gradient_step != 0:
            continue

        if clip_grad:
            clip_grad(weights, clip_grad_sched.learn_rate)

        scaler.step(optimizer)
        scaler.update()
        hypernetwork.step += 1
        optimizer.zero_grad(set_to_none=True)
```
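The accumulation pattern in the loop above can be restated without torch: each mini-batch loss is pre-divided by gradient_step so the summed gradients form an average, and the optimizer steps only when (j + 1) % gradient_step == 0. The counter sketch below is a hypothetical illustration of that guard, not repository code.

```python
def count_optimizer_steps(num_batches: int, gradient_step: int) -> int:
    """Count optimizer steps when stepping once per `gradient_step` mini-batches,
    mirroring the `(j + 1) % gradient_step != 0: continue` guard."""
    steps = 0
    for j in range(num_batches):
        # backward() would accumulate gradients here; loss was already
        # divided by gradient_step, so the accumulated gradient is an average
        if (j + 1) % gradient_step != 0:
            continue  # keep accumulating
        steps += 1    # scaler.step(optimizer) + zero_grad in the real loop
    return steps


print(count_optimizer_steps(10, 4))  # 2: the trailing partial accumulation is dropped
```

This gives an effective batch size of batch_size * gradient_step, letting low-VRAM setups emulate larger batches at the cost of more wall-clock time per optimizer step.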