Principle:AUTOMATIC1111 Stable diffusion webui Embedding serialization

Knowledge Sources	An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion; PNG Specification - Ancillary Chunks (tEXt)
Domains	Serialization, Embedding, Steganography, Textual Inversion
Last Updated	2026-02-08 00:00 GMT

Overview

Embedding serialization is the process of persisting learned textual inversion embedding vectors and associated metadata to disk in formats that support both direct loading and steganographic embedding within preview images.

Description

Once a textual inversion embedding has been trained, it must be saved in a portable format that encodes the embedding vectors, the token mapping, and provenance metadata. The serialization system supports two distinct persistence strategies:

Direct file serialization: The embedding is saved as a PyTorch .pt file using torch.save(), containing a dictionary with the token-to-parameter mapping, embedding name, training step count, and source checkpoint information. Optionally, the optimizer state is saved to a companion .optim file to support training resumption.
Steganographic image embedding: The embedding data is encoded as a base64 JSON string and either stored as PNG metadata text (using the sd-ti-embedding key) or steganographically encoded into the pixel data of a preview image. This allows the embedding to be distributed as a single image file that is both visually informative and functionally complete.

Usage

Use embedding serialization when:

Saving a trained embedding for later use in image generation
Distributing an embedding to other users as a portable file
Creating visually informative preview images that also contain the full embedding data
Saving optimizer state alongside the embedding for training resumption
Recording provenance information (source model checkpoint, training step) for reproducibility

Theoretical Basis

Direct Serialization Format

The standard .pt serialization format stores a Python dictionary with the following structure:

{
    "string_to_token": {"*": 265},       # token mapping (fixed)
    "string_to_param": {"*": tensor},    # the learned embedding vectors
    "name": "embedding_name",            # human-readable name
    "step": 1000,                        # training step count
    "sd_checkpoint": "abc123...",        # source model short hash
    "sd_checkpoint_name": "v1-5-pruned"  # source model name
}

This format is compatible with the original textual inversion implementation and can also be loaded by other Stable Diffusion frontends.

Optimizer State Persistence

For training resumption, the optimizer state is saved to a separate .optim file containing:

{
    "hash": "a3f1",                    # 4-hex-digit checksum of the embedding
    "optimizer_state_dict": {...}      # full AdamW optimizer state
}

The checksum is computed using a custom hash function over the flattened embedding tensor values, enabling verification that the optimizer state matches the embedding it was saved with.
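
The matching step can be illustrated without PyTorch. The snippet below is a minimal sketch, assuming the .optim payload has already been loaded into a plain dictionary. The function name select_optimizer_state and the sample values are hypothetical and are not taken from the web UI.

```python
from typing import Any, Optional


def select_optimizer_state(saved: dict, current_checksum: str) -> Optional[Any]:
    """Return the saved optimizer state only if its hash matches the embedding.

    `saved` stands in for the dictionary read from the .optim file and
    `current_checksum` for the 4-hex-digit checksum of the embedding being
    resumed. Both names are illustrative assumptions.
    """
    if saved.get("hash") != current_checksum:
        # Stale or foreign optimizer state: ignore it so training starts fresh.
        return None
    return saved.get("optimizer_state_dict")


saved = {"hash": "a3f1", "optimizer_state_dict": {"state": {}, "param_groups": []}}
matching = select_optimizer_state(saved, "a3f1")  # hashes agree, state is reused
mismatch = select_optimizer_state(saved, "0b2c")  # hashes differ, state is dropped
```

Returning None on a mismatch lets the caller fall back to a freshly initialized optimizer instead of resuming with state that belongs to a different embedding.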

Base64 JSON Encoding

To store the embedding in PNG metadata, the entire embedding dictionary (including torch tensors) is serialized with a custom JSON encoder that wraps each torch.Tensor as a nested list under a TORCHTENSOR key. The resulting JSON string is then base64-encoded:

1. Convert tensors to numpy arrays, then to nested lists
2. JSON-encode the dictionary with the custom EmbeddingEncoder
3. Base64-encode the resulting JSON string
4. Store in PNG tEXt chunk under the key "sd-ti-embedding"
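
The four steps above can be sketched with the standard library alone. This is a minimal illustration rather than the web UI's code. FakeTensor is a hypothetical stand-in for torch.Tensor so the example runs without PyTorch, and decode_hook and png_text_chunk are illustrative helper names. The chunk layout (length, type, keyword, NUL separator, text, CRC-32) follows the tEXt definition in the PNG specification.

```python
import base64
import json
import struct
import zlib


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor (assumption, keeps the sketch stdlib-only)."""

    def __init__(self, values):
        self.values = values

    def tolist(self):
        return self.values


class EmbeddingEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, FakeTensor):
            # Tag tensors so a decoder can tell them apart from ordinary lists.
            return {"TORCHTENSOR": obj.tolist()}
        return super().default(obj)


def decode_hook(d):
    # Inverse of the encoder: rebuild a tensor wherever the tag appears.
    if "TORCHTENSOR" in d:
        return FakeTensor(d["TORCHTENSOR"])
    return d


def embedding_to_b64(data: dict) -> bytes:
    return base64.b64encode(json.dumps(data, cls=EmbeddingEncoder).encode())


def embedding_from_b64(blob: bytes) -> dict:
    return json.loads(base64.b64decode(blob), object_hook=decode_hook)


def png_text_chunk(keyword: str, text: bytes) -> bytes:
    """Build a raw PNG tEXt chunk: length, type, keyword + NUL + text, CRC-32."""
    body = keyword.encode("latin-1") + b"\x00" + text
    crc = zlib.crc32(b"tEXt" + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + b"tEXt" + body + struct.pack(">I", crc)


data = {"string_to_param": {"*": FakeTensor([[0.5, -0.25]])}, "name": "demo", "step": 1000}
blob = embedding_to_b64(data)
chunk = png_text_chunk("sd-ti-embedding", blob)
restored = embedding_from_b64(blob)
```

In practice an imaging library would normally write the chunk (Pillow, for example, offers PngInfo.add_text). Building it by hand here only shows what ends up in the file.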

Steganographic Image Embedding

For full-fidelity embedding storage in image pixels, the data is:

Compressed with zlib at maximum compression level
Split into high and low 4-bit nibbles
Reshaped to match the image height
Styled with a pattern derived from the embedding's own values (first 1024 elements)
XOR-masked with a deterministic LCG-based pseudo-random stream (obfuscation rather than encryption, since any decoder can regenerate the same stream)
Placed as side panels flanking the original preview image

This approach allows extraction of the complete embedding from the image pixel data without relying on metadata preservation, which may be stripped by image hosting services.
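
A reduced sketch of the compress, split, and mask steps is shown below. It omits the reshaping, styling, and side-panel layout. The LCG constants are an assumption (the commonly cited Numerical Recipes parameters), as are the function names. Because the mask stream is deterministic, applying the same XOR a second time restores the original nibbles.

```python
import zlib


def lcg(m=2**32, a=1664525, c=1013904223, seed=0):
    """Deterministic linear congruential generator (constants are an assumption)."""
    while True:
        seed = (a * seed + c) % m
        yield seed


def xor_mask(nibbles):
    # A fresh generator per call produces the same stream, so the mask is its own inverse.
    return [n ^ (r & 0x0F) for n, r in zip(nibbles, lcg())]


def encode(payload: bytes):
    compressed = zlib.compress(payload, level=9)  # maximum compression level
    high = [b >> 4 for b in compressed]           # upper 4 bits of each byte
    low = [b & 0x0F for b in compressed]          # lower 4 bits of each byte
    return xor_mask(high), xor_mask(low)


def decode(high, low) -> bytes:
    high, low = xor_mask(high), xor_mask(low)     # undo the mask
    compressed = bytes((h << 4) | l for h, l in zip(high, low))
    return zlib.decompress(compressed)


payload = b'{"name": "demo", "step": 1000}'
high, low = encode(payload)
recovered = decode(high, low)
```

Every masked value stays in the range 0 to 15, which is what allows each nibble to be written into a pixel channel in the later layout steps.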

Checksum Algorithm

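The embedding checksum uses a simple rolling hash that combines multiplication and XOR. Each value is scaled by 100 and truncated to an integer before it is mixed in:

r = 0
for v in flatten(embedding_vec * 100):
    r = ((r * 281) ^ (int(v) * 997)) & 0xFFFFFFFF  # multiplications bind before XOR
checksum = r & 0xFFFF  # formatted as 4-digit hex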

This provides a lightweight integrity check for matching embeddings with their optimizer state files.
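
The checksum can be sketched in plain Python as follows. The function name embedding_checksum is hypothetical, and the input is assumed to be an already-flattened sequence of floats rather than a tensor. Results for real embeddings may differ in edge cases, since tensor arithmetic is typically float32 while Python floats are float64.

```python
from typing import Iterable


def embedding_checksum(values: Iterable[float]) -> str:
    """Return a 4-hex-digit checksum over already-flattened embedding values."""
    r = 0
    for v in values:
        scaled = int(v * 100)  # int() truncates toward zero
        r = ((r * 281) ^ (scaled * 997)) & 0xFFFFFFFF  # keep 32 bits
    return f"{r & 0xFFFF:04x}"  # low 16 bits as zero-padded hex


print(embedding_checksum([0.5]))  # c2ba
```

With only 16 bits of output, collisions between different embeddings are possible, so the value is suited to catching accidental mismatches and not to tamper detection.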

Related Pages

Implemented By

Implementation:AUTOMATIC1111_Stable_diffusion_webui_Embedding_save
