Principle:AUTOMATIC1111 Stable diffusion webui Embedding serialization
| Knowledge Sources | |
|---|---|
| Domains | Serialization, Embedding, Steganography, Textual Inversion |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Embedding serialization is the process of persisting learned textual inversion embedding vectors and associated metadata to disk in formats that support both direct loading and steganographic embedding within preview images.
Description
Once a textual inversion embedding has been trained, it must be saved in a portable format that encodes the embedding vectors, the token mapping, and provenance metadata. The serialization system supports two distinct persistence strategies:
- Direct file serialization: The embedding is saved as a PyTorch .pt file using torch.save(), containing a dictionary with the token-to-parameter mapping, embedding name, training step count, and source checkpoint information. Optionally, the optimizer state is saved to a companion .optim file to support training resumption.
- Steganographic image embedding: The embedding data is encoded as a base64 JSON string and either stored as PNG metadata text (using the sd-ti-embedding key) or steganographically encoded into the pixel data of a preview image. This allows the embedding to be distributed as a single image file that is both visually informative and functionally complete.
Usage
Use embedding serialization when:
- Saving a trained embedding for later use in image generation
- Distributing an embedding to other users as a portable file
- Creating visually informative preview images that also contain the full embedding data
- Saving optimizer state alongside the embedding for training resumption
- Recording provenance information (source model checkpoint, training step) for reproducibility
Theoretical Basis
Direct Serialization Format
The standard .pt serialization format stores a Python dictionary with the following structure:
    {
        "string_to_token": {"*": 265},        # token mapping (fixed)
        "string_to_param": {"*": tensor},     # the learned embedding vectors
        "name": "embedding_name",             # human-readable name
        "step": 1000,                         # training step count
        "sd_checkpoint": "abc123...",         # source model short hash
        "sd_checkpoint_name": "v1-5-pruned"   # source model name
    }
This format is compatible with the original textual inversion implementation and can also be loaded by other Stable Diffusion frontends.
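As a minimal sketch, the dictionary can be assembled as below before it is handed to torch.save(). The helper name build_embedding_payload and its parameters are hypothetical; only the dictionary keys come from the format described above, and a plain nested list stands in for the learned tensor so the snippet runs without PyTorch.

```python
def build_embedding_payload(name, vectors, step, checkpoint_hash, checkpoint_name):
    """Assemble the dictionary layout described above (hypothetical helper)."""
    return {
        "string_to_token": {"*": 265},      # fixed placeholder token mapping
        "string_to_param": {"*": vectors},  # learned vectors (a tensor in practice)
        "name": name,
        "step": step,
        "sd_checkpoint": checkpoint_hash,
        "sd_checkpoint_name": checkpoint_name,
    }


payload = build_embedding_payload(
    name="my-style",
    vectors=[[0.01, -0.02, 0.03]],  # stand-in for a torch tensor
    step=1000,
    checkpoint_hash="abc123",
    checkpoint_name="v1-5-pruned",
)
print(sorted(payload))
```

With PyTorch available, the result would typically be written with torch.save(payload, path) and read back with torch.load(path).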
Optimizer State Persistence
For training resumption, the optimizer state is saved to a separate .optim file containing:
    {
        "hash": "a3f1",                  # 4-hex-digit checksum of the embedding
        "optimizer_state_dict": {...}    # full AdamW optimizer state
    }
The checksum is computed using a custom hash function over the flattened embedding tensor values, enabling verification that the optimizer state matches the embedding it was saved with.
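The verification step might look like the following sketch. The function name optimizer_state_for is hypothetical, and the policy of discarding a mismatched state is an assumption; the text above only says the hash enables the match to be verified.

```python
def optimizer_state_for(optim_payload, embedding_checksum):
    """Return the saved optimizer state only if its hash matches the embedding."""
    if optim_payload.get("hash") != embedding_checksum:
        # Assumed policy: a stale or foreign .optim file is ignored,
        # so training would restart with a fresh optimizer.
        return None
    return optim_payload.get("optimizer_state_dict")


saved = {"hash": "a3f1", "optimizer_state_dict": {"state": {}, "param_groups": []}}
print(optimizer_state_for(saved, "a3f1") is not None)
print(optimizer_state_for(saved, "ffff") is None)
```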
Base64 JSON Encoding
To store the embedding in PNG metadata, the entire embedding dictionary (including torch tensors) is serialized with a custom JSON encoder, which converts torch.Tensor objects to nested lists under a TORCHTENSOR key, and the result is then base64-encoded:
1. Convert tensors to numpy arrays, then to nested lists
2. JSON-encode the dictionary with the custom EmbeddingEncoder
3. Base64-encode the resulting JSON string
4. Store in PNG tEXt chunk under the key "sd-ti-embedding"
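Steps 1 to 3 can be sketched with the standard library alone. This is an illustration, not the project's EmbeddingEncoder: the class TensorLikeEncoder, the stand-in ListTensor, and the two helper functions are hypothetical names, and any object exposing tolist() is treated as a tensor so that no PyTorch install is needed.

```python
import base64
import json


class ListTensor:
    """Stand-in for a tensor: anything exposing tolist() (assumption)."""

    def __init__(self, values):
        self.values = values

    def tolist(self):
        return self.values


class TensorLikeEncoder(json.JSONEncoder):
    """Wrap tensor-like objects as {"TORCHTENSOR": nested_list}."""

    def default(self, obj):
        if hasattr(obj, "tolist"):
            return {"TORCHTENSOR": obj.tolist()}
        return super().default(obj)


def embedding_to_b64(data):
    text = json.dumps(data, cls=TensorLikeEncoder)
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def embedding_from_b64(encoded):
    def unwrap(d):
        # Tagged dictionaries come back as plain nested lists in this sketch.
        return d["TORCHTENSOR"] if "TORCHTENSOR" in d else d

    text = base64.b64decode(encoded).decode("utf-8")
    return json.loads(text, object_hook=unwrap)


data = {"name": "my-style", "step": 1000,
        "string_to_param": {"*": ListTensor([[0.5, -0.25]])}}
encoded = embedding_to_b64(data)
restored = embedding_from_b64(encoded)
print(restored["string_to_param"]["*"])
```

Writing the resulting string into the image (step 4) additionally needs an imaging library and is left out here.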
Steganographic Image Embedding
For full-fidelity embedding storage in image pixels, the data is:
- Compressed with zlib at maximum compression level
- Split into high and low 4-bit nibbles
- Reshaped to match the image height
- Styled with a pattern derived from the embedding's own values (first 1024 elements)
- XOR-encrypted with a deterministic LCG-based pseudo-random stream
- Placed as side panels flanking the original preview image
This approach allows extraction of the complete embedding from the image pixel data without relying on metadata preservation, which may be stripped by image hosting services.
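The compression, nibble-splitting, and XOR steps can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: the LCG constants are the widely published Numerical Recipes values, chosen here only for demonstration because the text above does not specify them, and the reshaping, styling, and side-panel placement steps are omitted. All function names are hypothetical.

```python
import zlib


def lcg_nibbles(seed=0, a=1664525, c=1013904223, m=2**32):
    """Deterministic stream of 4-bit values from a linear congruential generator."""
    while True:
        seed = (a * seed + c) % m
        # Use higher bits: the lowest bits of a power-of-two LCG repeat quickly.
        yield (seed >> 16) & 0x0F


def mask(nibbles):
    """XOR each nibble with the LCG stream; applying it twice restores the input."""
    return [n ^ k for n, k in zip(nibbles, lcg_nibbles())]


def encode(payload: bytes):
    compressed = zlib.compress(payload, 9)  # level 9 = maximum compression
    high = [b >> 4 for b in compressed]     # upper 4 bits of each byte
    low = [b & 0x0F for b in compressed]    # lower 4 bits of each byte
    return mask(high), mask(low)


def decode(high, low):
    high, low = mask(high), mask(low)
    compressed = bytes((h << 4) | l for h, l in zip(high, low))
    return zlib.decompress(compressed)


data = b'{"name": "my-style", "step": 1000}'
high, low = encode(data)
print(decode(high, low) == data)
```

Because the pseudo-random stream is fully determined by its seed, the reader can regenerate it and undo the XOR without any stored key.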
Checksum Algorithm
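The embedding checksum uses a simple multiply-and-XOR hash over the embedding values (scaled by 100 and truncated to integers), kept to 32 bits at each step and reduced to its low 16 bits at the end: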
    r = 0
    for v in flatten(embedding_vec * 100):
        r = (r * 281 ^ int(v) * 997) & 0xFFFFFFFF
    checksum = r & 0xFFFF  # formatted as 4-digit hex
This provides a lightweight integrity check for matching embeddings with their optimizer state files.
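A runnable version of the pseudocode above, written over a flat list of Python floats. The function name embedding_checksum is hypothetical, and because Python floats are 64-bit, results may differ from a float32 tensor computation for values that do not scale exactly.

```python
def embedding_checksum(values):
    """Multiply-and-XOR hash over values scaled by 100, as 4 hex digits."""
    r = 0
    for v in values:
        scaled = int(v * 100)  # int() truncates toward zero
        # '*' binds tighter than '^', so this is (r * 281) ^ (scaled * 997).
        r = (r * 281 ^ scaled * 997) & 0xFFFFFFFF
    return f"{r & 0xFFFF:04x}"


print(embedding_checksum([1.0]))
print(embedding_checksum([0.5, -0.25, 2.0]))
```

The final mask to 16 bits means unrelated embeddings can collide, so the value suits matching files that belong together rather than detecting tampering.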