Implementation:AUTOMATIC1111 Stable diffusion webui Create embedding
| Knowledge Sources | |
|---|---|
| Domains | Textual Inversion, Embedding, Stable Diffusion, Generative AI |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for creating a new textual inversion embedding file initialized from existing CLIP token embeddings, provided by the AUTOMATIC1111 stable-diffusion-webui repository.
Description
The create_embedding function allocates a new embedding vector (or set of vectors) for a custom placeholder token, initializes it by encoding a given init_text through the CLIP conditioning model, sanitizes the embedding name, and saves the result as a .pt file. The same module also provides the Embedding class, which encapsulates the embedding tensor, its metadata, and the serialization logic.
The function works by:
- Sending the conditioning model to the GPU if needed via a dummy forward pass
- Encoding the init_text (defaulting to "*") through cond_model.encode_embedding_init_text to get initial vectors
- Distributing the encoded vectors evenly across num_vectors_per_token slots in a zero-initialized tensor
- Sanitizing the name by removing illegal characters (keeping only alphanumeric characters and ._-)
- Saving the embedding to embeddings_dir/{name}.pt via the Embedding.save() method
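The distribution step can be illustrated with a minimal sketch that uses plain Python lists in place of tensors. The helper name distribute_init_vectors and the index formula are assumptions made for illustration; they show one reasonable reading of "distributing evenly" and are not a copy of the repository code.

```python
def distribute_init_vectors(embedded, num_vectors_per_token):
    """Fill each slot with an encoded row, spacing the picks evenly.

    `embedded` is a list of rows (one per encoded init_text token).
    Hypothetical helper: plain lists stand in for torch tensors.
    """
    if num_vectors_per_token < 1:
        raise ValueError("num_vectors_per_token must be at least 1")
    n_rows = len(embedded)
    dim = len(embedded[0])
    # Start from zeros, mirroring the zero-initialized tensor.
    vec = [[0.0] * dim for _ in range(num_vectors_per_token)]
    for i in range(num_vectors_per_token):
        # Assumed mapping: slot i takes row floor(i * n_rows / slots).
        vec[i] = list(embedded[i * n_rows // num_vectors_per_token])
    return vec


# Two encoded rows spread over four slots: each row fills two slots.
slots = distribute_init_vectors([[1.0, 2.0], [3.0, 4.0]], 4)
print(slots)
```

With fewer encoded rows than slots, rows are repeated; with a single encoded row, every slot starts from the same vector.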
Usage
Use this function when:
- Creating a new textual inversion embedding before training begins
- Initializing an embedding from a semantically meaningful seed word
- Setting up a multi-vector embedding for higher-fidelity concept capture
Code Reference
Source Location
- Repository: stable-diffusion-webui
- File: modules/textual_inversion/textual_inversion.py
- Lines: L259-284 (create_embedding), L39-89 (Embedding class)
Signature
def create_embedding(name, num_vectors_per_token, overwrite_old, init_text='*'):
    ...
    return fn  # str: path to saved .pt file

class Embedding:
    def __init__(self, vec, name, step=None):
        self.vec = vec    # torch.Tensor of shape (num_vectors, embedding_dim)
        self.name = name  # str
        self.step = step  # Optional[int]
        self.shape = None
        self.vectors = 0
        self.cached_checksum = None
        self.sd_checkpoint = None
        self.sd_checkpoint_name = None
        self.optimizer_state_dict = None
        self.filename = None
        self.hash = None
        self.shorthash = None

    def save(self, filename): ...
    def checksum(self): ...
    def set_hash(self, v): ...
Import
from modules.textual_inversion.textual_inversion import create_embedding, Embedding
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| name | str | Yes | Name for the new embedding; illegal characters are stripped (only alphanumeric characters and ._- are retained) |
| num_vectors_per_token | int | Yes | Number of embedding vectors per token (typically 1-8); higher values increase representational capacity |
| overwrite_old | bool | Yes | If False, raises AssertionError when an embedding file with the same name already exists |
| init_text | str | No | Text used to initialize the embedding vectors via the CLIP encoder; defaults to '*'. If an empty string is passed, the vectors remain zeros |
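The name and overwrite_old rules in the table can be sketched as follows. The helpers sanitize_name and resolve_target are hypothetical names introduced here, and the allowed punctuation set is taken from the description above rather than verified against the repository.

```python
import os
import tempfile

# Assumption: allowed punctuation as listed in the Inputs table.
ALLOWED_PUNCTUATION = "._-"


def sanitize_name(name):
    """Drop every character that is neither alphanumeric nor allowed punctuation."""
    return "".join(ch for ch in name if ch.isalnum() or ch in ALLOWED_PUNCTUATION)


def resolve_target(embeddings_dir, name, overwrite_old):
    """Build the .pt path and refuse to clobber an existing file unless asked."""
    fn = os.path.join(embeddings_dir, f"{sanitize_name(name)}.pt")
    if not overwrite_old:
        assert not os.path.exists(fn), f"file {fn} already exists"
    return fn


# Demonstration in a throwaway directory (stands in for embeddings_dir).
with tempfile.TemporaryDirectory() as tmp:
    path = resolve_target(tmp, "my cat/concept!", overwrite_old=False)
    print(os.path.basename(path))  # sanitized file name
    open(path, "w").close()        # simulate an existing embedding file
    try:
        resolve_target(tmp, "my cat/concept!", overwrite_old=False)
    except AssertionError as err:
        print("refused:", err)
```

Because the check is an assert, callers that want to replace an existing embedding pass overwrite_old=True rather than catching the error.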
Outputs
| Name | Type | Description |
|---|---|---|
| fn | str | Path to the saved .pt embedding file inside the embeddings directory |
Usage Examples
Basic Usage
from modules.textual_inversion.textual_inversion import create_embedding
# Create a 4-vector embedding initialized from the word "cat"
filepath = create_embedding(
    name="my-cat-concept",
    num_vectors_per_token=4,
    overwrite_old=False,
    init_text="cat",
)
print(f"Embedding saved to: {filepath}")
Creating with Default Initialization
# Create a single-vector embedding with wildcard initialization
filepath = create_embedding(
    name="my_style",
    num_vectors_per_token=1,
    overwrite_old=True,
)
# init_text defaults to '*', providing a neutral starting point
Working with the Embedding Class Directly
import torch
from modules.textual_inversion.textual_inversion import Embedding
# Create an Embedding instance with a random vector
vec = torch.randn(4, 768) # 4 vectors, 768-dim for SD 1.x CLIP
emb = Embedding(vec, name="test_embedding", step=0)
# Save it
emb.save("/path/to/embeddings/test_embedding.pt")
# Compute a 4-hex-digit checksum
print(emb.checksum()) # e.g., "a3f1"
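To give a feel for how a short 4-hex-digit checksum can be derived from embedding values, here is a minimal sketch of a rolling integer hash over a flattened list of floats. The function names, the multipliers 281 and 997, and the scaling by 100 are assumptions chosen for illustration; this page does not confirm the exact formula used by Embedding.checksum().

```python
def const_hash(values):
    """Rolling 32-bit hash over the integer parts of the given values."""
    r = 0
    for v in values:
        # Multipliers are illustrative assumptions; the mask keeps r within 32 bits.
        r = (r * 281 ^ int(v) * 997) & 0xFFFFFFFF
    return r


def checksum_sketch(flat_values):
    """Return a 4-hex-digit string derived from a flat list of floats."""
    # Scale first so that small float differences change the integer parts.
    scaled = [v * 100 for v in flat_values]
    # Keep only the low 16 bits, formatted as four lowercase hex digits.
    return f"{const_hash(scaled) & 0xFFFF:04x}"


print(checksum_sketch([0.5, -1.25, 3.0]))
```

A checksum this short is useful for spotting at a glance whether an embedding has changed, not for guarding against collisions.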