Implementation: zai-org CogVideo Load_Lora_Weights_Fuse
| Implementation Metadata | |
|---|---|
| Name | Load_Lora_Weights_Fuse |
| Type | Wrapper Doc |
| Category | Inference |
| Domains | Video_Generation, Fine_Tuning, Diffusion_Models |
| Knowledge Sources | CogVideo Repository, LoRA Paper, Diffusers LoRA Loading Guide |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Load_Lora_Weights_Fuse is a concrete tool, provided by the diffusers library, for loading LoRA weights into CogVideoX pipelines and fusing them for inference.
Description
This implementation wraps the Diffusers pipeline's built-in LoRA loading and fusion capabilities for use with CogVideoX models. The load_lora_weights method loads trained LoRA adapter weights from a .safetensors file and attaches them to the pipeline's transformer component. The fuse_lora method merges the adapter weights into the base model weights for faster inference. Multiple adapters can be managed using named adapters and the set_adapters method.
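Conceptually, fusing folds the low-rank update into the base weight: W_fused = W + lora_scale * (B @ A). The following is a minimal pure-Python sketch of that arithmetic on toy 2x2 matrices; it illustrates the idea only and is not the diffusers implementation (which also applies the adapter's alpha/rank scaling).

```python
def matmul(a, b):
    # Naive matrix multiply, sufficient for tiny illustrative matrices
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def fuse(w, down, up, scale=1.0):
    # W_fused = W + scale * (up @ down), mirroring LoRA weight fusion
    delta = matmul(up, down)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

# Toy base weight and rank-1 adapter factors (illustrative values)
W = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 base weight
up = [[0.5], [0.25]]           # 2x1 "up" projection (B)
down = [[1.0, 2.0]]            # 1x2 "down" projection (A)

fused = fuse(W, down, up, scale=1.0)
# scale=0.0 leaves the base weight untouched
unchanged = fuse(W, down, up, scale=0.0)
```

After fusion the adapter matrices are no longer needed at inference time, which is why fused pipelines run faster than ones applying the adapter on the fly.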
Usage
Use after LoRA fine-tuning is complete to generate videos with the adapted model. This implementation is used in both the tools/load_cogvideox_lora.py utility script and the inference/cli_demo.py CLI tool.
Code Reference
Source Location
tools/load_cogvideox_lora.py:L86-132 -- LoRA loading utility
inference/cli_demo.py:L128-132 -- CLI inference with LoRA support
Signature
Loading LoRA weights:
pipe.load_lora_weights(
lora_path,
weight_name="pytorch_lora_weights.safetensors",
adapter_name="test_1",
)
Fusing LoRA weights:
pipe.fuse_lora(
components=["transformer"],
lora_scale=1.0,
)
Multi-adapter composition:
pipe.set_adapters(
adapter_names=["adapter_1", "adapter_2"],
adapter_weights=[1.0, 0.5],
)
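Semantically, set_adapters weights each adapter's contribution to a layer's output: the combined delta is the per-adapter weight times that adapter's low-rank update, summed over adapters. A toy pure-Python sketch of this composition (illustrative numbers, not the diffusers/peft internals):

```python
def lora_delta(x, down, up, weight):
    # One adapter's contribution to the layer output: weight * up @ (down @ x)
    mid = [sum(a * xi for a, xi in zip(row, x)) for row in down]  # down @ x
    out = [sum(b * m for b, m in zip(row, mid)) for row in up]    # up @ mid
    return [weight * v for v in out]

def composed_output(x, base_out, adapters):
    # adapters: list of (down, up, weight) triples, as configured by set_adapters
    out = list(base_out)
    for down, up, w in adapters:
        for i, v in enumerate(lora_delta(x, down, up, w)):
            out[i] += v
    return out

x = [1.0, 1.0]
base_out = [0.0, 0.0]
style = ([[1.0, 0.0]], [[1.0], [0.0]], 1.0)    # adapter_weights[0] = 1.0
subject = ([[0.0, 1.0]], [[0.0], [1.0]], 0.5)  # adapter_weights[1] = 0.5
result = composed_output(x, base_out, [style, subject])
```

Setting an adapter's weight to 0.0 silences it without unloading, which is what makes weighted blends like [1.0, 0.5] a smooth interpolation between adapters.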
Import
from diffusers import CogVideoXPipeline
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| lora_path | str or Path | required | Directory containing the LoRA adapter weights file. |
| weight_name | str | "pytorch_lora_weights.safetensors" | Filename of the LoRA weights within the directory. |
| adapter_name | str | "default" | Name for the adapter (used for multi-adapter management). |
| lora_scale | float | 1.0 | Scaling factor for LoRA fusion (0.0 = no adaptation, 1.0 = full). |
| components | List[str] | ["transformer"] | Pipeline components to apply LoRA fusion to. |
External Dependencies
diffusers -- Pipeline LoRA mixin (load_lora_weights, fuse_lora, set_adapters)
peft -- Low-rank adapter infrastructure
External Documentation
I/O Contract
Inputs
| Input | Format | Description |
|---|---|---|
| Base pretrained model | HuggingFace model ID or local path | The CogVideoX base model pipeline (e.g., "THUDM/CogVideoX-5b"). |
| LoRA adapter weights | .safetensors file | Trained LoRA adapter weights (typically pytorch_lora_weights.safetensors). |
Outputs
| Output | Format | Description |
|---|---|---|
| Adapted pipeline | CogVideoXPipeline | Pipeline with LoRA weights loaded (unfused) or fused, ready for video generation. |
| Generated videos | Tensor or exported .mp4 files | Videos produced by the adapted pipeline via the __call__ method. |
Usage Examples
Basic LoRA Loading and Inference
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video
# Load base pipeline
pipe = CogVideoXPipeline.from_pretrained(
"THUDM/CogVideoX-5b",
torch_dtype=torch.bfloat16,
)
pipe.to("cuda")
# Load LoRA adapter
pipe.load_lora_weights(
"/output/lora_run",
weight_name="pytorch_lora_weights.safetensors",
adapter_name="my_adapter",
)
# Fuse for faster inference
pipe.fuse_lora(components=["transformer"], lora_scale=1.0)
# Generate video
video = pipe(
prompt="A cat playing with a ball in a garden",
num_frames=49,
guidance_scale=6.0,
num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=15)
Multi-Adapter Composition
import torch
from diffusers import CogVideoXPipeline
pipe = CogVideoXPipeline.from_pretrained(
"THUDM/CogVideoX-5b",
torch_dtype=torch.bfloat16,
)
pipe.to("cuda")
# Load first adapter (style)
pipe.load_lora_weights(
"/output/style_lora",
weight_name="pytorch_lora_weights.safetensors",
adapter_name="style",
)
# Load second adapter (subject)
pipe.load_lora_weights(
"/output/subject_lora",
weight_name="pytorch_lora_weights.safetensors",
adapter_name="subject",
)
# Combine adapters with different weights
pipe.set_adapters(
adapter_names=["style", "subject"],
adapter_weights=[1.0, 0.8],
)
# Generate with composed adapters
video = pipe(
prompt="A golden retriever running on the beach, watercolor style",
num_frames=49,
guidance_scale=6.0,
num_inference_steps=50,
).frames[0]
Unfused Dynamic Adapter Switching
# Without fusing, adapters can be dynamically switched
pipe.load_lora_weights(lora_path_1, adapter_name="adapter_1")
pipe.load_lora_weights(lora_path_2, adapter_name="adapter_2")
# Use adapter_1
pipe.set_adapters(["adapter_1"], [1.0])
video_1 = pipe(prompt="...").frames[0]
# Switch to adapter_2
pipe.set_adapters(["adapter_2"], [1.0])
video_2 = pipe(prompt="...").frames[0]
# Use both
pipe.set_adapters(["adapter_1", "adapter_2"], [0.7, 0.3])
video_combined = pipe(prompt="...").frames[0]
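To reverse these operations, diffusers also provides unfuse_lora and unload_lora_weights. A brief sketch, assuming pipe is the pipeline from the examples above:

```python
# Restore the original base weights after an earlier fuse_lora call
pipe.unfuse_lora()

# Detach all loaded LoRA adapters from the pipeline entirely
pipe.unload_lora_weights()
```

Unfusing is only meaningful after fuse_lora; in the unfused switching workflow above, unload_lora_weights alone returns the pipeline to its base-model behavior.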