Implementation: zai-org CogVideo Load_Lora_Weights_Fuse
| Implementation Metadata | |
|---|---|
| Name | Load_Lora_Weights_Fuse |
| Type | Wrapper Doc |
| Category | Inference |
| Domains | Video_Generation, Fine_Tuning, Diffusion_Models |
| Knowledge Sources | CogVideo Repository, LoRA Paper, Diffusers LoRA Loading Guide |
| Last Updated | 2026-02-10 00:00 GMT |
Overview
Load_Lora_Weights_Fuse is a concrete tool, provided by the diffusers library, for loading LoRA weights into CogVideoX pipelines and fusing them for inference.
Description
This implementation wraps the Diffusers pipeline's built-in LoRA loading and fusion capabilities for use with CogVideoX models. The load_lora_weights method loads trained LoRA adapter weights from a .safetensors file and attaches them to the pipeline's transformer component. The fuse_lora method merges the adapter weights into the base model weights for faster inference. Multiple adapters can be managed using named adapters and the set_adapters method.
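Conceptually, fusing folds the low-rank update into the base weight: W_fused = W + lora_scale * (B @ A). The following is a minimal pure-Python sketch of that arithmetic on toy 2x2 matrices; it illustrates the idea only and is not the diffusers implementation (which also applies the adapter's alpha/rank scaling).

```python
def matmul(a, b):
    # Naive matrix multiply, sufficient for tiny illustrative matrices
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def fuse(w, down, up, scale=1.0):
    # W_fused = W + scale * (up @ down), mirroring LoRA weight fusion
    delta = matmul(up, down)
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]

# Toy base weight and rank-1 adapter factors (illustrative values)
W = [[1.0, 0.0], [0.0, 1.0]]   # 2x2 base weight
up = [[0.5], [0.25]]           # 2x1 "up" projection (B)
down = [[1.0, 2.0]]            # 1x2 "down" projection (A)

fused = fuse(W, down, up, scale=1.0)
# scale=0.0 leaves the base weight untouched
unchanged = fuse(W, down, up, scale=0.0)
```

After fusion the adapter matrices are no longer needed at inference time, which is why fused pipelines run faster than ones applying the adapter on the fly.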
Usage
Use after LoRA fine-tuning is complete to generate videos with the adapted model. This implementation is used in both the tools/load_cogvideox_lora.py utility script and the inference/cli_demo.py CLI tool.
Code Reference
Source Location
tools/load_cogvideox_lora.py:L86-132 -- LoRA loading utility
inference/cli_demo.py:L128-132 -- CLI inference with LoRA support
Signature
Loading LoRA weights:
pipe.load_lora_weights(
lora_path,
weight_name="pytorch_lora_weights.safetensors",
adapter_name="test_1",
)
Fusing LoRA weights:
pipe.fuse_lora(
components=["transformer"],
lora_scale=1.0,
)
Multi-adapter composition:
pipe.set_adapters(
adapter_names=["adapter_1", "adapter_2"],
adapter_weights=[1.0, 0.5],
)
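Semantically, set_adapters weights each adapter's contribution to a layer's output: the combined delta is the per-adapter weight times that adapter's low-rank update, summed over adapters. A toy pure-Python sketch of this composition (illustrative numbers, not the diffusers/peft internals):

```python
def lora_delta(x, down, up, weight):
    # One adapter's contribution to the layer output: weight * up @ (down @ x)
    mid = [sum(a * xi for a, xi in zip(row, x)) for row in down]  # down @ x
    out = [sum(b * m for b, m in zip(row, mid)) for row in up]    # up @ mid
    return [weight * v for v in out]

def composed_output(x, base_out, adapters):
    # adapters: list of (down, up, weight) triples, as configured by set_adapters
    out = list(base_out)
    for down, up, w in adapters:
        for i, v in enumerate(lora_delta(x, down, up, w)):
            out[i] += v
    return out

x = [1.0, 1.0]
base_out = [0.0, 0.0]
style = ([[1.0, 0.0]], [[1.0], [0.0]], 1.0)    # adapter_weights[0] = 1.0
subject = ([[0.0, 1.0]], [[0.0], [1.0]], 0.5)  # adapter_weights[1] = 0.5
result = composed_output(x, base_out, [style, subject])
```

Setting an adapter's weight to 0.0 silences it without unloading, which is what makes weighted blends like [1.0, 0.5] a smooth interpolation between adapters.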
Import
from diffusers import CogVideoXPipeline
Key Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| lora_path | str or Path | required | Directory containing the LoRA adapter weights file. |
| weight_name | str | "pytorch_lora_weights.safetensors" | Filename of the LoRA weights within the directory. |
| adapter_name | str | "default" | Name for the adapter (used for multi-adapter management). |
| lora_scale | float | 1.0 | Scaling factor for LoRA fusion (0.0 = no adaptation, 1.0 = full). |
| components | List[str] | ["transformer"] | Pipeline components to apply LoRA fusion to. |
External Dependencies
diffusers -- Pipeline LoRA mixin (load_lora_weights, fuse_lora, set_adapters)
peft -- Low-rank adapter infrastructure
External Documentation
I/O Contract
Inputs
| Input | Format | Description |
|---|---|---|
| Base pretrained model | HuggingFace model ID or local path | The CogVideoX base model pipeline (e.g., "THUDM/CogVideoX-5b"). |
| LoRA adapter weights | .safetensors file | Trained LoRA adapter weights (typically pytorch_lora_weights.safetensors). |
Outputs
| Output | Format | Description |
|---|---|---|
| Adapted pipeline | CogVideoXPipeline | Pipeline with LoRA weights loaded (unfused) or fused, ready for video generation. |
| Generated videos | Tensor or exported .mp4 files | Videos produced by the adapted pipeline via the __call__ method. |
Usage Examples
Basic LoRA Loading and Inference
import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video
# Load base pipeline
pipe = CogVideoXPipeline.from_pretrained(
"THUDM/CogVideoX-5b",
torch_dtype=torch.bfloat16,
)
pipe.to("cuda")
# Load LoRA adapter
pipe.load_lora_weights(
"/output/lora_run",
weight_name="pytorch_lora_weights.safetensors",
adapter_name="my_adapter",
)
# Fuse for faster inference
pipe.fuse_lora(components=["transformer"], lora_scale=1.0)
# Generate video
video = pipe(
prompt="A cat playing with a ball in a garden",
num_frames=49,
guidance_scale=6.0,
num_inference_steps=50,
).frames[0]
export_to_video(video, "output.mp4", fps=15)
Multi-Adapter Composition
import torch
from diffusers import CogVideoXPipeline
pipe = CogVideoXPipeline.from_pretrained(
"THUDM/CogVideoX-5b",
torch_dtype=torch.bfloat16,
)
pipe.to("cuda")
# Load first adapter (style)
pipe.load_lora_weights(
"/output/style_lora",
weight_name="pytorch_lora_weights.safetensors",
adapter_name="style",
)
# Load second adapter (subject)
pipe.load_lora_weights(
"/output/subject_lora",
weight_name="pytorch_lora_weights.safetensors",
adapter_name="subject",
)
# Combine adapters with different weights
pipe.set_adapters(
adapter_names=["style", "subject"],
adapter_weights=[1.0, 0.8],
)
# Generate with composed adapters
video = pipe(
prompt="A golden retriever running on the beach, watercolor style",
num_frames=49,
guidance_scale=6.0,
num_inference_steps=50,
).frames[0]
Unfused Dynamic Adapter Switching
# Without fusing, adapters can be dynamically switched
pipe.load_lora_weights(lora_path_1, adapter_name="adapter_1")
pipe.load_lora_weights(lora_path_2, adapter_name="adapter_2")
# Use adapter_1
pipe.set_adapters(["adapter_1"], [1.0])
video_1 = pipe(prompt="...").frames[0]
# Switch to adapter_2
pipe.set_adapters(["adapter_2"], [1.0])
video_2 = pipe(prompt="...").frames[0]
# Use both
pipe.set_adapters(["adapter_1", "adapter_2"], [0.7, 0.3])
video_combined = pipe(prompt="...").frames[0]
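To reverse these operations, diffusers also provides unfuse_lora and unload_lora_weights. A brief sketch, assuming pipe is the pipeline from the examples above:

```python
# Restore the original base weights after an earlier fuse_lora call
pipe.unfuse_lora()

# Detach all loaded LoRA adapters from the pipeline entirely
pipe.unload_lora_weights()
```

Unfusing is only meaningful after fuse_lora; in the unfused switching workflow above, unload_lora_weights alone returns the pipeline to its base-model behavior.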