
Implementation:Zai org CogVideo Load Lora Weights Fuse

From Leeroopedia


Implementation Metadata
Name Load_Lora_Weights_Fuse
Type Wrapper Doc
Category Inference
Domains Video_Generation, Fine_Tuning, Diffusion_Models
Knowledge Sources CogVideo Repository, LoRA Paper, Diffusers LoRA Loading Guide
Last Updated 2026-02-10 00:00 GMT

Overview

Load_Lora_Weights_Fuse is a concrete tool for loading and fusing LoRA weights into CogVideoX pipelines for inference, provided by the diffusers library.

Description

This implementation wraps the Diffusers pipeline's built-in LoRA loading and fusion capabilities for use with CogVideoX models. The load_lora_weights method loads trained LoRA adapter weights from a .safetensors file and attaches them to the pipeline's transformer component. The fuse_lora method merges the adapter weights into the base model weights for faster inference. Multiple adapters can be managed using named adapters and the set_adapters method.
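The effect of fusion can be illustrated numerically. The sketch below is a simplified model of the idea, not the diffusers internals (diffusers also applies an alpha/rank scaling learned at training time, omitted here): fusing folds the low-rank update lora_scale * (B @ A) into the base weight once, so inference needs a single matmul, and unfusing subtracts it back out.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # feature dimension and LoRA rank (illustrative values)

W = rng.normal(size=(d, d))  # base model weight
A = rng.normal(size=(r, d))  # LoRA down-projection
B = rng.normal(size=(d, r))  # LoRA up-projection
lora_scale = 1.0

x = rng.normal(size=(d,))

# Unfused: base path plus the scaled low-rank adapter path
y_unfused = W @ x + lora_scale * (B @ (A @ x))

# Fused: the update is merged into the weight, leaving a single matmul
W_fused = W + lora_scale * (B @ A)
y_fused = W_fused @ x

assert np.allclose(y_unfused, y_fused)

# Unfusing restores the original base weight exactly
W_restored = W_fused - lora_scale * (B @ A)
assert np.allclose(W_restored, W)
```

This is why fusion trades flexibility for speed: once merged, the adapter can no longer be rescaled or switched without first reverting the merge.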

Usage

Use after LoRA fine-tuning is complete to generate videos with the adapted model. This implementation is used in both the tools/load_cogvideox_lora.py utility script and the inference/cli_demo.py CLI tool.

Code Reference

Source Location

  • tools/load_cogvideox_lora.py:L86-132 -- LoRA loading utility
  • inference/cli_demo.py:L128-132 -- CLI inference with LoRA support

Signature

Loading LoRA weights:

pipe.load_lora_weights(
    lora_path,
    weight_name="pytorch_lora_weights.safetensors",
    adapter_name="test_1",
)

Fusing LoRA weights:

pipe.fuse_lora(
    components=["transformer"],
    lora_scale=1.0,
)

Multi-adapter composition:

pipe.set_adapters(
    adapter_names=["adapter_1", "adapter_2"],
    adapter_weights=[1.0, 0.5],
)

Import

from diffusers import CogVideoXPipeline

Key Parameters

  • lora_path (str or Path, required) -- Directory containing the LoRA adapter weights file.
  • weight_name (str, default "pytorch_lora_weights.safetensors") -- Filename of the LoRA weights within the directory.
  • adapter_name (str, default "default") -- Name for the adapter (used for multi-adapter management).
  • lora_scale (float, default 1.0) -- Scaling factor for LoRA fusion (0.0 = no adaptation, 1.0 = full strength).
  • components (List[str], default ["transformer"]) -- Pipeline components to apply LoRA fusion to.

External Dependencies

  • diffusers -- Pipeline LoRA mixin (load_lora_weights, fuse_lora, set_adapters)
  • peft -- Low-rank adapter infrastructure

I/O Contract

Inputs

  • Base pretrained model (HuggingFace model ID or local path) -- The CogVideoX base model pipeline (e.g., "THUDM/CogVideoX-5b").
  • LoRA adapter weights (.safetensors file) -- Trained LoRA adapter weights (typically pytorch_lora_weights.safetensors).

Outputs

  • Adapted pipeline (CogVideoXPipeline) -- Pipeline with LoRA weights loaded (unfused) or fused, ready for video generation.
  • Generated videos (Tensor or exported .mp4 files) -- Videos produced by the adapted pipeline via the __call__ method.

Usage Examples

Basic LoRA Loading and Inference

import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

# Load base pipeline
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Load LoRA adapter
pipe.load_lora_weights(
    "/output/lora_run",
    weight_name="pytorch_lora_weights.safetensors",
    adapter_name="my_adapter",
)

# Fuse for faster inference
pipe.fuse_lora(components=["transformer"], lora_scale=1.0)

# Generate video
video = pipe(
    prompt="A cat playing with a ball in a garden",
    num_frames=49,
    guidance_scale=6.0,
    num_inference_steps=50,
).frames[0]

export_to_video(video, "output.mp4", fps=15)

Multi-Adapter Composition

import torch
from diffusers import CogVideoXPipeline

pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b",
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Load first adapter (style)
pipe.load_lora_weights(
    "/output/style_lora",
    weight_name="pytorch_lora_weights.safetensors",
    adapter_name="style",
)

# Load second adapter (subject)
pipe.load_lora_weights(
    "/output/subject_lora",
    weight_name="pytorch_lora_weights.safetensors",
    adapter_name="subject",
)

# Combine adapters with different weights
pipe.set_adapters(
    adapter_names=["style", "subject"],
    adapter_weights=[1.0, 0.8],
)

# Generate with composed adapters
video = pipe(
    prompt="A golden retriever running on the beach, watercolor style",
    num_frames=49,
    guidance_scale=6.0,
    num_inference_steps=50,
).frames[0]

Unfused Dynamic Adapter Switching

# Without fusing, adapters can be dynamically switched
pipe.load_lora_weights(lora_path_1, adapter_name="adapter_1")
pipe.load_lora_weights(lora_path_2, adapter_name="adapter_2")

# Use adapter_1
pipe.set_adapters(["adapter_1"], [1.0])
video_1 = pipe(prompt="...").frames[0]

# Switch to adapter_2
pipe.set_adapters(["adapter_2"], [1.0])
video_2 = pipe(prompt="...").frames[0]

# Use both
pipe.set_adapters(["adapter_1", "adapter_2"], [0.7, 0.3])
video_combined = pipe(prompt="...").frames[0]
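Conceptually, set_adapters with per-adapter weights composes one effective update as a weighted sum of each adapter's low-rank delta. The sketch below is illustrative only (not the peft internals) and mirrors the [0.7, 0.3] blend above:

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 8, 2  # feature dimension and LoRA rank (illustrative values)

W = rng.normal(size=(d, d))  # base model weight
# Two independently trained adapters (hypothetical weights)
A1, B1 = rng.normal(size=(r, d)), rng.normal(size=(d, r))
A2, B2 = rng.normal(size=(r, d)), rng.normal(size=(d, r))

w1, w2 = 0.7, 0.3  # adapter_weights passed to set_adapters

x = rng.normal(size=(d,))

# Each active adapter contributes its own scaled low-rank path
y = W @ x + w1 * (B1 @ (A1 @ x)) + w2 * (B2 @ (A2 @ x))

# Equivalent single effective delta applied to the base weight
delta = w1 * (B1 @ A1) + w2 * (B2 @ A2)
assert np.allclose(y, (W + delta) @ x)
```

Because the composition is linear in the adapter weights, changing adapter_weights between calls reweights the blend without reloading anything, which is what makes the unfused switching pattern above cheap.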
