Implementation:Zai org CogVideo I2V Pipeline Configuration

Metadata

Field	Value
Page Type	Implementation (Wrapper Doc)
Knowledge Sources	Repo (CogVideo), Paper (CogVideoX)
Domains	Video_Generation, Diffusion_Models, Image_Conditioning
Last Updated	2026-02-10 00:00 GMT

Overview

Concrete steps for configuring the scheduler and memory settings of the CogVideoX image-to-video (I2V) pipeline from the diffusers library.

Description

After loading the I2V pipeline via from_pretrained, three configuration steps are applied before inference:

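Scheduler replacement: The default scheduler is replaced with CogVideoXDPMScheduler, rebuilt from the existing scheduler config with timestep_spacing="trailing". This is the scheduler the CogVideo repository recommends for the CogVideoX-5B model family.
Sequential CPU offloading: Enabled via pipe.enable_sequential_cpu_offload(), which keeps model weights on the CPU and moves each submodule to the GPU only while it runs. This minimizes peak GPU memory at the cost of slower inference.
VAE memory optimization: Both pipe.vae.enable_slicing() and pipe.vae.enable_tiling() are called to reduce peak memory during VAE encoding and decoding. Slicing processes the input in smaller slices rather than all at once; tiling splits each frame into spatial tiles that are processed separately.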

These configuration steps are identical to the T2V pipeline configuration and are applied to the same pipe object after loading.
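
The three steps above can be bundled into one helper. The sketch below is illustrative and is not code from the repository: the function name configure_i2v_pipeline and the low_vram flag are hypothetical, and the scheduler class is passed in as an argument (expected to be CogVideoXDPMScheduler) so that the helper itself has no import-time dependency on diffusers.

```python
def configure_i2v_pipeline(pipe, scheduler_cls, low_vram=True):
    """Apply scheduler and memory settings to an already loaded pipeline.

    `pipe` is expected to be a CogVideoXImageToVideoPipeline and
    `scheduler_cls` is expected to be CogVideoXDPMScheduler. Both names in
    this signature are illustrative, not part of the CogVideo repository.
    """
    # Step 1: rebuild the scheduler from the current config,
    # overriding only the timestep spacing.
    pipe.scheduler = scheduler_cls.from_config(
        pipe.scheduler.config, timestep_spacing="trailing"
    )

    # Step 2: choose where the model lives. Offloading saves GPU memory
    # but is slower; direct placement needs enough VRAM for the full model.
    if low_vram:
        pipe.enable_sequential_cpu_offload()
    else:
        pipe.to("cuda")

    # Step 3: reduce peak memory inside the VAE.
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()

    # The pipeline is modified in place; returning it allows call chaining.
    return pipe
```

Assuming diffusers is installed, a call would look like configure_i2v_pipeline(pipe, CogVideoXDPMScheduler). Passing low_vram=False selects direct GPU placement instead of sequential offloading; the two are alternatives and are not combined.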

Usage

Apply immediately after loading the I2V pipeline and before calling the pipeline for video generation. The configuration modifies the pipeline in-place.
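
To show where the configured pipeline fits in the overall flow, here is a minimal sketch of the generation call that follows configuration. The wrapper name generate_i2v_frames is hypothetical, and the argument values (49 frames, 50 inference steps, guidance scale 6.0) are illustrative settings rather than values prescribed by this page.

```python
def generate_i2v_frames(pipe, image, prompt, generator=None):
    """Run an already configured I2V pipeline once.

    Returns the frames of the first generated video. `generator` may be a
    torch.Generator for reproducible sampling, or None.
    """
    result = pipe(
        image=image,              # conditioning image (e.g. a PIL image)
        prompt=prompt,            # text description of the desired motion
        num_frames=49,            # illustrative value
        num_inference_steps=50,   # illustrative value
        guidance_scale=6.0,       # illustrative value
        generator=generator,
    )
    # The pipeline output holds one frame list per generated video.
    return result.frames[0]
```

With diffusers installed, the image can be loaded with diffusers.utils.load_image and the returned frames written to disk with diffusers.utils.export_to_video.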

Code Reference

Source Location

inference/cli_demo.py, lines 140-152.

Signature

# Scheduler configuration
pipe.scheduler = CogVideoXDPMScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

# Memory optimization
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

Import

from diffusers import CogVideoXDPMScheduler

I/O Contract

Inputs

Parameter	Type	Required	Description
`pipe`	`CogVideoXImageToVideoPipeline`	Yes	A loaded I2V pipeline instance from `from_pretrained`. Modified in-place.
`timestep_spacing`	`str`	Yes	Timestep spacing strategy, passed to `CogVideoXDPMScheduler.from_config` as an override of the loaded scheduler config. This configuration sets it to `"trailing"`.
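
To make the effect of timestep_spacing more concrete, the following pure-Python sketch compares "trailing" with "leading" spacing. It follows the spacing formulas as commonly implemented in diffusers schedulers, but it is a simplified illustration and not the library code itself; the function names and the steps_offset default of 0 are assumptions.

```python
def trailing_timesteps(num_train_timesteps, num_inference_steps):
    """Timesteps counted back from the end of the training schedule."""
    step_ratio = num_train_timesteps / num_inference_steps
    return [
        round(num_train_timesteps - i * step_ratio) - 1
        for i in range(num_inference_steps)
    ]


def leading_timesteps(num_train_timesteps, num_inference_steps, steps_offset=0):
    """Timesteps counted up from zero, then reversed for sampling."""
    step_ratio = num_train_timesteps // num_inference_steps
    return [
        i * step_ratio + steps_offset
        for i in reversed(range(num_inference_steps))
    ]


print(trailing_timesteps(1000, 4))  # [999, 749, 499, 249]
print(leading_timesteps(1000, 4))   # [750, 500, 250, 0]
```

In this sketch, trailing spacing begins sampling at the last training timestep (999 of 1000), whereas leading spacing begins lower (750) and never visits the noisiest step.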

Outputs

Output	Type	Description
Configured pipeline	`CogVideoXImageToVideoPipeline`	The same pipeline instance with scheduler replaced and memory optimizations enabled. Ready for video generation.

Usage Examples

Standard I2V Pipeline Configuration

import torch
from diffusers import CogVideoXImageToVideoPipeline, CogVideoXDPMScheduler

# Load pipeline
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,
)

# Configure scheduler
pipe.scheduler = CogVideoXDPMScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

# Enable memory optimizations
pipe.enable_sequential_cpu_offload()
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()

Multi-GPU or High-VRAM Alternative

import torch
from diffusers import CogVideoXImageToVideoPipeline, CogVideoXDPMScheduler

pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,
)

# Configure scheduler
pipe.scheduler = CogVideoXDPMScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

# For high-VRAM GPUs, use direct GPU placement instead of CPU offloading
pipe.to("cuda")
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
