Implementation:Zai org CogVideo Load Image

Metadata

Field | Value
Page Type | Implementation (Wrapper Doc)
Knowledge Sources | Repo (CogVideo), Paper (CogVideoX)
Domains | Video_Generation, Diffusion_Models, Image_Conditioning
Last Updated | 2026-02-10 00:00 GMT

Overview

Concrete tool for loading conditioning images for the CogVideoX I2V (image-to-video) pipeline, provided by the diffusers.utils module of the diffusers library.

Description

load_image is a utility function from the diffusers library that loads an image from a local file path or a remote URL and returns it as a PIL Image object. It relies on Pillow for format detection and decoding and converts the result to RGB by default, so the I2V pipeline receives a standardized image input.

Because decoding is delegated to Pillow, the function supports the common image formats, including JPEG, PNG, BMP, and WebP. Given a URL, it downloads the image content and decodes it in memory; given a local file path, it reads the file directly from disk.
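
The URL-versus-path distinction described above can be sketched with the standard library alone. This is a minimal illustration, not the diffusers source: classify_image_source is a hypothetical helper, and the assumption that remote sources are recognized by an http:// or https:// prefix should be checked against the installed diffusers version.

```python
import os


def classify_image_source(source: str) -> str:
    """Return "url", "file", or "invalid" for an image source string.

    Hypothetical helper: assumes remote sources are recognized by an
    http:// or https:// prefix and that anything else must be an
    existing local file.
    """
    if source.startswith(("http://", "https://")):
        return "url"
    if os.path.isfile(source):
        return "file"
    return "invalid"
```

Running a check like this before calling load_image lets a script report a bad path early, before any pipeline weights are loaded.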

Usage

Import load_image from diffusers.utils and call it with a file path or URL string. The returned PIL Image is passed directly to the I2V pipeline's image parameter.

Code Reference

Source Location

inference/cli_demo.py, line 120.

Signature

image = load_image(
    image,   # str: file path or URL to the conditioning image (a PIL.Image.Image is also accepted)
)
# Returns: PIL.Image.Image

Import

from diffusers.utils import load_image

I/O Contract

Inputs

Parameter | Type | Required | Description
image | str | Yes | A local file path or remote URL pointing to the conditioning image. Supported formats include JPEG, PNG, BMP, and WebP.

Outputs

Output | Type | Description
Image object | PIL.Image.Image | A PIL Image object in RGB mode, ready to be passed as the image argument to the I2V pipeline.
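
The output contract above can be checked with a small guard before the image reaches the pipeline. The following is a sketch under stated assumptions: check_conditioning_image is a hypothetical helper, not part of diffusers, and it relies only on the mode and size attributes that PIL.Image.Image exposes, so it does not import Pillow itself.

```python
def check_conditioning_image(image) -> tuple:
    """Verify that an image matches the documented output contract.

    Hypothetical helper: reads only the `mode` and `size` attributes
    of the object it is given and returns (width, height) on success.
    """
    if image.mode != "RGB":
        raise ValueError(f"expected an RGB image, got mode {image.mode!r}")
    width, height = image.size
    if width <= 0 or height <= 0:
        raise ValueError(f"image has invalid size {image.size!r}")
    return width, height
```

In a script, this would be called as check_conditioning_image(load_image(path)) so that an unexpected mode or an empty image fails loudly before inference starts.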

Usage Examples

Loading an Image from a Local File

from diffusers.utils import load_image

image = load_image("/path/to/reference_image.png")

Loading an Image from a URL

from diffusers.utils import load_image

image = load_image("https://example.com/reference_image.jpg")

Full I2V Workflow with Image Loading

import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import load_image

# Load the pipeline
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,
)

# Load the conditioning image
image = load_image("/path/to/first_frame.png")

# Pass to pipeline (after configuration)
# output = pipe(prompt="...", image=image, ...)
