Implementation:Zai org CogVideo Load Image

Metadata

Field	Value
Page Type	Implementation (Wrapper Doc)
Knowledge Sources	Repo (CogVideo), Paper (CogVideoX)
Domains	Video_Generation, Diffusion_Models, Image_Conditioning
Last Updated	2026-02-10 00:00 GMT

Overview

Concrete tool, provided by the diffusers utility module, for loading conditioning images for the CogVideoX image-to-video (I2V) pipeline.

Description

load_image is a utility function from the diffusers library that loads an image from a local file path or a remote URL and returns it as a PIL Image object. Format detection and decoding are handled automatically, and the result is converted to RGB, which gives the I2V pipeline a standardized image input.

The function supports common image formats including JPEG, PNG, BMP, and WebP. When provided with a URL, it downloads the image content and decodes it in memory. For local file paths, it reads directly from disk.
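The URL-versus-path behavior described above can be sketched with the standard library alone. The helper name read_image_bytes, the 30-second timeout, and the error message are assumptions made for illustration. This is not the diffusers source, which may differ in detail.

```python
import os
import urllib.request


def read_image_bytes(source: str) -> bytes:
    """Return the raw encoded bytes of an image given a URL or a local path.

    Hypothetical helper for illustration only; it is not part of diffusers.
    """
    if source.startswith(("http://", "https://")):
        # Remote image: download the whole payload into memory.
        with urllib.request.urlopen(source, timeout=30) as response:
            return response.read()
    if os.path.isfile(source):
        # Local image: read the file straight from disk.
        with open(source, "rb") as handle:
            return handle.read()
    # Anything else is rejected rather than guessed at.
    raise ValueError(f"{source!r} is neither an http(s) URL nor an existing file")
```

The returned bytes still have to be decoded, for example with PIL.Image.open(io.BytesIO(data)). load_image performs the loading and decoding in a single call.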

Usage

Import load_image from diffusers.utils and call it with a file path or URL string. The returned PIL Image is passed directly to the I2V pipeline's image parameter.

Code Reference

Source Location

inference/cli_demo.py, line 120 (the call site in the CogVideo repository; load_image itself is defined in the diffusers library).

Signature

image = load_image(
    image,   # str: file path or URL to the conditioning image
)
# Returns: PIL.Image.Image

Import

from diffusers.utils import load_image

I/O Contract

Inputs

Parameter	Type	Required	Description
`image`	str	Yes	A local file path or remote URL pointing to the conditioning image. Supported formats include JPEG, PNG, BMP, and WebP.

Outputs

Output	Type	Description
Image object	`PIL.Image.Image`	A PIL Image object in RGB mode, ready to be passed as the `image` argument to the I2V pipeline.
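The RGB guarantee in the output contract can be illustrated with Pillow directly. This is a minimal sketch of the kind of normalization that yields an upright RGB image. The helper name to_pipeline_image and the test image are hypothetical, and the exact steps inside load_image are assumed rather than quoted.

```python
from PIL import Image, ImageOps


def to_pipeline_image(image: Image.Image) -> Image.Image:
    """Normalize a PIL image to upright RGB (hypothetical helper)."""
    # Apply any EXIF orientation tag so the pixels are stored upright.
    image = ImageOps.exif_transpose(image)
    # Drop the alpha channel, or expand palette/grayscale data, to three channels.
    return image.convert("RGB")


# An RGBA image, as a PNG with transparency would typically decode to.
rgba = Image.new("RGBA", (64, 48), (255, 0, 0, 128))
rgb = to_pipeline_image(rgba)
print(rgb.mode, rgb.size)  # RGB (64, 48)
```

Checking image.mode and image.size on the object returned by load_image is a quick way to confirm that the conditioning image has the expected channel layout before it is handed to the pipeline.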

Usage Examples

Loading an Image from a Local File

from diffusers.utils import load_image

image = load_image("/path/to/reference_image.png")

Loading an Image from a URL

from diffusers.utils import load_image

image = load_image("https://example.com/reference_image.jpg")

Full I2V Workflow with Image Loading

import torch
from diffusers import CogVideoXImageToVideoPipeline
from diffusers.utils import load_image

# Load the pipeline
pipe = CogVideoXImageToVideoPipeline.from_pretrained(
    "THUDM/CogVideoX-5b-I2V",
    torch_dtype=torch.bfloat16,
)

# Load the conditioning image
image = load_image("/path/to/first_frame.png")

# Pass to pipeline (after configuration)
# output = pipe(prompt="...", image=image, ...)
