Principle:InternLM Lmdeploy Image Loading

Knowledge Sources	VLM Pipeline LMDeploy
Domains	Vision_Language_Models, Image_Processing
Last Updated	2026-02-07 15:00 GMT

Overview

A multi-source image loading mechanism that normalizes images from URLs, local files, base64 data URIs, and PIL objects into a uniform format for VLM processing.

Description

Image Loading provides a unified interface for loading images from heterogeneous sources:

HTTP/HTTPS URLs: Downloads images from the web
Local file paths: Reads images from disk
Base64 data URIs: Decodes inline base64-encoded images
PIL Image objects: Passes through directly

All images are converted to RGB mode PIL Images, providing a consistent format for downstream vision encoding regardless of source.

Usage

Use this when preparing image inputs for VLM inference. The function is called before passing images to the Pipeline with prompt-image tuples or OpenAI-format messages.

Theoretical Basis

Image loading implements a Union Type Resolution pattern:

# Abstract image loading
def load_image(source):
    if isinstance(source, PIL.Image):
        return source.convert('RGB')
    elif source.startswith('http'):
        return download_and_open(source).convert('RGB')
    elif source.startswith('data:image'):
        return decode_base64(source).convert('RGB')
    else:
        return open_file(source).convert('RGB')

Related Pages

Implemented By

Implementation:InternLM_Lmdeploy_Load_Image

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment