Principle:InternLM Lmdeploy Image Loading
| Knowledge Sources | |
|---|---|
| Domains | Vision_Language_Models, Image_Processing |
| Last Updated | 2026-02-07 15:00 GMT |
Overview
A multi-source image loading mechanism that normalizes images from URLs, local files, base64 data URIs, and PIL objects into a uniform format for VLM processing.
Description
Image Loading provides a unified interface for loading images from heterogeneous sources:
- HTTP/HTTPS URLs: Downloads images from the web
- Local file paths: Reads images from disk
- Base64 data URIs: Decodes inline base64-encoded images
- PIL Image objects: Passes through directly
All images are converted to RGB mode PIL Images, providing a consistent format for downstream vision encoding regardless of source.
Usage
Use this when preparing image inputs for VLM inference. The function is called before passing images to the Pipeline with prompt-image tuples or OpenAI-format messages.
Theoretical Basis
Image loading implements a Union Type Resolution pattern:
# Abstract image loading
def load_image(source):
if isinstance(source, PIL.Image):
return source.convert('RGB')
elif source.startswith('http'):
return download_and_open(source).convert('RGB')
elif source.startswith('data:image'):
return decode_base64(source).convert('RGB')
else:
return open_file(source).convert('RGB')