Implementation:Open compass VLMEvalKit Encode Image To Base64

Field	Value
source	VLMEvalKit
domain	Vision, Data_Processing

Overview

Concrete tool for encoding PIL images to base64 strings with size constraints for TSV storage provided by VLMEvalKit.

Description

encode_image_to_base64() in vlmeval/smp/vlm.py (L99-139) converts a PIL Image to a base64 string. Handles RGBA/P/LA mode conversion to RGB.

Supports:

target_size — for thumbnailing
VLMEVAL_MAX_IMAGE_SIZE env var — for max encoded size (progressive 0.7x downscaling until under limit)
VLMEVAL_MIN_IMAGE_EDGE env var — for minimum edge length (upscaling)

Also provides decode_base64_to_image() (L147-154) for the reverse operation and encode_image_file_to_base64() (L142-144) for file-based encoding.

Usage

Use when preparing benchmark TSV files or when API wrappers need to send images as base64.

Code Reference

Source: vlmeval/smp/vlm.py, Lines: L99-139 (encode), L147-154 (decode)

Signature:

def encode_image_to_base64(
    img: PIL.Image.Image,
    target_size: int = -1,   # Max thumbnail dimension (-1 = no resize)
    fmt: str = 'JPEG'        # Image format
) -> str:
    """
    Encodes PIL Image to base64 string.
    Respects VLMEVAL_MAX_IMAGE_SIZE and VLMEVAL_MIN_IMAGE_EDGE env vars.
    """

def decode_base64_to_image(
    base64_string: str,
    target_size: int = -1
) -> PIL.Image.Image:
    """Decodes base64 string back to PIL Image."""

Import:

from vlmeval.smp import encode_image_to_base64, decode_base64_to_image

I/O Contract

Direction	Name	Type	Description
Input	img	PIL.Image.Image	The image to encode
Input	target_size	int	Max thumbnail dimension (-1 = no resize)
Input	fmt	str	Image format (default: 'JPEG')
Output	(return)	str	Base64 encoded string

Usage Examples

from PIL import Image
from vlmeval.smp import encode_image_to_base64, decode_base64_to_image

# Encode an image
img = Image.open("photo.jpg")
b64_str = encode_image_to_base64(img, target_size=512)

# Decode it back
decoded_img = decode_base64_to_image(b64_str)
decoded_img.save("decoded.jpg")

# For TSV preparation
import pandas as pd
data = []
for img_path, question, answer in benchmark_items:
    img = Image.open(img_path)
    b64 = encode_image_to_base64(img)
    data.append({"index": len(data), "image": b64, "question": question, "answer": answer})
df = pd.DataFrame(data)
df.to_csv("my_benchmark.tsv", sep="\t", index=False)

Related Pages

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment