Implementation:FlagOpen FlagEmbedding BGE VL Eval CIRCO

Knowledge Sources	FlagOpen_FlagEmbedding
Domains	Vision-Language Retrieval, Composed Image Retrieval, Evaluation
Last Updated	2026-02-09 00:00 GMT

Overview

Evaluation script for the BGE-VL model on the CIRCO (Composed Image Retrieval on Common Objects) benchmark.

Description

This module implements the evaluation pipeline for the BGE-VL vision-language embedding model on the CIRCO dataset, which tests composed image retrieval. The pipeline encodes the image corpus with the vision encoder, encodes each composed query (reference image plus text) into a multimodal embedding, builds a FAISS index over the corpus embeddings for similarity search, retrieves the top-k most similar images per query, and formats the results for the CIRCO evaluation metrics. The script supports GPU-accelerated FAISS indexing, optional caching of embeddings to disk, and batched encoding of large image collections.
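The retrieval step can be illustrated without FAISS. The sketch below is a NumPy stand-in, not the script's code: it computes exact inner-product top-k search, which is the result a "Flat" index returns when it is configured for inner-product similarity. The metric, the function name `top_k_inner_product`, and the toy embeddings are assumptions made for illustration.

```python
import numpy as np

def top_k_inner_product(query_emb, corpus_emb, k):
    """Return (scores, indices) of the k highest inner products per query."""
    # Similarity matrix of shape (num_queries, num_corpus).
    scores = query_emb @ corpus_emb.T
    k = min(k, corpus_emb.shape[0])
    # Sort each row by descending score and keep the first k columns.
    order = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    top_scores = np.take_along_axis(scores, order, axis=1)
    return top_scores, order

# Toy data (hypothetical): 4 corpus "images" and 2 composed queries in 3 dimensions.
corpus_emb = np.array(
    [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [0.6, 0.8, 0.0]],
    dtype=np.float32,
)
query_emb = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], dtype=np.float32)

top_scores, top_indices = top_k_inner_product(query_emb, corpus_emb, k=2)
```

Each row of `top_indices` lists corpus positions in descending order of similarity; a FAISS index replaces the dense matrix product when the corpus is too large to score exhaustively in memory.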

Usage

Use this script to evaluate BGE-VL on composed image retrieval tasks, to benchmark vision-language models on the CIRCO dataset, or to test multimodal query understanding (image modification described in text). In the CIRCO task, the model must find target images given a reference image and a text description of the desired changes.

Code Reference

Source Location

Repository: FlagOpen_FlagEmbedding
File: research/BGE_VL/eval/eval_Circo.py
Lines: 1-225

Signature

def index(
    model: Flag_mmret,
    corpus: datasets.Dataset,
    batch_size: int = 256,
    max_length: int=512,
    index_factory: str = "Flat",
    save_path: str = None,
    save_embedding: bool = False,
    load_embedding: bool = False
):
    """Index corpus images into FAISS index"""

def search(
    model: Flag_mmret,
    queries: datasets,
    faiss_index: faiss.Index,
    k:int = 100,
    batch_size: int = 256,
    max_length: int=512
):
    """Search for similar images using multimodal queries"""

def main():
    """Run CIRCO evaluation pipeline"""

Import

# Run as script with arguments
# python eval_Circo.py --model_name BAAI/BGE-VL-large --result_save_path results.json

I/O Contract

Inputs

Name	Type	Required	Description
model_name	str	Yes	BGE-VL model name or path
result_save_path	str	Yes	Path to save retrieval results JSON
image_dir	str	Yes	Directory containing COCO images
batch_size	int	No	Batch size for encoding (default: 256)
max_query_length	int	No	Max query text length (default: 64)
max_passage_length	int	No	Max passage length (default: 77)
k	int	No	Number of neighbors to retrieve (default: 100)
save_embedding	bool	No	Save embeddings to disk (default: False)
load_embedding	bool	No	Load embeddings from disk (default: False)
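
The save_embedding and load_embedding options imply that corpus embeddings can be written once and reused across runs. The following is a minimal sketch of that caching pattern using numpy.memmap. The float32 dtype, the helper names `save_embeddings` and `load_embeddings`, and the need to pass the embedding dimension when loading are assumptions for illustration, not details confirmed from eval_Circo.py.

```python
import os
import tempfile

import numpy as np

def save_embeddings(path, embeddings):
    """Write a float32 (num_items, dim) array to a raw memory-mapped file."""
    mm = np.memmap(path, dtype=np.float32, mode="w+", shape=embeddings.shape)
    mm[:] = embeddings
    mm.flush()  # make sure the data reaches the file before we drop the map
    del mm

def load_embeddings(path, dim):
    """Map the file read-only; the row count is inferred from the file size."""
    flat = np.memmap(path, dtype=np.float32, mode="r")
    return flat.reshape(-1, dim)

# Round trip with a small stand-in for encoder output (hypothetical data).
cache_path = os.path.join(tempfile.mkdtemp(), "embeddings.memmap")
original = np.arange(12, dtype=np.float32).reshape(4, 3)
save_embeddings(cache_path, original)
restored = load_embeddings(cache_path, dim=3)
```

A raw memmap file stores no shape or dtype metadata, so the loader has to know both; that is why `dim` is passed explicitly here.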

Outputs

Name	Type	Description
results	dict	JSON file mapping query IDs to top-50 retrieved image IDs
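
As a hedged illustration of the output format, the sketch below turns a matrix of retrieved corpus positions into the query-ID-to-image-ID mapping described above and serializes it as JSON. The function name `build_results`, the sample IDs, and the `top_n` parameter are hypothetical; the skip of negative positions reflects the fact that FAISS returns -1 when fewer than k neighbors are available.

```python
import json

def build_results(query_ids, corpus_ids, top_indices, top_n=50):
    """Map each query ID to the IDs of its top_n retrieved corpus images."""
    results = {}
    for qid, row in zip(query_ids, top_indices):
        # Keep at most top_n positions and drop FAISS's -1 "no neighbor" marker.
        kept = [i for i in row[:top_n] if i >= 0]
        results[str(qid)] = [corpus_ids[i] for i in kept]
    return results

# Hypothetical IDs and retrieval output, for illustration only.
query_ids = [101, 102]
corpus_ids = [11, 22, 33]
top_indices = [[2, 0, 1], [1, -1, 0]]

results = build_results(query_ids, corpus_ids, top_indices, top_n=2)
serialized = json.dumps(results)
```

Query IDs are converted to strings because JSON object keys are always strings; doing it explicitly keeps the in-memory dictionary identical to what is read back from the file.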

Usage Examples

# Example 1: Basic CIRCO evaluation
# python eval_Circo.py \
#   --model_name BAAI/BGE-VL-large \
#   --result_save_path ./eval/mmret_large_circo.json \
#   --image_dir /path/to/coco_images \
#   --batch_size 256 \
#   --k 100

# Example 2: With embedding caching
# python eval_Circo.py \
#   --model_name BAAI/BGE-VL-large \
#   --result_save_path ./results.json \
#   --image_dir /path/to/coco_images \
#   --save_embedding \
#   --save_path ./embeddings.memmap

# Later, reload cached embeddings
# python eval_Circo.py \
#   --model_name BAAI/BGE-VL-large \
#   --result_save_path ./results.json \
#   --image_dir /path/to/coco_images \
#   --load_embedding \
#   --save_path ./embeddings.memmap

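# Example 3: Programmatic usage
# main() takes no parameters, so an Args instance built here would never reach it.
# Assumption: main() parses its options from the command line (HfArgumentParser),
# in which case setting sys.argv before the call passes them through.
import sys
from eval_Circo import main

sys.argv = [
    "eval_Circo.py",
    "--model_name", "BAAI/BGE-VL-large",
    "--result_save_path", "./circo_results.json",
    "--image_dir", "/path/to/coco",
    "--batch_size", "128",
    "--k", "50",
]

# Run evaluation (will read from ./eval/data/circo_query.jsonl and circo_corpus.jsonl)
main()

# Results format: {"query_id": ["img1.jpg", "img2.jpg", ...], ...}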
