Implementation:FlagOpen FlagEmbedding BGE VL Eval CIRCO
| Knowledge Sources | |
|---|---|
| Domains | Vision-Language Retrieval, Composed Image Retrieval, Evaluation |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Evaluation script for the BGE-VL model on the CIRCO (Composed Image Retrieval on Common Objects in context) benchmark.
Description
This module implements the evaluation pipeline for the BGE-VL vision-language embedding model on the CIRCO dataset, which tests composed image retrieval. The pipeline encodes the image corpus with the vision encoder, encodes each composed query multimodally (reference image plus text), builds a FAISS index for similarity search, retrieves the top-k most similar images per query, and formats the results for CIRCO evaluation. The script supports GPU-accelerated FAISS indexing, optional caching of embeddings to disk, and batched encoding of large image collections.
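The retrieval step can be illustrated without FAISS or a model. The sketch below is a dependency-free stand-in: it assumes embeddings are already computed and scored by inner product, which is an assumption about the metric and not something this page confirms. The function names and toy vectors are hypothetical.

```python
import heapq
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def inner_product(a: Vector, b: Vector) -> float:
    """Similarity score between two embeddings (assumed metric)."""
    return sum(x * y for x, y in zip(a, b))


def search_top_k(
    queries: List[Vector], corpus: List[Vector], k: int
) -> List[List[Tuple[float, int]]]:
    """Exhaustive search: for each query, return up to k
    (score, corpus_index) pairs ordered from best to worst."""
    results = []
    for query in queries:
        scored = (
            (inner_product(query, doc), idx) for idx, doc in enumerate(corpus)
        )
        results.append(heapq.nlargest(k, scored, key=lambda pair: pair[0]))
    return results


# Toy 2-d embeddings standing in for encoded corpus images.
corpus_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
# One toy embedding standing in for a composed (image + text) query.
query_embeddings = [[0.8, 0.6]]

hits = search_top_k(query_embeddings, corpus_embeddings, k=2)
top_indices = [[idx for _, idx in row] for row in hits]
print(top_indices)
```

An exhaustive scan like this compares every query against every corpus vector, which is the same kind of exact search a non-approximate index performs; a FAISS index does that work in optimized, optionally GPU-backed code.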
Usage
Use this script to evaluate BGE-VL on composed image retrieval, to benchmark vision-language models on the CIRCO dataset, or to test multimodal query understanding (image modification described in text). In the CIRCO task, the model must find target images given a reference image and a text description of the desired changes.
Code Reference
Source Location
- Repository: FlagOpen_FlagEmbedding
- File: research/BGE_VL/eval/eval_Circo.py
- Lines: 1-225
Signature
def index(
    model: Flag_mmret,
    corpus: datasets.Dataset,
    batch_size: int = 256,
    max_length: int = 512,
    index_factory: str = "Flat",
    save_path: str = None,
    save_embedding: bool = False,
    load_embedding: bool = False
):
    """Index corpus images into FAISS index"""

def search(
    model: Flag_mmret,
    queries: datasets,
    faiss_index: faiss.Index,
    k: int = 100,
    batch_size: int = 256,
    max_length: int = 512
):
    """Search for similar images using multimodal queries"""

def main():
    """Run CIRCO evaluation pipeline"""
Import
# Run as script with arguments
# python eval_Circo.py --model_name BAAI/BGE-VL-large --result_save_path results.json
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| model_name | str | Yes | BGE-VL model name or path |
| result_save_path | str | Yes | Path to save retrieval results JSON |
| image_dir | str | Yes | Directory containing COCO images |
| batch_size | int | No | Batch size for encoding (default: 256) |
| max_query_length | int | No | Max query text length (default: 64) |
| max_passage_length | int | No | Max passage length (default: 77) |
| k | int | No | Number of neighbors to retrieve (default: 100) |
| save_embedding | bool | No | Save embeddings to disk (default: False) |
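| load_embedding | bool | No | Load embeddings from disk (default: False) |
| save_path | str | No | Path of the embedding cache file written with save_embedding or read with load_embedding (passed as --save_path in Example 2) |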
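The rows of the table can be written as a dataclass, which makes the required fields and defaults easy to check. This is a minimal sketch: the class name `CircoEvalArgs` is hypothetical, and the script's own `Args` class may define its fields differently.

```python
from dataclasses import dataclass, fields


@dataclass
class CircoEvalArgs:
    """Hypothetical container mirroring the Inputs table."""

    # Required inputs (no default).
    model_name: str
    result_save_path: str
    image_dir: str
    # Optional inputs with the defaults listed in the table.
    batch_size: int = 256
    max_query_length: int = 64
    max_passage_length: int = 77
    k: int = 100
    save_embedding: bool = False
    load_embedding: bool = False


example_args = CircoEvalArgs(
    model_name="BAAI/BGE-VL-large",
    result_save_path="./results.json",
    image_dir="/path/to/coco_images",
)
field_names = [f.name for f in fields(CircoEvalArgs)]
print(example_args.batch_size, example_args.k, field_names[:3])
```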
Outputs
| Name | Type | Description |
|---|---|---|
| results | dict | JSON file mapping query IDs to top-50 retrieved image IDs |
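The output mapping can be sketched as follows. The example assumes the retrieved lists arrive ordered best first and are cut to 50 entries per query before being written; the helper name `format_results` and the sample IDs are hypothetical.

```python
import json
from typing import Dict, List, Sequence


def format_results(
    query_ids: Sequence[str],
    retrieved: Sequence[Sequence[str]],
    top_n: int = 50,
) -> Dict[str, List[str]]:
    """Map each query ID to its first top_n retrieved image IDs."""
    return {
        str(query_id): list(image_ids[:top_n])
        for query_id, image_ids in zip(query_ids, retrieved)
    }


# Toy input: one query with 60 retrieved IDs, as if k had been above 50.
sample_query_ids = ["0"]
sample_retrieved = [[f"img{i}.jpg" for i in range(60)]]

results = format_results(sample_query_ids, sample_retrieved)
serialized = json.dumps(results)
print(len(results["0"]), results["0"][:2])
```

JSON object keys are always strings, so the sketch converts query IDs with `str` before serializing.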
Usage Examples
# Example 1: Basic CIRCO evaluation
# python eval_Circo.py \
# --model_name BAAI/BGE-VL-large \
# --result_save_path ./eval/mmret_large_circo.json \
# --image_dir /path/to/coco_images \
# --batch_size 256 \
# --k 100
# Example 2: With embedding caching
# python eval_Circo.py \
# --model_name BAAI/BGE-VL-large \
# --result_save_path ./results.json \
# --image_dir /path/to/coco_images \
# --save_embedding \
# --save_path ./embeddings.memmap
# Later, reload cached embeddings
# python eval_Circo.py \
# --model_name BAAI/BGE-VL-large \
# --result_save_path ./results.json \
# --image_dir /path/to/coco_images \
# --load_embedding \
# --save_path ./embeddings.memmap
# Example 3: Programmatic usage
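# Assumption: main() takes no parameters (see Signature) and reads its options
# from the command line, so an Args instance built here would never reach it.
# Setting sys.argv before the call is one way to pass the options.
import sys
from eval_Circo import main

sys.argv = [
    "eval_Circo.py",
    "--model_name", "BAAI/BGE-VL-large",
    "--result_save_path", "./circo_results.json",
    "--image_dir", "/path/to/coco",
    "--batch_size", "128",
    "--k", "50",
]
# Run evaluation (will read from ./eval/data/circo_query.jsonl and circo_corpus.jsonl)
main()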
# Results format: {"query_id": ["img1.jpg", "img2.jpg", ...], ...}
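The embedding caching used in Example 2 can be sketched as a raw float32 file that is written once and read back on later runs. This is a dependency-free stand-in built on the standard-library `array` module; the `.memmap` extension suggests the script itself may use a NumPy memory map, but that is an assumption. The helper names are hypothetical.

```python
import os
import tempfile
from array import array
from typing import List, Sequence


def save_embeddings(path: str, embeddings: Sequence[Sequence[float]]) -> None:
    """Write all rows back to back as raw float32 values."""
    flat = array("f", (value for row in embeddings for value in row))
    with open(path, "wb") as handle:
        flat.tofile(handle)


def load_embeddings(path: str, dim: int) -> List[List[float]]:
    """Read raw float32 values and regroup them into rows of length dim.
    The file stores no shape, so the caller must know dim."""
    flat = array("f")
    n_values = os.path.getsize(path) // flat.itemsize
    with open(path, "rb") as handle:
        flat.fromfile(handle, n_values)
    return [list(flat[i:i + dim]) for i in range(0, n_values, dim)]


# Values chosen to be exactly representable in float32.
toy_embeddings = [[0.5, 0.25], [-1.0, 2.0]]
with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "embeddings.bin")
    save_embeddings(cache_path, toy_embeddings)
    restored = load_embeddings(cache_path, dim=2)
print(restored)
```

Because a raw file records neither the shape nor which model produced it, a cache like this is only valid when it is reloaded with the same model and corpus that wrote it.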