Implementation:Open compass VLMEvalKit SArena DINO Score

Field	Value
source	VLMEvalKit
domain	Vision, Evaluation, Image Generation, DINO

Overview

Calculates DINOv2-based image similarity scores for image-to-image evaluation in the SArena benchmark.

Description

The `DINOScoreCalculator` class extends `BaseMetric` to compute visual similarity with the DINOv2 base model (`facebook/dinov2-base`). It extracts a feature embedding from the model's last hidden state for both the ground-truth image and the predicted image, computes the cosine similarity between the two embeddings, and normalizes the result to the range 0 to 1. Each input may be a PIL Image, a file path, or a tensor.
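
The scoring step described above can be sketched in plain Python. This is an illustrative sketch, not the code in `DINO_Score.py`. It rests on two assumptions the page does not state: that the last hidden state is reduced to one vector by mean pooling over tokens, and that the cosine similarity in [-1, 1] is mapped to [0, 1] with `(sim + 1) / 2`. The helper names `mean_pool`, `cosine_similarity`, and `normalized_score` are hypothetical.

```python
import math


def mean_pool(tokens):
    """Average a list of token vectors into one feature vector (assumed pooling)."""
    dim = len(tokens[0])
    return [sum(tok[i] for tok in tokens) / len(tokens) for i in range(dim)]


def cosine_similarity(a, b, eps=1e-8):
    """Cosine of the angle between vectors a and b, guarded against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)


def normalized_score(gt_tokens, pred_tokens):
    """Map cosine similarity from [-1, 1] to [0, 1] (assumed normalization)."""
    sim = cosine_similarity(mean_pool(gt_tokens), mean_pool(pred_tokens))
    return (sim + 1.0) / 2.0
```

Under these assumptions, identical embeddings score 1, orthogonal embeddings score 0.5, and opposite embeddings score 0.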

Usage

This calculator is called internally by the corresponding dataset class during evaluation.

Code Reference

Source: vlmeval/dataset/utils/SArena/DINO_Score.py, Lines: L1-53
Import: from vlmeval.dataset.utils.SArena.DINO_Score import DINOScoreCalculator

Key Functions:

class DINOScoreCalculator(BaseMetric):
    def calculate_DINOv2_similarity_score(self, **kwargs): ...
    def process_input(self, image, processor): ...

I/O Contract

Direction	Description
Inputs	Keyword arguments 'gt_im' (ground truth image) and 'pred_im' (predicted image) as PIL Images, file paths, or tensors
Outputs	Float similarity score between 0 and 1
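
The contract above can be checked at the call site with a thin wrapper. The sketch below is a hypothetical helper, not part of VLMEvalKit. It accepts any callable that takes `gt_im` and `pred_im` keyword arguments, such as `calculate_DINOv2_similarity_score`, verifies that both arguments are present, and confirms that the result lies in the documented range. The name `checked_score` and the tolerance value are assumptions made for illustration.

```python
import math


def checked_score(metric, tol=1e-6, **kwargs):
    """Call `metric` with gt_im/pred_im and verify the documented I/O contract."""
    missing = [key for key in ("gt_im", "pred_im") if kwargs.get(key) is None]
    if missing:
        raise ValueError(f"missing required keyword argument(s): {missing}")

    score = float(metric(gt_im=kwargs["gt_im"], pred_im=kwargs["pred_im"]))

    # Reject NaN and anything clearly outside [0, 1]; allow tiny float error.
    if math.isnan(score) or score < -tol or score > 1.0 + tol:
        raise ValueError(f"score {score} is outside the documented [0, 1] range")

    # Clamp away rounding error so callers always see a value in [0, 1].
    return min(max(score, 0.0), 1.0)
```

A call such as `checked_score(calc.calculate_DINOv2_similarity_score, gt_im=gt_image, pred_im=pred_image)` would then fail early on a missing image instead of deep inside the model call.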

Usage Examples

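from vlmeval.dataset.utils.SArena.DINO_Score import DINOScoreCalculator

# Hypothetical file paths; PIL Images or tensors are also accepted.
gt_image, pred_image = "gt.png", "pred.png"

calc = DINOScoreCalculator()
# Returns a float similarity score between 0 and 1.
score = calc.calculate_DINOv2_similarity_score(gt_im=gt_image, pred_im=pred_image)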

Related Pages

Principle:Open_compass_VLMEvalKit_Benchmark_Dataset_Construction
