Implementation:Open compass VLMEvalKit SArena DINO Score
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Image Generation, DINO |
Overview
Calculates DINOv2-based image similarity scores for image-to-image evaluation in the SArena benchmark.
Description
The `DINOScoreCalculator` class extends `BaseMetric` to compute visual similarity using the DINOv2 base model (`facebook/dinov2-base`). It extracts a feature embedding for each of the ground truth and predicted images from the model's last hidden state, then computes the cosine similarity between the two embeddings and normalizes it to a 0-1 range. Inputs may be PIL Images, file paths, or tensors.
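The scoring step described above can be illustrated with a small, dependency-free sketch. It is not the code in `DINO_Score.py`. It assumes that the per-token vectors of the last hidden state are mean-pooled into one embedding, and that the cosine similarity is mapped from [-1, 1] to [0, 1] with `(cos + 1) / 2`. Neither detail is stated on this page, and the helper names `mean_pool`, `cosine_similarity`, and `normalized_score` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_pool(tokens: Sequence[Vector]) -> List[float]:
    """Average per-token vectors into one embedding (assumed pooling step)."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Plain cosine similarity; raises if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def normalized_score(gt_tokens: Sequence[Vector],
                     pred_tokens: Sequence[Vector]) -> float:
    """Map cosine similarity from [-1, 1] to [0, 1] (assumed normalization)."""
    cos = cosine_similarity(mean_pool(gt_tokens), mean_pool(pred_tokens))
    return (cos + 1.0) / 2.0
```

Under these assumptions, identical embeddings score 1.0, orthogonal embeddings score 0.5, and opposite embeddings score 0.0.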
Usage
Called internally by the corresponding dataset class during evaluation.
Code Reference
- Source: `vlmeval/dataset/utils/SArena/DINO_Score.py`, Lines: L1-53
- Import: `from vlmeval.dataset.utils.SArena.DINO_Score import DINOScoreCalculator`
Key Functions:
class DINOScoreCalculator(BaseMetric):
    def calculate_DINOv2_similarity_score(self, **kwargs): ...
    def process_input(self, image, processor): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Keyword arguments `gt_im` (ground truth image) and `pred_im` (predicted image), each given as a PIL Image, a file path, or a tensor |
| Outputs | Float similarity score between 0 and 1 |
Usage Examples
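from vlmeval.dataset.utils.SArena.DINO_Score import DINOScoreCalculator
# Placeholder file paths for illustration; PIL Images or tensors are also accepted.
gt_image, pred_image = "gt.png", "pred.png"
calc = DINOScoreCalculator()
score = calc.calculate_DINOv2_similarity_score(gt_im=gt_image, pred_im=pred_image)
print(score)  # float similarity score between 0 and 1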