Implementation:Open compass VLMEvalKit MMHelix Numbrix Eval

Field	Value
source	VLMEvalKit
domain	Vision, Evaluation, Puzzle Solving, Numbrix

Overview

Evaluates Numbrix puzzle solutions in the MMHelix benchmark by verifying number uniqueness, initial state preservation, and consecutive number adjacency.

Description

The `NumbrixEvaluator` class extends `BaseEvaluator` to validate Numbrix puzzle solutions where consecutive numbers must be horizontally or vertically adjacent. It checks three conditions: number uniqueness across the grid, preservation of initial given numbers, and adjacency of consecutive integers. The `_normalize_grid` and `_parse_grid` methods handle various grid input formats, and the evaluator supports an optional verbose mode for debugging.
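The three conditions can be illustrated with a small standalone sketch. This is not the code in `numbrix_eval.py`. The helper names (`numbers_are_unique`, `givens_preserved`, `consecutive_adjacent`, `is_valid_numbrix`), the list-of-lists grid representation, and the use of `0` for an empty cell are all assumptions made for illustration.

```python
from typing import Sequence

Grid = Sequence[Sequence[int]]


def numbers_are_unique(grid: Grid) -> bool:
    """True if no number appears more than once in the grid."""
    flat = [v for row in grid for v in row]
    return len(set(flat)) == len(flat)


def givens_preserved(grid: Grid, initial: Grid, empty: int = 0) -> bool:
    """True if every given (non-empty) cell of the initial grid is unchanged.

    Assumption: empty cells in the initial grid are marked with `empty` (0).
    """
    if len(grid) != len(initial):
        return False
    for row, init_row in zip(grid, initial):
        if len(row) != len(init_row):
            return False
        for value, given in zip(row, init_row):
            if given != empty and value != given:
                return False
    return True


def consecutive_adjacent(grid: Grid) -> bool:
    """True if n and n + 1 are horizontal or vertical neighbours for all n.

    Assumption: a solved grid contains the numbers 1..N, N = cell count.
    """
    pos = {v: (r, c) for r, row in enumerate(grid) for c, v in enumerate(row)}
    total = sum(len(row) for row in grid)
    for n in range(1, total):
        if n not in pos or n + 1 not in pos:
            return False
        (r1, c1), (r2, c2) = pos[n], pos[n + 1]
        # Manhattan distance 1 means the cells share an edge.
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False
    return True


def is_valid_numbrix(grid: Grid, initial: Grid) -> bool:
    """Combine the three checks described above."""
    return (
        numbers_are_unique(grid)
        and givens_preserved(grid, initial)
        and consecutive_adjacent(grid)
    )
```

For example, the 3x3 "snake" grid `[[1, 2, 3], [6, 5, 4], [7, 8, 9]]` passes all three checks against the givens `[[1, 0, 0], [0, 5, 0], [0, 0, 9]]`, while the row-major grid `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]` fails the adjacency check because 3 and 4 do not share an edge.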

Usage

Called internally by the corresponding dataset class during evaluation.

Code Reference

Source: vlmeval/dataset/utils/mmhelix/evaluators/numbrix_eval.py, Lines: L1-180
Import: from vlmeval.dataset.utils.mmhelix.evaluators.numbrix_eval import NumbrixEvaluator

Key Functions:

class NumbrixEvaluator(BaseEvaluator):
    def evaluate(self, predicted_answer, ground_truth, initial_state) -> bool: ...
    def _check_number_uniqueness(self, grid): ...
    def _parse_grid(self, grid_str): ...

I/O Contract

Direction	Description
Inputs	Predicted grid string, optional ground truth, and initial state grid with given numbers
Outputs	Boolean indicating whether the solution satisfies all Numbrix constraints
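
Since the predicted grid arrives as a string, some parsing step has to turn it into rows of integers before the checks run. The source does not document which formats `_parse_grid` and `_normalize_grid` accept, so the following is only a hypothetical sketch: the function name `parse_grid` and the two supported formats (a Python-style nested list, or one row per line with numbers separated by spaces or commas) are assumptions.

```python
import ast
import re
from typing import List


def parse_grid(text: str) -> List[List[int]]:
    """Parse a grid string into a list of integer rows (illustrative only)."""
    text = text.strip()
    if text.startswith("["):
        # Assumed format 1: a nested-list literal such as "[[1, 2], [4, 3]]".
        data = ast.literal_eval(text)
        return [[int(v) for v in row] for row in data]
    # Assumed format 2: one row per line, numbers separated by spaces or commas.
    rows = [line for line in text.splitlines() if line.strip()]
    return [[int(tok) for tok in re.findall(r"-?\d+", line)] for line in rows]
```

Both `parse_grid("[[1, 2], [4, 3]]")` and `parse_grid("1 2\n4 3")` yield `[[1, 2], [4, 3]]` under these assumptions.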

Usage Examples

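from vlmeval.dataset.utils.mmhelix.evaluators.numbrix_eval import NumbrixEvaluator

# predicted: the model's grid string; ground_truth: optional reference answer;
# initial_state: the puzzle grid with its given numbers (see I/O Contract).
evaluator = NumbrixEvaluator()
# Returns True only if the solution satisfies all Numbrix constraints.
is_correct = evaluator.evaluate(predicted, ground_truth, initial_state)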

Related Pages

Principle:Open_compass_VLMEvalKit_Benchmark_Dataset_Construction
