Implementation:Open compass VLMEvalKit MMHelix Numbrix Eval
| Field | Value |
|---|---|
| source | VLMEvalKit |
| domain | Vision, Evaluation, Puzzle Solving, Numbrix |
Overview
Evaluates Numbrix puzzle solutions in the MMHelix benchmark by verifying number uniqueness, initial state preservation, and consecutive number adjacency.
Description
The `NumbrixEvaluator` class extends `BaseEvaluator` to validate Numbrix puzzle solutions where consecutive numbers must be horizontally or vertically adjacent. It checks three conditions: number uniqueness across the grid, preservation of initial given numbers, and adjacency of consecutive integers. The `_normalize_grid` and `_parse_grid` methods handle various grid input formats, and the evaluator supports an optional verbose mode for debugging.
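The three conditions can be illustrated with a minimal standalone sketch. It is not the code from `numbrix_eval.py`: the function names are hypothetical, the grid is assumed to be a rectangular list of integer rows, `0` is assumed to mark an empty cell in the initial state, and uniqueness is assumed to mean that the values are exactly 1..N for an N-cell grid.

```python
from typing import List

Grid = List[List[int]]


def numbers_unique(grid: Grid) -> bool:
    """Assumption: a solved grid holds each of 1..N exactly once (N = cell count)."""
    flat = [value for row in grid for value in row]
    return sorted(flat) == list(range(1, len(flat) + 1))


def initial_preserved(grid: Grid, initial: Grid, empty: int = 0) -> bool:
    """Every given cell of the initial state keeps its value (assumption: 0 = empty)."""
    if len(grid) != len(initial):
        return False
    for row, init_row in zip(grid, initial):
        if len(row) != len(init_row):
            return False
        for value, given in zip(row, init_row):
            if given != empty and value != given:
                return False
    return True


def consecutive_adjacent(grid: Grid) -> bool:
    """Each k and k + 1 must be horizontal or vertical neighbours."""
    position = {
        value: (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
    }
    total = sum(len(row) for row in grid)
    for k in range(1, total):
        if k not in position or k + 1 not in position:
            return False
        (r1, c1), (r2, c2) = position[k], position[k + 1]
        # A Manhattan distance of exactly 1 means the cells share an edge.
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False
    return True


def is_valid_numbrix(grid: Grid, initial: Grid) -> bool:
    """Hypothetical combined check mirroring the three conditions described above."""
    return (
        numbers_unique(grid)
        and initial_preserved(grid, initial)
        and consecutive_adjacent(grid)
    )
```

Under these assumptions, a 3x3 "snake" such as `[[1, 2, 3], [6, 5, 4], [7, 8, 9]]` passes all three checks, while the row-major grid `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]` fails the adjacency check because 3 and 4 do not share an edge.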
Usage
Called internally by the corresponding dataset class during evaluation.
Code Reference
- Source: `vlmeval/dataset/utils/mmhelix/evaluators/numbrix_eval.py`, Lines: L1-180
- Import: `from vlmeval.dataset.utils.mmhelix.evaluators.numbrix_eval import NumbrixEvaluator`
Key Functions:
class NumbrixEvaluator(BaseEvaluator):
    def evaluate(self, predicted_answer, ground_truth, initial_state) -> bool: ...
    def _check_number_uniqueness(self, grid): ...
    def _parse_grid(self, grid_str): ...
I/O Contract
| Direction | Description |
|---|---|
| Inputs | Predicted grid string, optional ground truth, and initial state grid with given numbers |
| Outputs | Boolean indicating whether the solution satisfies all Numbrix constraints |
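This page does not specify which grid string formats the evaluator accepts. As an illustration only, the sketch below shows one way a grid string could be turned into a list of integer rows. The function `parse_grid` is hypothetical and is not the library's `_parse_grid`; the two accepted formats (a Python-style nested list literal, or one row per line with integers separated by any non-digit characters) are assumptions.

```python
import ast
import re
from typing import List, Optional


def parse_grid(text: str) -> Optional[List[List[int]]]:
    """Hypothetical parser: returns a rectangular integer grid, or None on failure."""
    text = text.strip()
    if text.startswith("["):
        # Assumed format 1: a nested list literal such as "[[1, 2], [4, 3]]".
        try:
            value = ast.literal_eval(text)
        except (ValueError, SyntaxError):
            return None
        if not isinstance(value, list):
            return None
        for row in value:
            if not isinstance(row, list):
                return None
            for cell in row:
                # bool is a subclass of int, so it is excluded explicitly.
                if not isinstance(cell, int) or isinstance(cell, bool):
                    return None
        grid = value
    else:
        # Assumed format 2: one row per line, integers split on non-digits.
        grid = [
            [int(token) for token in re.findall(r"\d+", line)]
            for line in text.splitlines()
            if line.strip()
        ]
    # Reject empty or ragged grids.
    if not grid or not grid[0] or any(len(row) != len(grid[0]) for row in grid):
        return None
    return grid
```

Returning `None` rather than raising keeps malformed model output from aborting an evaluation loop; a caller could then score such a prediction as incorrect.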
Usage Examples
from vlmeval.dataset.utils.mmhelix.evaluators.numbrix_eval import NumbrixEvaluator
evaluator = NumbrixEvaluator()
is_correct = evaluator.evaluate(predicted, ground_truth, initial_state)