Implementation:Open compass VLMEvalKit MMHelix Sudoku Eval

From Leeroopedia
Field | Value
source | VLMEvalKit
domain | Vision, Evaluation, Puzzle Solving, Sudoku

Overview

Evaluates classic 9x9 Sudoku puzzle solutions in the MMHelix benchmark using rule-based validation of rows, columns, and 3x3 blocks.

Description

The `SudokuEvaluator` class extends `BaseEvaluator` and validates 9x9 Sudoku solutions purely by rule; the ground truth is not used to decide correctness. The `_parse_grid_like` helper function accepts several input formats: Python list literals, nested lists, and whitespace-delimited grids that use '.' for blanks. A solution passes only if every given number from the initial state is preserved and each row, column, and 3x3 block contains the digits 1-9 exactly once.
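The rule check described above can be sketched as follows. This is a minimal illustration, not the VLMEvalKit source: the function name `is_valid_sudoku` is hypothetical, both grids are assumed to be already parsed into 9x9 lists of integers, and blanks in the initial state are assumed to be encoded as 0.

```python
from typing import List

Grid = List[List[int]]


def is_valid_sudoku(solution: Grid, initial: Grid) -> bool:
    """Hypothetical rule-based check; not the actual SudokuEvaluator code."""
    # Shape: exactly 9 rows of 9 cells each.
    if len(solution) != 9 or any(len(row) != 9 for row in solution):
        return False

    # Givens: every non-blank cell of the initial state must be unchanged
    # (assumption: 0 marks a blank cell in `initial`).
    for r in range(9):
        for c in range(9):
            if initial[r][c] != 0 and solution[r][c] != initial[r][c]:
                return False

    # Each row, column, and 3x3 block must be exactly the set {1, ..., 9}.
    digits = set(range(1, 10))
    rows = [set(row) for row in solution]
    cols = [{solution[r][c] for r in range(9)} for c in range(9)]
    blocks = [
        {solution[br + dr][bc + dc] for dr in range(3) for dc in range(3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(unit == digits for unit in rows + cols + blocks)
```

Comparing each unit against the full digit set covers both duplicate digits and out-of-range values in a single test, since nine cells can only equal that set if each digit appears exactly once.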

Usage

Called internally by the corresponding dataset class during evaluation.

Code Reference

  • Source: vlmeval/dataset/utils/mmhelix/evaluators/sudoku_evaluator.py, Lines: L1-113
  • Import: from vlmeval.dataset.utils.mmhelix.evaluators.sudoku_evaluator import SudokuEvaluator

Key Functions:

class SudokuEvaluator(BaseEvaluator):
    def evaluate(self, predicted_answer, ground_truth, initial_state) -> bool: ...

def _parse_grid_like(obj) -> Optional[List[List[int]]]: ...
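To illustrate the input formats that `_parse_grid_like` is described as accepting, here is a hedged sketch of a parser with similar behavior. The name `parse_grid` and every detail of its logic are assumptions made for illustration; in particular, mapping '.' to 0 and returning `None` on any unparseable input are choices of this sketch, not confirmed behavior of the real helper. It does not enforce the 9x9 shape.

```python
import ast
from typing import List, Optional


def parse_grid(obj) -> Optional[List[List[int]]]:
    """Hypothetical parser; '.' is read as 0 (blank). Not the library code."""
    if isinstance(obj, str):
        text = obj.strip()
        if text.startswith("["):
            # Python list literal such as "[[5, 3, 0], ...]".
            try:
                obj = ast.literal_eval(text)
            except (ValueError, SyntaxError):
                return None
        else:
            # Whitespace-delimited grid, one row per line.
            obj = [line.split() for line in text.splitlines() if line.strip()]

    if not isinstance(obj, (list, tuple)):
        return None

    grid: List[List[int]] = []
    for row in obj:
        if not isinstance(row, (list, tuple)):
            return None
        cells: List[int] = []
        for cell in row:
            token = str(cell)
            if token == ".":
                cells.append(0)  # assumption: blank cells become 0
            elif token.isascii() and token.isdigit():
                cells.append(int(token))
            else:
                return None
        grid.append(cells)
    return grid
```

For example, `parse_grid("5 3 .\n. 7 .")` yields `[[5, 3, 0], [0, 7, 0]]`, and malformed input such as `"[[1, 2"` yields `None`.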

I/O Contract

Direction | Description
Inputs | Model output with a 9x9 grid (various formats); initial state grid with given numbers
Outputs | Boolean indicating whether the grid satisfies all Sudoku rules

Usage Examples

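from vlmeval.dataset.utils.mmhelix.evaluators.sudoku_evaluator import SudokuEvaluator

# predicted_grid: the model output containing a 9x9 grid
# initial_state: the puzzle's starting grid with the given numbers
evaluator = SudokuEvaluator()
# ground_truth is passed as None because validation is purely rule-based
is_correct = evaluator.evaluate(predicted_grid, None, initial_state)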
