Principle:Norrrrrrr lyn WAInjectBench Image Dataset Format

From Leeroopedia
Knowledge Sources
Domains: Data_Engineering, Computer_Vision
Last Updated: 2026-02-14 16:00 GMT

Overview

A folder-based data organization scheme that structures image files by scenario and label for image-based prompt injection detection benchmarks.

Description

Unlike text data, which is stored in JSONL files, image data is organized as a hierarchy of folders. The top-level split into benign/ and malicious/ directories encodes the ground-truth label. Within each of these, every scenario (e.g., a specific attack type or benign context) is a subfolder containing numbered image files (e.g., 1.png, 2.jpg). The number of images in a folder is counted with folder_path.glob("*"), which matches every entry in the folder, so scenario folders should contain only image files. The IDs of detected images are extracted from their filenames.
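The counting and ID extraction described above can be sketched as follows. This is a minimal illustration, not the benchmark's own code: the helper names count_images and image_id are hypothetical, and taking the filename stem as an integer is an assumption based on the numeric naming convention.

```python
from pathlib import Path


def count_images(folder_path: Path) -> int:
    """Count the entries in a scenario folder (hypothetical helper).

    glob("*") matches every entry, so a stray non-image file
    would be counted as an image.
    """
    return len(list(folder_path.glob("*")))


def image_id(image_path: Path) -> int:
    """Derive the sample ID from a numeric filename (assumed convention).

    "12.png" and "12.jpg" both map to 12; a non-numeric stem raises ValueError.
    """
    return int(image_path.stem)
```

Because the ID comes from the stem alone, two files such as 3.png and 3.jpg in the same folder would share an ID under this sketch, so keeping one file per number avoids ambiguity.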

Usage

Use this format when preparing image datasets for the image prompt injection detection pipeline. The --data_dir argument (default "data/image") points to the dataset root, the directory that contains benign/ and malicious/.
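For illustration, the --data_dir argument could be declared as below. Only the flag name and its default come from this page; the use of argparse, the build_parser name, and the help text are assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing --data_dir (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Image prompt injection detection (illustrative sketch)"
    )
    parser.add_argument(
        "--data_dir",
        type=Path,
        default=Path("data/image"),  # default taken from this page
        help="Root directory containing benign/ and malicious/",
    )
    return parser
```

Calling build_parser().parse_args([]) yields the default root, while passing --data_dir with another path overrides it.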

Theoretical Basis

Directory layout:

data/image/
├── benign/
│   ├── scenario_a/        # Contains: 1.png, 2.png, 3.jpg, ...
│   └── scenario_b/
└── malicious/
    ├── attack_x/          # Contains: 1.png, 2.png, ...
    └── attack_y/

Key conventions:

  • Image filenames are numeric; the number becomes the sample ID
  • Any image format supported by PIL is accepted
  • The top-level folder name determines the metric: false positive rate (FPR) for benign/, true positive rate (TPR) for malicious/
  • Scenario subfolders are discovered via parent_path.iterdir()
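The conventions above can be sketched as a small traversal. The names METRIC_BY_LABEL, discover_scenarios, and detection_rate are hypothetical, and computing the rate as detected images divided by total images is an assumption about how the per-folder FPR or TPR is derived.

```python
from pathlib import Path

# Top-level folder name -> metric reported for its scenarios.
METRIC_BY_LABEL = {"benign": "FPR", "malicious": "TPR"}


def discover_scenarios(data_dir: Path) -> list[tuple[str, str, Path]]:
    """Return (metric, scenario_name, folder_path) for each scenario subfolder."""
    found = []
    for label, metric in METRIC_BY_LABEL.items():
        parent_path = data_dir / label
        if not parent_path.is_dir():
            continue  # tolerate a dataset that has only one of the two splits
        # Sorted for a stable order; iterdir() itself gives no ordering guarantee.
        for folder_path in sorted(parent_path.iterdir()):
            if folder_path.is_dir():
                found.append((metric, folder_path.name, folder_path))
    return found


def detection_rate(folder_path: Path, detected_ids: set[int]) -> float:
    """Fraction of images in the folder that were flagged (assumed formula)."""
    total = len(list(folder_path.glob("*")))
    return len(detected_ids) / total if total else 0.0
```

Under this reading, the same ratio is reported as FPR for a benign scenario and as TPR for a malicious one, which is why the label can live in the directory name rather than in per-file metadata.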

Related Pages

Implemented By
