Implementation:Liu00222 Open Prompt Injection create evaluator

Knowledge Sources	Open-Prompt-Injection
Domains	Evaluation, Metrics
Last Updated	2026-02-14 15:00 GMT

Overview

Concrete factory function for creating an Evaluator that computes prompt injection attack metrics, provided by the OpenPromptInjection evaluator module.

Description

The create_evaluator function instantiates an Evaluator object that automatically computes all four metrics (PNA-T, PNA-I, ASV, MR) in its `__init__` method. It uses task-specific evaluation functions (exact match for classification, ROUGE for summarization, GLEU for grammar correction) and stores results as attributes.

Usage

Call this function at the end of an experiment after collecting all response arrays. The returned Evaluator object has `.pna_t`, `.pna_i`, `.asv`, and `.mr` attributes with computed metric values.

Code Reference

Source Location

Repository: Open-Prompt-Injection
File: OpenPromptInjection/evaluator/__init__.py
Lines: L4-5

Signature

def create_evaluator(target_task_responses, target_task,
                     injected_task_responses, injected_task,
                     attack_responses):
    """
    Factory function to create an Evaluator with computed metrics.

    Args:
        target_task_responses: List/array of target task baseline responses.
        target_task: TargetTask instance (provides labels and dataset name).
        injected_task_responses: List/array of injected task baseline responses (or None).
        injected_task: InjectedTask instance.
        attack_responses: List/array of attack responses.
    Returns:
        Evaluator: Instance with .pna_t, .pna_i, .asv, .mr attributes.
    """
    return Evaluator(target_task_responses, target_task,
                     injected_task_responses, injected_task,
                     attack_responses)

Import

import OpenPromptInjection as PI
# or
from OpenPromptInjection import create_evaluator

I/O Contract

Inputs

Name	Type	Required	Description
target_task_responses	list or ndarray	Yes	Baseline responses on target task (from Step 4)
target_task	TargetTask	Yes	Target task instance with labels
injected_task_responses	list, ndarray, or None	Yes	Baseline responses on injected task (None if defense active)
injected_task	InjectedTask	Yes	Injected task instance with labels
attack_responses	list or ndarray	Yes	Responses to attacked prompts (from Step 6)

Outputs

Name	Type	Description
evaluator.pna_t	float	Prediction accuracy on target task (0.0 to 1.0)
evaluator.pna_i	float or None	Prediction accuracy on injected task (None if no baseline)
evaluator.asv	float	Attack success value (0.0 to 1.0)
evaluator.mr	float or None	Matching rate between attack and injected baseline (None if no baseline)

Usage Examples

Complete Evaluation

import OpenPromptInjection as PI

evaluator = PI.create_evaluator(
    target_task_responses=target_task_responses,
    target_task=target_task,
    injected_task_responses=injected_task_responses,
    injected_task=attacker.task,
    attack_responses=attack_responses
)

print(f"PNA-T = {evaluator.pna_t}")   # Target task accuracy
print(f"PNA-I = {evaluator.pna_i}")   # Injected task accuracy
print(f"ASV = {evaluator.asv}")       # Attack success
print(f"MR = {evaluator.mr}")         # Matching rate

Related Pages

Implements Principle

Principle:Liu00222_Open_Prompt_Injection_Evaluation_Pipeline

Page Connections

Double-click a node to navigate. Hold to expand connections.

Principle

Implementation

Heuristic

Environment