Principle:Liu00222 Open Prompt Injection Defense Preparation

Knowledge Sources	Open-Prompt-Injection; Not What You've Signed Up For
Domains	Prompt_Injection, Security, Defense
Last Updated	2026-02-14 15:00 GMT

Overview

An initialization pattern that pre-loads defense-specific resources (surrogate models, tokenizers, evaluation filters) based on the selected defense strategy before query processing begins.

Description

Defense Preparation handles the upfront loading of resources required by different defense strategies. This is separated from query-time processing because some defenses require expensive initialization: the PPL defense loads a Vicuna-7B surrogate model for perplexity computation, the retokenization defense loads a BPE subword tokenizer, and the response-based defense initializes task-specific evaluation filters. By front-loading this work, the query pipeline can operate efficiently without per-query initialization overhead.

Usage

Use this principle to understand how defenses are initialized. Defense Preparation runs automatically during `Application.__init__` based on the `defense` parameter. No explicit call is needed: selecting a defense string triggers the appropriate resource loading.
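
The construction-time behavior can be illustrated with a minimal sketch. The class `SketchApplication`, its `query` method, and the `load_count` counter are hypothetical stand-ins rather than the library's actual `Application` class; the sketch only shows that constructing the object triggers the loading once and that later queries reuse the result.

```python
class SketchApplication:
    """Hypothetical stand-in: prepares its defense once, at construction."""

    def __init__(self, defense="no"):
        self.defense = defense
        self.load_count = 0  # how many times the preparation step has run
        self.resources = self._prepare_defense()

    def _prepare_defense(self):
        # Stub for expensive work such as loading a model or a tokenizer.
        self.load_count += 1
        if self.defense == "no":
            return None
        return {"defense": self.defense, "ready": True}

    def query(self, prompt):
        # Query-time code only reads what __init__ already prepared.
        status = "unprotected" if self.resources is None else self.resources["defense"]
        return f"[{status}] {prompt}"


app = SketchApplication(defense="retokenization")
answers = [app.query(p) for p in ("first prompt", "second prompt")]
```

However many queries are issued, `app.load_count` stays at 1, which is the property that front-loading is meant to provide.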

Theoretical Basis

The preparation follows a Strategy Initialization pattern where each defense loads its required resources:

Pseudo-code Logic:

# Defense-specific initialization
if defense == 'response-based':
    load_eval_functions()       # Task-specific classifiers
elif defense.startswith('ppl'):
    load_surrogate_model()      # Vicuna-7B for perplexity
    init_perplexity_filter()    # Window and threshold params
elif defense == 'retokenization':
    load_bpe_tokenizer()        # Subword BPE merge table
# Other defenses need no special preparation
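
The pseudo-code above can be turned into a small runnable sketch. The loader functions here are stubs that return placeholder strings instead of loading Vicuna-7B or a BPE merge table, and `prepare_defense` and `parse_ppl_defense` are hypothetical names. The `ppl-<window>-<threshold>` string format, including `all` as a window value, is an assumption made for illustration; the text above only states that PPL defenses start with `ppl` and carry window and threshold parameters.

```python
def load_eval_functions():
    return "task-specific evaluation filters (stub)"


def load_surrogate_model():
    return "surrogate model (stub)"


def load_bpe_tokenizer():
    return "BPE tokenizer (stub)"


def parse_ppl_defense(defense):
    """Parse an ASSUMED 'ppl-<window>-<threshold>' string (hypothetical format)."""
    parts = defense.split("-")
    if len(parts) != 3 or parts[0] != "ppl":
        raise ValueError(f"expected 'ppl-<window>-<threshold>', got {defense!r}")
    # 'all' is assumed to mean "no windowing"; otherwise the window is an integer.
    window = parts[1] if parts[1] == "all" else int(parts[1])
    return window, float(parts[2])


def prepare_defense(defense):
    """Return the resources a defense needs; an empty dict if it needs none."""
    resources = {}
    if defense == "response-based":
        resources["eval_functions"] = load_eval_functions()
    elif defense.startswith("ppl"):
        resources["surrogate_model"] = load_surrogate_model()
        window, threshold = parse_ppl_defense(defense)
        resources["ppl_window"] = window
        resources["ppl_threshold"] = threshold
    elif defense == "retokenization":
        resources["tokenizer"] = load_bpe_tokenizer()
    # Any other defense string falls through with no special preparation.
    return resources
```

Returning a dictionary keeps the sketch free of side effects; a class-based design would more likely store these values as attributes during construction.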

Related Pages

Implemented By

Implementation:Liu00222_Open_Prompt_Injection_Application_defense_preparation
