Heuristic:Liu00222 Open Prompt Injection BPE Retokenization Parameters

Knowledge Sources	Open-Prompt-Injection Formalizing and Benchmarking Prompt Injection
Domains	Security, NLP
Last Updated	2026-02-14 15:30 GMT

Overview

The BPE retokenization defense uses a 10% merge-dropout rate and makes up to 10 attempts per input, aiming to disrupt injected instructions through stochastic subword segmentation.

Description

The retokenization defense applies Byte Pair Encoding (BPE) with dropout to the user's data prompt before passing it to the LLM. Each BPE merge is skipped at random with probability 0.1, so the text is segmented in a non-standard way. The intent is to break the structure of injected instructions while preserving the meaning of natural text. The defense requires a BPE merge table loaded from `./data/subword_nmt.voc`. Because the tokenizer can occasionally raise an exception and each attempt draws a different random segmentation, the defense tries up to 10 times before falling back to the original text.
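The idea of BPE dropout can be illustrated with a toy segmenter. This is a minimal sketch, not the repository's `BpeOnlineTokenizer`: the function name `bpe_dropout_segment`, the three-entry merge list, and the example word are all hypothetical, and the real defense also handles sentinels and the `'@@'` continuation symbol, which are omitted here.

```python
import random


def bpe_dropout_segment(word, merges, dropout=0.1, rng=None):
    """Segment one word with BPE, skipping each candidate merge
    with probability `dropout` (hypothetical helper, for illustration)."""
    if rng is None:
        rng = random.Random()
    # Earlier entries in the merge list have higher priority (lower rank).
    ranks = {pair: rank for rank, pair in enumerate(merges)}
    symbols = list(word)  # start from single characters
    while len(symbols) > 1:
        # Keep adjacent pairs that are in the merge table and survive dropout.
        candidates = [
            (ranks[pair], i)
            for i, pair in enumerate(zip(symbols, symbols[1:]))
            if pair in ranks and rng.random() >= dropout
        ]
        if not candidates:
            break  # nothing left to merge in this round
        _, i = min(candidates)  # apply the highest-priority surviving merge
        symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols


# Toy merge table (an assumption, not the contents of subword_nmt.voc).
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]

print(bpe_dropout_segment("lower", toy_merges, dropout=0.0))  # ['low', 'er']
print(bpe_dropout_segment("lower", toy_merges, dropout=1.0))  # ['l', 'o', 'w', 'e', 'r']
print(bpe_dropout_segment("lower", toy_merges, dropout=0.1))  # varies between runs
```

With `dropout=0.0` the result is the deterministic BPE segmentation, and with `dropout=1.0` the word falls apart into characters. A rate of 0.1 sits close to the deterministic end, so most words keep their usual segmentation and only some are split differently.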

Usage

Use this heuristic when deploying the retokenization defense or diagnosing its failure modes. The defense is activated by setting `defense='retokenization'` in the application configuration. The two key parameters, both hardcoded, are the dropout rate (0.1) and the maximum number of attempts (10).

The Insight (Rule of Thumb)

Action: Set `defense='retokenization'` and ensure the BPE merge table exists at `./data/subword_nmt.voc`.
Value: `bpe_dropout_rate=0.1`, `sentinels=['', '</w>']`, `regime='end'`, `bpe_symbol='@@'`.
Trade-off: Low dropout (0.1) preserves text readability while slightly disrupting injection structure. Higher dropout would disrupt more but also corrupt legitimate text.
Retry logic: The defense attempts retokenization up to 10 times. If all attempts fail, it prints a warning and returns the original (unprotected) text.
No GPU required: This defense operates purely on CPU string manipulation.

Reasoning

From `apps/Application.py:96-98` (initialization):

elif self.defense == 'retokenization':
    merge_table = load_subword_nmt_table('./data/subword_nmt.voc')
    self.retokenizer = BpeOnlineTokenizer(bpe_dropout_rate=0.1, merge_table=merge_table)

From `apps/Application.py:172-179` (application with retry):

elif self.defense == 'retokenization':
    for _ in range(10):
        try:
            return self.retokenizer(data_prompt, sentinels=['', '</w>'], regime='end', bpe_symbol='@@')
        except:
            continue
    print(f'WARNING: unable to retokenize this sentence')
    return data_prompt

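The loop makes up to 10 attempts, and the bare `except` swallows every exception raised by the BPE tokenizer, including `KeyboardInterrupt` and `SystemExit`. This is a defensive pattern for edge cases where certain character sequences cause the tokenizer to fail. Because each attempt draws a different random segmentation, a later attempt can succeed where an earlier one failed. The cost is that the exception itself is never reported, so a silent fallback to the unprotected prompt is visible only through the printed warning.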
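A narrower variant of this retry pattern could look like the sketch below. It is not code from the repository: `retokenize_with_retry` and its `retokenize` argument are hypothetical names, and the use of the standard `logging` module instead of `print` is an assumption about what a caller might prefer.

```python
import logging

logger = logging.getLogger(__name__)


def retokenize_with_retry(retokenize, text, attempts=10):
    """Call `retokenize(text)` up to `attempts` times.

    Returns the first successful result, or `text` unchanged if every
    attempt raises. Hypothetical helper, for illustration only.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return retokenize(text)
        except Exception as exc:
            # `except Exception` lets KeyboardInterrupt and SystemExit
            # propagate, unlike a bare `except`.
            last_error = exc
    # Record the last failure so the silent fallback is diagnosable.
    logger.warning("unable to retokenize this sentence: %r", last_error)
    return text
```

A caller would pass a closure over the tokenizer arguments, for example `lambda t: retokenizer(t, sentinels=['', '</w>'], regime='end', bpe_symbol='@@')`, which mirrors the call in the excerpt above. Keeping the last exception in the warning makes it possible to tell a rare character-sequence failure from a systematic one.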
