Principle:Protectai Llm guard PII Deanonymization

From Leeroopedia
Knowledge Sources
Domains NLP, Data_Privacy, Text_Processing
Last Updated 2026-02-14 12:00 GMT

Overview

A placeholder-to-original replacement technique that restores anonymized entities in LLM outputs using stored vault mappings, with support for exact, case-insensitive, and fuzzy matching strategies.

Description

PII deanonymization is the reverse process of anonymization: it replaces placeholder tokens in LLM-generated text with the original values stored in a Vault. Since LLMs may alter placeholder formatting (e.g., changing case, introducing typos, or rephrasing), multiple matching strategies are supported:

  • Exact matching: Direct string replacement of placeholders.
  • Case-insensitive matching: Regex-based replacement ignoring letter case.
  • Fuzzy matching: Uses Levenshtein distance to find near-matches when the LLM slightly modifies placeholders.
  • Combined exact+fuzzy: Applies exact matching first, then fuzzy matching for any remaining unmatched placeholders.
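The difference between the first two strategies can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the placeholder format `[REDACTED_PERSON_1]`, the vault contents, and the helper names `replace_exact` and `replace_case_insensitive` are all assumptions made for the example.

```python
import re

# Hypothetical vault contents: (placeholder, original) pairs.
VAULT_ITEMS = [("[REDACTED_PERSON_1]", "Alice Smith")]


def replace_exact(text, vault_items):
    """Replace placeholders only where they appear character for character."""
    for placeholder, original in vault_items:
        text = text.replace(placeholder, original)
    return text


def replace_case_insensitive(text, vault_items):
    """Replace placeholders regardless of letter case."""
    for placeholder, original in vault_items:
        # re.escape keeps "[" and "]" literal instead of opening a character
        # class; the function replacement stops backslashes in the original
        # value from being read as escape sequences.
        pattern = re.compile(re.escape(placeholder), flags=re.IGNORECASE)
        text = pattern.sub(lambda _m, value=original: value, text)
    return text


# Simulated LLM output in which the placeholder was lowercased.
llm_output = "Dear [redacted_person_1], your request was approved."
exact_result = replace_exact(llm_output, VAULT_ITEMS)
relaxed_result = replace_case_insensitive(llm_output, VAULT_ITEMS)
```

Here the exact strategy leaves the lowercased placeholder in place, while the case-insensitive strategy restores the original value.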

Usage

Use this principle after receiving an LLM response that was generated from an anonymized prompt. It is always paired with a prior anonymization step and requires access to the same Vault instance that was populated during anonymization.
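The pairing of the two steps around a shared vault might look like the following sketch. `InMemoryVault`, `anonymize`, `deanonymize`, and the `[REDACTED_n]` placeholder format are hypothetical stand-ins chosen for illustration; they are not the library's classes or signatures, and the model call is faked with a fixed string.

```python
class InMemoryVault:
    """Hypothetical stand-in for the Vault: stores (placeholder, original) pairs."""

    def __init__(self):
        self._items = []

    def append(self, placeholder, original):
        self._items.append((placeholder, original))

    def get(self):
        return list(self._items)


def anonymize(prompt, entities, vault):
    """Replace each known entity with a numbered placeholder and record the mapping."""
    for index, entity in enumerate(entities, start=1):
        placeholder = f"[REDACTED_{index}]"
        vault.append(placeholder, entity)
        prompt = prompt.replace(entity, placeholder)
    return prompt


def deanonymize(response, vault):
    """Restore originals using exact matching against the vault's mappings."""
    for placeholder, original in vault.get():
        response = response.replace(placeholder, original)
    return response


vault = InMemoryVault()
safe_prompt = anonymize("Write a greeting for Alice Smith.", ["Alice Smith"], vault)

# Stand-in for a model call; a real LLM would generate this text.
fake_response = "Hello [REDACTED_1], welcome aboard!"

restored = deanonymize(fake_response, vault)
# A different (empty) vault cannot restore anything.
empty_vault_result = deanonymize(fake_response, InMemoryVault())
```

The last line shows why the same Vault instance is required: without the mappings recorded during anonymization, the placeholder stays in the output.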

Theoretical Basis

The matching strategies are ordered from strictest (exact) to most permissive (fuzzy), with a combined mode that runs the exact pass before the fuzzy pass:

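# Pseudocode for deanonymization strategies
# (find_near_matches, replace_matches, exact_match, and fuzzy_match are
# helpers assumed to be defined elsewhere)
import re

def deanonymize(text, vault_items, strategy):
    if strategy == EXACT:
        # Literal, case-sensitive replacement of each placeholder
        for placeholder, original in vault_items:
            text = text.replace(placeholder, original)
    elif strategy == CASE_INSENSITIVE:
        for placeholder, original in vault_items:
            # Escape the placeholder so regex metacharacters such as brackets
            # match literally; the function replacement keeps backslashes in
            # the original value from being read as escape sequences
            text = re.sub(re.escape(placeholder), lambda m: original, text, flags=re.IGNORECASE)
    elif strategy == FUZZY:
        for placeholder, original in vault_items:
            # Spans of text within Levenshtein distance 3 of the placeholder
            matches = find_near_matches(placeholder, text, max_l_dist=3)
            text = replace_matches(text, matches, original)
    elif strategy == COMBINED:
        # Exact pass first, then a fuzzy pass for placeholders still unmatched
        text = exact_match(text, vault_items)
        text = fuzzy_match(text, vault_items)
    return text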

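Fuzzy matching uses a maximum Levenshtein distance of 3, so a span of text is replaced only if at most three single-character insertions, deletions, or substitutions turn it into the placeholder. This threshold balances catching LLM-induced modifications against avoiding false-positive replacements.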
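The effect of the distance threshold can be illustrated with a small, dependency-free sketch. The brute-force search below is an assumption made for clarity and is not how the pseudocode's `find_near_matches` is implemented; the helper names `levenshtein`, `best_near_match`, and `replace_fuzzy`, and the sample placeholder, are hypothetical. It replaces only the single closest span per placeholder, which is one possible design choice.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def best_near_match(placeholder, text, max_dist=3):
    """Return (start, end, dist) of the closest substring within max_dist, or None."""
    best = None
    n = len(placeholder)
    for start in range(len(text)):
        # Only substrings whose length differs by at most max_dist can qualify.
        for length in range(max(1, n - max_dist), n + max_dist + 1):
            end = start + length
            if end > len(text):
                break
            dist = levenshtein(placeholder, text[start:end])
            if dist <= max_dist and (best is None or dist < best[2]):
                best = (start, end, dist)
    return best


def replace_fuzzy(text, vault_items, max_dist=3):
    """Replace the closest near-match of each placeholder, if one exists."""
    for placeholder, original in vault_items:
        match = best_near_match(placeholder, text, max_dist)
        if match is not None:
            start, end, _dist = match
            text = text[:start] + original + text[end:]
    return text


vault_items = [("[REDACTED_PERSON_1]", "Alice Smith")]
# Simulated LLM output with a dropped letter in the placeholder.
typo_output = "Contact [REDACTED_PERSN_1] tomorrow."
fixed = replace_fuzzy(typo_output, vault_items)
untouched = replace_fuzzy("No placeholders here.", vault_items)
```

The misspelled placeholder is one edit away from the stored one, so it is restored, while text with no span within distance 3 is left unchanged. Raising `max_dist` would catch heavier corruption at the cost of a higher risk of replacing unrelated text.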

Related Pages

Implemented By
