Principle: ProtectAI LLM Guard Sensitive Data Detection
| Knowledge Sources | |
|---|---|
| Domains | NLP, Data_Privacy, Output_Validation |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
A PII detection technique applied to LLM outputs that identifies and optionally redacts sensitive data leaked by the model, using NER models and regex patterns.
Description
Even when inputs are properly anonymized, LLMs may still emit sensitive data, either memorized from their training corpus or inferred from context. Sensitive data detection in outputs uses the same NER and regex detection pipeline as input anonymization, but applies it to the model's output instead of the user's prompt.
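As a minimal sketch of the detection side, here is a purely regex-based analyzer over output text. A real pipeline (e.g. Presidio, which LLM Guard builds on) also runs NER models; the `Entity` type, the patterns, and the per-pattern scores below are illustrative assumptions, not the library's actual recognizers.

```python
import re
from dataclasses import dataclass

@dataclass
class Entity:
    entity_type: str
    start: int
    end: int
    score: float  # detection confidence for this recognizer

# Illustrative regex recognizers; a production pipeline adds NER models.
PATTERNS = {
    "EMAIL_ADDRESS": (re.compile(r"\b[\w.+-]+@[\w-]+\.\w+\b"), 0.9),
    "US_SSN": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 0.85),
}

def analyze(text: str, threshold: float = 0.5) -> list[Entity]:
    """Scan LLM output text and return detected PII entities above threshold."""
    entities = []
    for entity_type, (pattern, score) in PATTERNS.items():
        if score < threshold:
            continue
        for m in pattern.finditer(text):
            entities.append(Entity(entity_type, m.start(), m.end(), score))
    return entities

out = "Contact john.doe@example.com or SSN 123-45-6789."
found = analyze(out)
```

With the sample output above, `found` contains one `EMAIL_ADDRESS` and one `US_SSN` entity with their character offsets, which the redaction step can then consume.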
Key differences from input anonymization:
- No vault interaction: Output detection does not store mappings; it detects PII rather than anonymizing it for later de-anonymization.
- Optional redaction: When enabled, detected PII is replaced using Presidio's AnonymizerEngine.
- Context-free: Detection operates on the output text independently, not using the prompt for context.
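When redaction is enabled, each detected span is replaced in place. A hedged sketch of that step, substituting an `<ENTITY_TYPE>` placeholder for each span (Presidio's AnonymizerEngine defaults to a similar placeholder substitution; the span tuples here are illustrative):

```python
def redact(text: str, entities: list[tuple[str, int, int]]) -> str:
    """Replace each detected (entity_type, start, end) span with a placeholder.

    Spans are applied right-to-left so earlier offsets stay valid.
    """
    for entity_type, start, end in sorted(entities, key=lambda e: e[1], reverse=True):
        text = text[:start] + f"<{entity_type}>" + text[end:]
    return text

redacted = redact(
    "Reach me at alice@example.org.",
    [("EMAIL_ADDRESS", 12, 29)],
)
# → "Reach me at <EMAIL_ADDRESS>."
```

Processing spans from right to left is the key detail: replacing a span changes the length of the string, so left-to-right replacement would invalidate the offsets of every later entity.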
Usage
Use this principle in output scanning pipelines to catch PII that the LLM may have generated or leaked. It is essential as a safety net even when input anonymization is in place, because LLMs can produce PII memorized from their training data.
Theoretical Basis
```
# Pseudocode for sensitive data detection in outputs
entities = analyzer.analyze(output, entity_types, threshold)
if entities:
    risk_score = max(entity.score for entity in entities)
    if redact:
        output = anonymizer.anonymize(output, entities)
    return output, SENSITIVE_DATA_FOUND, risk_score
else:
    return output, CLEAN, no_risk
```