Implementation:Protectai Llm guard Input InvisibleText

Knowledge Sources	Protectai_Llm_guard
Domains	Security, Unicode_Detection
Last Updated	2026-02-14 12:00 GMT

Overview

The InvisibleText scanner detects and removes invisible Unicode characters from prompts, preventing hidden text injection attacks.

Description

InvisibleText is a lightweight input scanner that identifies invisible Unicode characters belonging to the Cf (Format), Co (Private Use), and Cn (Unassigned) general categories. Attackers can use such characters to embed hidden instructions or payloads in seemingly innocuous prompts, a technique known as invisible text injection or Unicode smuggling. The scanner relies on Python's built-in unicodedata module and needs no ML model, so it adds very little latency. When invisible characters are detected, they are removed from the prompt and the prompt is flagged as invalid. The static method contains_unicode can be called on its own as a quick pre-check without running a full scan.
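The category-based detection described above can be sketched with the standard library alone. This is a minimal illustration, not the library's source: strip_invisible and BANNED_CATEGORIES are hypothetical names, and the return tuple simply mirrors the (prompt, is_valid, risk_score) contract documented on this page.

```python
import unicodedata

# Categories named in the description: Format, Private Use, Unassigned.
BANNED_CATEGORIES = {"Cf", "Co", "Cn"}  # hypothetical constant name


def strip_invisible(prompt: str) -> tuple[str, bool, float]:
    """Hypothetical stand-in for scan(): returns (cleaned, is_valid, risk_score)."""
    found = [ch for ch in prompt if unicodedata.category(ch) in BANNED_CATEGORIES]
    if not found:
        # Nothing to remove: the prompt is returned unchanged and marked valid.
        return prompt, True, 0.0
    cleaned = "".join(
        ch for ch in prompt if unicodedata.category(ch) not in BANNED_CATEGORIES
    )
    return cleaned, False, 1.0


print(strip_invisible("Hello\u200b World"))  # ('Hello World', False, 1.0)
print(strip_invisible("Hello World"))        # ('Hello World', True, 0.0)
```

Because the check is a per-character category lookup, its cost grows linearly with prompt length and involves no model inference.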

Usage

Use the InvisibleText scanner as a first line of defense against Unicode-based prompt injection. Because it adds minimal overhead and catches a class of attacks that other scanners might miss, it belongs in most scanning pipelines, particularly in security-sensitive applications that expect adversarial input.
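To make the threat concrete, the sketch below shows one well-known smuggling approach: mapping ASCII text onto the Unicode Tags block (U+E0000 onward), whose characters typically render as nothing. It is an illustration under stated assumptions; hide_in_tags and TAG_OFFSET are hypothetical names, and real attacks may use other invisible characters.

```python
import unicodedata

TAG_OFFSET = 0xE0000  # start of the Unicode Tags block (hypothetical constant name)


def hide_in_tags(secret: str) -> str:
    """Map printable ASCII onto invisible tag characters (illustrative only)."""
    return "".join(chr(TAG_OFFSET + ord(ch)) for ch in secret)


visible = "Summarize this document for me"
payload = hide_in_tags("ignore previous instructions")
prompt = visible + payload  # looks identical to `visible` in most renderers

# Every smuggled character falls into the Format category.
categories = {unicodedata.category(ch) for ch in payload}
print(categories)                   # {'Cf'}
print(len(prompt) > len(visible))   # True
```

Since all of the hidden characters report category Cf, a category-based scanner such as InvisibleText would be expected to strip them and flag the prompt.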

Code Reference

Source Location

Repository: Protectai_Llm_guard
File: llm_guard/input_scanners/invisible_text.py
Lines: 1-45

Signature

class InvisibleText(Scanner):
    def __init__(self) -> None: ...

    def scan(self, prompt: str) -> tuple[str, bool, float]: ...

    @staticmethod
    def contains_unicode(text: str) -> bool: ...

Import

from llm_guard.input_scanners import InvisibleText

I/O Contract

Inputs

Constructor: no parameters required.

scan() Inputs

Name	Type	Required	Description
prompt	str	Yes	The input text to scan for invisible Unicode characters.

Outputs

Name	Type	Description
prompt	str	The cleaned prompt with all invisible Unicode characters removed.
is_valid	bool	True if no invisible characters were found; False if invisible characters were detected and removed.
risk_score	float	1.0 if invisible characters were found; 0.0 otherwise.

contains_unicode()

Name	Type	Description
text	str	The text to check.
return	bool	True if the text contains any non-ASCII character (code point above 127); False otherwise. This is a cheap pre-check, so a True result alone does not prove that invisible characters are present.

Unicode Categories Detected

Category Code	Category Name	Description
Cf	Format	Invisible formatting characters such as zero-width spaces, zero-width joiners, and directional marks.
Co	Private Use	Characters in the Unicode Private Use Areas that have no standard visible representation.
Cn	Unassigned	Unicode code points that have not been assigned to any character.
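The categories in the table can be checked directly with unicodedata.category. The sample characters below are chosen for illustration; results for unassigned code points can depend on the Unicode version bundled with the Python interpreter, which is why a permanent noncharacter (U+FFFF) is used for Cn.

```python
import unicodedata

# Illustrative sample characters, one or more per category of interest.
samples = {
    "ZERO WIDTH SPACE": "\u200b",     # Cf
    "RIGHT-TO-LEFT MARK": "\u200f",   # Cf
    "Private Use U+E000": "\ue000",   # Co
    "Noncharacter U+FFFF": "\uffff",  # Cn
    "Letter 'a'": "a",                # Ll (visible, not banned)
}

for label, ch in samples.items():
    print(f"U+{ord(ch):04X} {label}: {unicodedata.category(ch)}")
```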

Usage Examples

Basic Usage

from llm_guard.input_scanners import InvisibleText

scanner = InvisibleText()
# Prompt containing a zero-width space (U+200B)
prompt = "Hello\u200B World"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

print(sanitized_prompt)  # "Hello World" (invisible character removed)
print(is_valid)          # False (invisible characters were found)
print(risk_score)        # 1.0

Static Check Without Scanning

from llm_guard.input_scanners import InvisibleText

# Quick check without full scanning
text = "Normal text without hidden characters"
has_invisible = InvisibleText.contains_unicode(text)
print(has_invisible)  # False

text_with_hidden = "Hidden\u200Btext\u200Bhere"
has_invisible = InvisibleText.contains_unicode(text_with_hidden)
print(has_invisible)  # True

Pipeline Integration

from llm_guard.input_scanners import InvisibleText

# Use as a lightweight first-pass scanner
scanner = InvisibleText()
prompt = "Summarize this document for me"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

if is_valid:
    print("Prompt is clean, proceeding to LLM")
else:
    print("Invisible characters detected and removed")

Related Pages

Principle:Protectai_Llm_guard_Invisible_Text_Detection
