Overview
Regex is an output scanner that detects or enforces regex patterns in LLM responses by delegating to the input-side InputRegex scanner.
Description
The Regex output scanner is a thin wrapper around the corresponding input scanner InputRegex. It applies a list of regular expression patterns to the LLM output and determines whether the output should be flagged based on pattern matches. The is_blocked parameter controls the behavior: when True, the scanner flags outputs that match any of the patterns (blocking mode); when False, it flags outputs that do not match any of the patterns (requiring mode). The match_type parameter controls the regex matching strategy (e.g., SEARCH for partial matches). When redact is enabled, matched patterns are replaced in the output text.
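The two modes described above can be sketched with the standard library alone. This is a minimal illustration, not the llm_guard implementation: the function name `scan_sketch`, the `placeholder` argument, and the `[REDACTED]` replacement text are assumptions made for the example, and the risk score is left out.

```python
import re


def scan_sketch(patterns, text, *, is_blocked=True, redact=True, placeholder="[REDACTED]"):
    """Hypothetical stand-in for the decision logic; not llm_guard code."""
    matched = False
    for pattern in patterns:
        if re.search(pattern, text) is None:
            continue
        matched = True
        # Redaction only makes sense in blocking mode, where a match is unwanted.
        if is_blocked and redact:
            text = re.sub(pattern, placeholder, text)
    # Blocking mode: valid only if nothing matched.
    # Requiring mode: valid only if at least one pattern matched.
    is_valid = (not matched) if is_blocked else matched
    return text, is_valid


blocked_text, blocked_ok = scan_sketch([r"\d{3}-\d{2}-\d{4}"], "SSN 123-45-6789")
required_text, required_ok = scan_sketch(
    [r"TICKET-\d{6}"], "no ticket here", is_blocked=False
)
```

In the first call the pattern matches, so the text is redacted and the result is invalid. In the second call nothing matches, and because the mode is requiring, the result is also invalid.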
Usage
Use this scanner when you need to enforce specific text patterns in LLM outputs. Common use cases include blocking outputs that contain sensitive patterns (e.g., credit card numbers, social security numbers, API keys), ensuring outputs match required formats (e.g., structured response templates), and redacting sensitive information that matches known patterns.
Code Reference
Source Location
Signature
class Regex(Scanner):
    def __init__(
        self,
        patterns: list[str],
        *,
        is_blocked: bool = True,
        match_type: MatchType | str = MatchType.SEARCH,
        redact: bool = True,
    ) -> None: ...

    def scan(self, prompt: str, output: str) -> tuple[str, bool, float]: ...
Import
from llm_guard.output_scanners import Regex
I/O Contract
Inputs
| Name | Type | Required | Description |
|------|------|----------|-------------|
| prompt | str | Yes | The input prompt |
| output | str | Yes | The LLM output to scan against regex patterns |
Constructor Parameters
| Name | Type | Required | Default | Description |
|------|------|----------|---------|-------------|
| patterns | list[str] | Yes | N/A | List of regex patterns to match against the output |
| is_blocked | bool | No | True | If True, flag outputs that match any pattern; if False, flag outputs that match none of the patterns |
| match_type | MatchType \| str | No | MatchType.SEARCH | Regex matching strategy (SEARCH for partial matches, FULL_MATCH for complete matches) |
| redact | bool | No | True | Whether to redact matched text in the output |
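The difference between partial and complete matching can be illustrated with Python's `re` module. The sketch below assumes that SEARCH behaves like `re.search` and FULL_MATCH like `re.fullmatch`; that mapping is an assumption for illustration and is not confirmed by this page.

```python
import re

pattern = r"TICKET-\d{6}"
text = "Your ticket has been created: TICKET-123456"

# Partial match: the pattern may occur anywhere in the text.
partial = re.search(pattern, text) is not None

# Complete match: the entire text must match the pattern.
complete = re.fullmatch(pattern, text) is not None

# The same pattern matches completely when the text is only the ticket ID.
exact = re.fullmatch(pattern, "TICKET-123456") is not None
```

Under this assumption, a pattern meant to find a fragment inside a longer response needs the partial strategy, while the complete strategy suits responses that must consist of the pattern and nothing else.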
Outputs
| Name | Type | Description |
|------|------|-------------|
| sanitized_output | str | The output with matched patterns optionally redacted |
| is_valid | bool | Whether the output passed the scan |
| risk_score | float | Risk score (-1.0 to 1.0) |
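One way a caller might consume the returned tuple is sketched below. The helper `handle_scan_result` and its fallback message are hypothetical and not part of llm_guard; the tuples passed in are hand-written stand-ins for the result of `scan`.

```python
def handle_scan_result(result, fallback="The response was withheld by a content filter."):
    """Hypothetical helper: return the sanitized text if valid, else a fallback."""
    sanitized_output, is_valid, risk_score = result
    if is_valid:
        return sanitized_output
    # The output failed the scan, so the sanitized text is not returned here.
    return fallback


# Stand-in tuples shaped like (sanitized_output, is_valid, risk_score).
passed = handle_scan_result(("All good.", True, -1.0))
failed = handle_scan_result(("Card: [REDACTED]", False, 1.0))
```

Whether a failed scan should return a fallback, the redacted text, or raise an error is an application-level choice; this sketch picks the fallback only to keep the example short.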
Usage Examples
Basic Usage
from llm_guard.output_scanners import Regex

# Block outputs containing credit card-like patterns
scanner = Regex(
    patterns=[r"\b\d{4}[-\s]?\d{4}[-\s]?\d{4}[-\s]?\d{4}\b"],
    is_blocked=True,
    redact=True,
)

prompt = "What is my account info?"
output = "Your card number is 4532-1234-5678-9012 and your balance is $500."

sanitized_output, is_valid, risk_score = scanner.scan(prompt, output)
print(sanitized_output)  # Card number will be redacted
print(f"Valid: {is_valid}, Risk: {risk_score}")
Require Pattern Match
from llm_guard.output_scanners import Regex

# Require output to contain a specific format (e.g., ticket number)
scanner = Regex(
    patterns=[r"TICKET-\d{6}"],
    is_blocked=False,
    redact=False,
)

prompt = "Create a support ticket"
output = "Your ticket has been created: TICKET-123456"

sanitized_output, is_valid, risk_score = scanner.scan(prompt, output)
# is_valid will be True because the required pattern is present
Related Pages