Overview
The BanCompetitors scanner detects and optionally redacts competitor company names in prompts using a Named Entity Recognition (NER) model.
Description
BanCompetitors is an input scanner that takes a user-provided list of competitor names and uses NER to detect mentions of those competitors within prompts. By default it uses the guishe/nuner-v1_orgs model, which is specialized for organization entity recognition. When a competitor name is found, the scanner marks the prompt as invalid and, if redact is enabled, also removes the name from the returned text. Long prompts are processed in chunks, with configurable chunk_size (default 512 tokens) and chunk_overlap_size (default 40 tokens). The threshold parameter sets the minimum NER confidence required for a match.
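The chunking described above can be pictured as a sliding window over the token sequence. The following is a minimal sketch of that idea only, not the scanner's actual implementation: the helper name chunk_with_overlap is hypothetical, and tokens are represented as plain strings rather than real tokenizer output.

```python
from typing import List, Sequence


def chunk_with_overlap(
    tokens: Sequence[str],
    chunk_size: int = 512,
    chunk_overlap_size: int = 40,
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share chunk_overlap_size tokens.

    Hypothetical helper for illustration; the scanner's own chunking may differ.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap_size < chunk_size:
        raise ValueError("chunk_overlap_size must be in [0, chunk_size)")

    # Each new window starts this many tokens after the previous one.
    step = chunk_size - chunk_overlap_size
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start : start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks


# Toy sizes so the overlap is easy to see.
tokens = [f"t{i}" for i in range(10)]
windows = chunk_with_overlap(tokens, chunk_size=4, chunk_overlap_size=1)
for window in windows:
    print(window)
```

With these toy sizes, each window repeats the last token of the previous one, so an entity that straddles a boundary still appears whole in at least one window, provided it is no longer than the overlap.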
Usage
Use the BanCompetitors scanner when prompts that mention competitor organizations should be flagged or sanitized before they reach the LLM. This is useful for brand protection, for ensuring LLM responses do not reference or compare against competitor products, and for content moderation in customer-facing applications.
Code Reference
Signature
class BanCompetitors(Scanner):
    def __init__(
        self,
        competitors: Sequence[str],
        *,
        threshold: float = 0.5,
        redact: bool = True,
        model: Model | None = None,  # default: guishe/nuner-v1_orgs
        use_onnx: bool = False,
        chunk_size: int = 512,
        chunk_overlap_size: int = 40,
    ) -> None: ...

    def scan(self, prompt: str) -> tuple[str, bool, float]: ...
Import
from llm_guard.input_scanners import BanCompetitors
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| competitors | Sequence[str] | Yes | List of competitor names to detect in prompts. |
| threshold | float | No | Minimum NER confidence score for a match. Defaults to 0.5. |
| redact | bool | No | Whether to redact detected competitor names from the prompt. Defaults to True. |
| model | Model or None | No | The NER model to use. Defaults to guishe/nuner-v1_orgs. |
| use_onnx | bool | No | Whether to use ONNX runtime for inference. Defaults to False. |
| chunk_size | int | No | Token chunk size for processing long prompts. Defaults to 512. |
| chunk_overlap_size | int | No | Overlap between chunks to avoid missing entities at boundaries. Defaults to 40. |
scan() Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| prompt | str | Yes | The input text to scan for competitor mentions. |
Outputs
| Name | Type | Description |
|---|---|---|
| prompt | str | The prompt with competitor names redacted (if redact=True), or the original prompt. |
| is_valid | bool | True if no competitor names were detected; False otherwise. |
| risk_score | float | A confidence score between 0.0 and 1.0 indicating the likelihood of competitor mention. |
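To show how the three return values relate to threshold and redact, here is a rough sketch that applies the same decision logic to a hand-written list of NER entities. It is an illustration under stated assumptions, not the scanner's source: the function name apply_competitor_filter, the case-insensitive name match, the "[REDACTED]" placeholder, and the use of the highest entity score as risk_score are all assumptions. The entity dictionaries mimic the word/score/start/end fields that a typical token-classification pipeline returns.

```python
from typing import Dict, List, Sequence, Tuple, Union

Entity = Dict[str, Union[str, float, int]]


def apply_competitor_filter(
    prompt: str,
    entities: List[Entity],
    competitors: Sequence[str],
    threshold: float = 0.5,
    redact: bool = True,
    placeholder: str = "[REDACTED]",  # assumed placeholder text
) -> Tuple[str, bool, float]:
    """Return (prompt, is_valid, risk_score) for pre-computed NER entities."""
    banned = {name.lower() for name in competitors}  # assumption: case-insensitive

    # Keep only entities that name a listed competitor with enough confidence.
    hits = [
        e for e in entities
        if str(e["word"]).lower() in banned and float(e["score"]) >= threshold
    ]
    if not hits:
        return prompt, True, 0.0

    if redact:
        # Replace from the end of the string so earlier offsets stay valid.
        for e in sorted(hits, key=lambda e: int(e["start"]), reverse=True):
            prompt = prompt[: int(e["start"])] + placeholder + prompt[int(e["end"]):]

    # Assumption: report the strongest match as the risk score.
    return prompt, False, max(float(e["score"]) for e in hits)


prompt = "How does our product compare to Google Cloud?"
start = prompt.index("Google")
entities = [{"word": "Google", "score": 0.93, "start": start, "end": start + len("Google")}]

result = apply_competitor_filter(prompt, entities, ["Google", "Microsoft"])
print(result)
```

An entity whose score falls below threshold is ignored, so the prompt comes back unchanged with is_valid set to True.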
Usage Examples
Basic Usage
from llm_guard.input_scanners import BanCompetitors

scanner = BanCompetitors(
    competitors=["Google", "Microsoft", "Amazon"],
)

prompt = "How does our product compare to Google Cloud?"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

print(sanitized_prompt)  # Competitor name redacted
print(is_valid)          # False (competitor detected)
print(risk_score)        # Confidence score
Without Redaction
from llm_guard.input_scanners import BanCompetitors

# Detect but do not redact competitor names
scanner = BanCompetitors(
    competitors=["Apple", "Samsung"],
    redact=False,
    threshold=0.6,
)

prompt = "What features does Apple offer that we don't?"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

print(is_valid)          # False (competitor detected)
print(sanitized_prompt)  # Original prompt unchanged
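Acting on the Result

The examples above only print the scan result. One possible way to act on it in an application is sketched below. Everything here is illustrative: guard_prompt, CompetitorMentionError, and StubScanner are hypothetical names, and the stub uses plain substring matching instead of NER so that the sketch runs without downloading a model. Any object with the same scan() signature, such as a BanCompetitors instance, could be passed in its place.

```python
from typing import Protocol, Sequence, Tuple


class PromptScanner(Protocol):
    """Anything exposing scan(prompt) -> (prompt, is_valid, risk_score)."""

    def scan(self, prompt: str) -> Tuple[str, bool, float]: ...


class CompetitorMentionError(ValueError):
    """Hypothetical error raised when a prompt is rejected outright."""


class StubScanner:
    """Stand-in for demonstration: substring matching, not NER."""

    def __init__(self, competitors: Sequence[str]) -> None:
        self._competitors = list(competitors)

    def scan(self, prompt: str) -> Tuple[str, bool, float]:
        sanitized = prompt
        for name in self._competitors:
            sanitized = sanitized.replace(name, "[REDACTED]")
        detected = sanitized != prompt
        return sanitized, not detected, 1.0 if detected else 0.0


def guard_prompt(scanner: PromptScanner, prompt: str, block: bool = False) -> str:
    """Return the sanitized prompt, or raise if block=True and a match was found."""
    sanitized, is_valid, risk_score = scanner.scan(prompt)
    if not is_valid and block:
        raise CompetitorMentionError(
            f"prompt rejected (risk_score={risk_score:.2f})"
        )
    return sanitized


stub = StubScanner(["Apple", "Samsung"])
safe_prompt = guard_prompt(stub, "What features does Apple offer that we don't?")
print(safe_prompt)
```

Passing block=True turns a detection into an exception, which suits applications that prefer to reject the request rather than forward a redacted prompt.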
Related Pages