Principle:Protectai Llm guard Competitor Name Detection
| Knowledge Sources | |
|---|---|
| Domains | Content_Filtering, NER |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
Detecting organization names in text using Named Entity Recognition and matching against a user-defined competitor list.
Description
Competitor Name Detection is a content filtering principle that identifies mentions of specific organizations within text by combining Named Entity Recognition (NER) with list-based matching. In many business contexts, it is essential to prevent a language model from mentioning, recommending, or discussing competitor companies, whether in response to user queries or in generated content.
The principle operates in two stages. First, an NER model (such as NuNER) scans the text and extracts every entity classified as an organization (the ORG type). This extraction step relies on transformer-based sequence labeling to locate entity boundaries. Second, the extracted organization names are matched against a user-provided competitor list to determine whether any competitor is mentioned.
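For long texts, the input is processed in overlapping chunks so that entity mentions spanning chunk boundaries are not missed. Provided the overlap is at least as long as the longest entity mention, a mention cut off at the edge of one chunk falls entirely inside a neighboring chunk, which prevents boundary-related false negatives. Mentions longer than the overlap can still be split, so the overlap size should be chosen with the expected entity length in mind.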
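A minimal sketch of overlapping chunking, assuming character-based windows. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are illustrative and not taken from any particular library; a real scanner may chunk by tokens instead.

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[tuple[int, str]]:
    """Split text into overlapping character windows.

    Returns (start_offset, chunk) pairs so that spans found inside a chunk
    can later be mapped back to positions in the full text.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append((start, text[start:start + chunk_size]))
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


# Each window shares its last `overlap` characters with the next one.
windows = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

Keeping the start offset alongside each chunk is a design choice that makes later span deduplication straightforward, since every chunk-local span can be translated into document coordinates.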
Usage
Use this principle when deploying language models in commercial contexts where brand safety is a concern. Typical applications include customer support chatbots that should not recommend competitor products, marketing content generators that must avoid mentioning rival brands, and enterprise assistants that should steer conversations away from competitor solutions. The user-defined competitor list allows flexible configuration for different business contexts.
Theoretical Basis
The detection pipeline operates as follows:
Stage 1: Named Entity Recognition
- Divide input text into overlapping chunks of configurable size
- For each chunk, run the NER model to extract entity spans
- Filter entities to retain only those with the ORG label
- Merge duplicate entities from overlapping regions using span deduplication
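The Stage 1 steps above can be sketched as follows. The `Entity` tuple, the `extract_orgs` function, and the `toy_ner` stand-in are hypothetical names introduced for illustration; a real deployment would call a transformer NER model in place of `toy_ner`. Deduplication here is by exact document-level span, which is an assumption and does not handle partially truncated mentions.

```python
import re
from typing import Callable, NamedTuple


class Entity(NamedTuple):
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # entity type, e.g. "ORG"
    text: str


def toy_ner(chunk: str) -> list[Entity]:
    """Hypothetical stand-in for an NER model: tags two hard-coded names."""
    labels = {"Acme Corp": "ORG", "Paris": "LOC"}
    return [
        Entity(m.start(), m.end(), labels[m.group()], m.group())
        for m in re.finditer(r"Acme Corp|Paris", chunk)
    ]


def extract_orgs(
    chunks: list[tuple[int, str]],
    ner: Callable[[str], list[Entity]],
) -> list[Entity]:
    """Run NER per chunk, keep ORG entities, and drop duplicate spans."""
    seen: set[tuple[int, int]] = set()
    orgs: list[Entity] = []
    for offset, chunk in chunks:
        for ent in ner(chunk):
            if ent.label != "ORG":
                continue  # keep organizations only
            # Translate chunk-local offsets into document offsets.
            span = (ent.start + offset, ent.end + offset)
            if span in seen:
                continue  # same mention already found in an overlapping chunk
            seen.add(span)
            orgs.append(Entity(span[0], span[1], ent.label, ent.text))
    return sorted(orgs)


# Two overlapping chunks of "I like Acme Corp in Paris." that both contain "Acme Corp".
chunks = [(0, "I like Acme Corp"), (7, "Acme Corp in Paris.")]
orgs = extract_orgs(chunks, toy_ner)
```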
Stage 2: Competitor Matching
- Normalize extracted entity names (lowercasing, whitespace normalization)
- Normalize the user-defined competitor list in the same way
- Compare each normalized ORG entity against the normalized competitor list using string comparison, so that minor variations in case and spacing still match
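The matching stage can be illustrated with a short sketch. The helper names `normalize` and `match_competitors` are hypothetical, and exact equality after normalization is an assumption; an actual scanner may use a different comparison.

```python
import re


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of whitespace to a single space."""
    return re.sub(r"\s+", " ", name).strip().lower()


def match_competitors(org_names: list[str], competitors: list[str]) -> list[str]:
    """Return the competitors (as written in the list) found among org_names."""
    # Map normalized form -> original spelling from the competitor list.
    lookup = {normalize(c): c for c in competitors}
    matched: list[str] = []
    for name in org_names:
        canonical = lookup.get(normalize(name))
        if canonical is not None and canonical not in matched:
            matched.append(canonical)  # report each competitor once
    return matched


found = match_competitors(["  ACME   corp", "Globex", "acme corp"], ["Acme Corp", "Initech"])
```

Exact matching after normalization will not catch abbreviations or alternative names (for example a ticker symbol), so such variants would need to be listed explicitly in the competitor list under this approach.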
Decision Logic:
- If any extracted entity matches a competitor name, the text is flagged
- Matched competitor names can optionally be redacted in the output
- The scanner returns both the validation result and the list of detected competitors
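The decision logic can be sketched as below. `ScanResult`, `decide`, and the `[REDACTED]` placeholder are hypothetical and chosen for illustration; they do not describe the return type of any specific scanner. The sketch assumes the matched spans do not overlap.

```python
from dataclasses import dataclass, field


@dataclass
class ScanResult:
    is_valid: bool                      # False when any competitor was found
    text: str                           # original or redacted text
    detected: list[str] = field(default_factory=list)


def decide(
    text: str,
    matches: list[tuple[int, int, str]],
    redact: bool = False,
    placeholder: str = "[REDACTED]",
) -> ScanResult:
    """Flag the text if any competitor span matched; optionally redact spans.

    `matches` holds (start, end, competitor_name) tuples in document offsets.
    """
    detected: list[str] = []
    for _, _, name in matches:
        if name not in detected:
            detected.append(name)

    output = text
    if redact:
        # Replace from the end of the text so earlier offsets stay valid.
        for start, end, _ in sorted(matches, reverse=True):
            output = output[:start] + placeholder + output[end:]

    return ScanResult(is_valid=not matches, text=output, detected=detected)


result = decide(
    "Try Acme Corp or Globex.",
    [(4, 13, "Acme Corp"), (17, 23, "Globex")],
    redact=True,
)
```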