Implementation:Protectai Llm guard Output Scanner Factory
| Knowledge Sources | |
|---|---|
| Domains | Factory_Pattern, Configuration |
| Last Updated | 2026-02-14 12:00 GMT |
Overview
get_scanner_by_name is a factory function that instantiates an output scanner from its string name and an optional configuration dictionary.
Description
The get_scanner_by_name function is a factory utility for instantiating output scanners dynamically at runtime. It accepts a scanner_name string (matching the scanner's class name) and an optional scanner_config dictionary of constructor parameters. It supports the 23 output scanners listed under Supported Scanners. Internally, it resolves the name to the corresponding class and passes the entries of the configuration dictionary to the constructor as keyword arguments. This enables configuration-driven setup: scanner pipelines can be defined in external configuration files (YAML, JSON, etc.) and instantiated without hardcoding an import for each scanner class. If an unknown scanner name is provided, the function raises a ValueError.
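The name-to-class dispatch described above can be illustrated with a small registry. This is a minimal sketch of the pattern, not LLM Guard's actual code: the stand-in classes, `_REGISTRY`, and `make_scanner` are all hypothetical names introduced for illustration, and the library's own implementation may be structured differently.

```python
from typing import Any, Dict, Optional, Type


class Toxicity:
    """Hypothetical stand-in for a scanner class; not the LLM Guard class."""

    def __init__(self, threshold: float = 0.5) -> None:
        self.threshold = threshold


class ReadingTime:
    """Hypothetical stand-in for a scanner class; not the LLM Guard class."""

    def __init__(self, max_time: float) -> None:
        self.max_time = max_time


# Assumed registry: scanner name -> class to instantiate.
_REGISTRY: Dict[str, Type[Any]] = {
    "Toxicity": Toxicity,
    "ReadingTime": ReadingTime,
}


def make_scanner(name: str, config: Optional[Dict[str, Any]] = None) -> Any:
    """Instantiate the class registered under `name`, using `config` as kwargs."""
    if name not in _REGISTRY:
        raise ValueError(f"Unknown scanner name: {name}")
    # A missing config is treated as "no constructor arguments".
    return _REGISTRY[name](**(config or {}))
```

Because the configuration is unpacked as keyword arguments, a misspelled key in the dictionary surfaces as a TypeError from the constructor rather than being silently ignored.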
Usage
Use this factory function when you need to instantiate output scanners dynamically based on runtime configuration. This is common in applications that let users or administrators configure scanner pipelines through configuration files, environment variables, or admin interfaces. It is also useful for building generic scanner orchestration layers that do not need to import or reference specific scanner classes in their own code.
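As a rough illustration of the configuration-file use case, the sketch below parses a JSON pipeline definition and hands each entry to a factory callable. The "name"/"config" layout, `PIPELINE_JSON`, and `build_pipeline` are assumptions made for this example; the factory is passed in as a parameter so the sketch runs without LLM Guard installed.

```python
import json
from typing import Any, Callable, Dict, List, Optional

# Hypothetical config document; the "name"/"config" keys are an assumed layout.
PIPELINE_JSON = """
[
  {"name": "Toxicity", "config": {"threshold": 0.7}},
  {"name": "ReadingTime", "config": {"max_time": 2.0}},
  {"name": "Gibberish"}
]
"""

# Any callable with the same shape as get_scanner_by_name(name, config).
Factory = Callable[[str, Optional[Dict[str, Any]]], Any]


def build_pipeline(config_text: str, factory: Factory) -> List[Any]:
    """Parse a JSON array of scanner entries and instantiate each via `factory`."""
    entries = json.loads(config_text)
    if not isinstance(entries, list):
        raise ValueError("pipeline config must be a JSON array")
    scanners = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict) or "name" not in entry:
            raise ValueError(f"entry {index} is missing a 'name' key")
        # Entries without a "config" key pass None, i.e. constructor defaults.
        scanners.append(factory(entry["name"], entry.get("config")))
    return scanners
```

In an application, `get_scanner_by_name` would presumably be passed as the `factory` argument; a stub factory can be substituted in unit tests to avoid loading models.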
Code Reference
Source Location
- Repository: Protectai_Llm_guard
- File: llm_guard/output_scanners/util.py
- Lines: 1-107
Signature
def get_scanner_by_name(
    scanner_name: str,
    scanner_config: Optional[Dict] = None,
) -> Scanner: ...
Import
from llm_guard.output_scanners.util import get_scanner_by_name
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| scanner_name | str | Yes | The class name of the output scanner to instantiate |
| scanner_config | Optional[Dict] | No | Dictionary of constructor parameters for the scanner |
Outputs
| Name | Type | Description |
|---|---|---|
| scanner | Scanner | An instantiated output scanner object |
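Code that consumes the returned object only needs its scan method, which the usage examples on this page call as scan(prompt, output) and unpack into three values. The sketch below expresses that contract as a structural type. `OutputScannerLike` and `EchoScanner` are hypothetical names for illustration; this is not the library's own Scanner definition.

```python
from typing import Protocol, Tuple, runtime_checkable


@runtime_checkable
class OutputScannerLike(Protocol):
    """Hypothetical structural type for objects returned by the factory."""

    def scan(self, prompt: str, output: str) -> Tuple[str, bool, float]:
        """Return (sanitized_output, is_valid, risk_score)."""
        ...


class EchoScanner:
    """Hypothetical stub that accepts every output unchanged."""

    def scan(self, prompt: str, output: str) -> Tuple[str, bool, float]:
        return output, True, 0.0
```

A stub like `EchoScanner` satisfies the protocol without inheriting from it, which makes it convenient for testing orchestration code without loading any models.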
Supported Scanners
The factory function supports the following 23 output scanners:
| Scanner Name | Description |
|---|---|
| BanCode | Detects code snippets in outputs |
| BanCompetitors | Detects competitor name mentions |
| BanSubstrings | Detects banned substrings |
| BanTopics | Detects banned topics via zero-shot classification |
| Bias | Detects biased language |
| Code | Detects specific programming languages |
| Deanonymize | Reverses anonymization from input scanners |
| EmotionDetection | Detects emotions in outputs |
| FactualConsistency | Checks factual consistency via NLI |
| Gibberish | Detects nonsensical text |
| JSON | Validates and repairs JSON structures |
| Language | Verifies output is in allowed languages |
| LanguageSame | Ensures output language matches prompt language |
| MaliciousURLs | Classifies URLs as benign or malicious |
| NoRefusal | Detects refusal patterns in outputs |
| NoRefusalLight | Lightweight refusal detection |
| ReadingTime | Enforces maximum reading time |
| Regex | Pattern matching and redaction |
| Relevance | Checks output relevance to prompt |
| Sensitive | Detects sensitive information in outputs |
| Sentiment | Evaluates sentiment polarity |
| Toxicity | Detects toxic language |
| URLReachability | Validates URL reachability |
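Since an unknown name only fails when the factory is called, it can help to check a whole configuration up front. The following sketch copies the names from the table above into a set and reports unsupported names together with close matches. `SUPPORTED_OUTPUT_SCANNERS` and `find_unknown_names` are hypothetical helpers, and the sketch assumes names must match exactly, including case; the set should be checked against the installed LLM Guard version.

```python
from difflib import get_close_matches
from typing import Dict, Iterable, List

# Names copied from the table on this page (assumed to match the installed version).
SUPPORTED_OUTPUT_SCANNERS = frozenset({
    "BanCode", "BanCompetitors", "BanSubstrings", "BanTopics", "Bias",
    "Code", "Deanonymize", "EmotionDetection", "FactualConsistency",
    "Gibberish", "JSON", "Language", "LanguageSame", "MaliciousURLs",
    "NoRefusal", "NoRefusalLight", "ReadingTime", "Regex", "Relevance",
    "Sensitive", "Sentiment", "Toxicity", "URLReachability",
})


def find_unknown_names(names: Iterable[str]) -> Dict[str, List[str]]:
    """Map each unsupported name to a list of similar supported names."""
    candidates = sorted(SUPPORTED_OUTPUT_SCANNERS)
    unknown: Dict[str, List[str]] = {}
    for name in names:
        if name not in SUPPORTED_OUTPUT_SCANNERS:
            # Suggest up to three near-misses to help spot typos.
            unknown[name] = get_close_matches(name, candidates, n=3)
    return unknown
```

Running this over every name in a configuration file before instantiating anything lets a deployment report all typos at once instead of stopping at the first ValueError.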
Usage Examples
Basic Usage
from llm_guard.output_scanners.util import get_scanner_by_name

# Instantiate a scanner by name with configuration
scanner = get_scanner_by_name(
    "Toxicity",
    {"threshold": 0.8, "match_type": "full"},
)

prompt = "Tell me a story"
output = "Once upon a time there was a kind and gentle soul."
sanitized_output, is_valid, risk_score = scanner.scan(prompt, output)
print(f"Valid: {is_valid}, Risk: {risk_score}")
Configuration-Driven Pipeline
from llm_guard.output_scanners.util import get_scanner_by_name

# Define scanner pipeline in configuration
scanner_configs = [
    {"name": "Toxicity", "config": {"threshold": 0.7}},
    {"name": "BanSubstrings", "config": {"substrings": ["hack", "exploit"]}},
    {"name": "Sentiment", "config": {"threshold": -0.1}},
    {"name": "ReadingTime", "config": {"max_time": 2.0}},
]

# Instantiate all scanners
scanners = [
    get_scanner_by_name(sc["name"], sc.get("config"))
    for sc in scanner_configs
]

# Run the pipeline, feeding each scanner the output of the previous one
prompt = "Explain cybersecurity basics"
output = "Cybersecurity involves protecting systems and networks from digital attacks."
for scanner in scanners:
    output, is_valid, risk_score = scanner.scan(prompt, output)
    if not is_valid:
        print(f"Scanner {type(scanner).__name__} flagged the output (risk: {risk_score})")
        break
else:
    # for/else: this branch runs only if the loop finished without a break
    print("Output passed all scanners")
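The loop above stops at the first failing scanner. When a full report is more useful, a variant can run every scanner and collect the results. The sketch below is one possible shape for that: `ScanResult` and `run_all` are hypothetical names, and the only assumption about a scanner is that it exposes scan(prompt, output) returning the three values used in the examples above.

```python
from typing import Any, Iterable, List, NamedTuple, Tuple


class ScanResult(NamedTuple):
    """Outcome of one scanner in the pipeline."""

    scanner: str
    is_valid: bool
    risk_score: float


def run_all(
    scanners: Iterable[Any], prompt: str, output: str
) -> Tuple[str, List[ScanResult]]:
    """Run every scanner in order and collect each verdict without stopping early."""
    results: List[ScanResult] = []
    for scanner in scanners:
        # Each scanner receives the (possibly sanitized) output of the previous one.
        output, is_valid, risk_score = scanner.scan(prompt, output)
        results.append(ScanResult(type(scanner).__name__, is_valid, risk_score))
    return output, results
```

The overall verdict can then be derived with `all(r.is_valid for r in results)`. The trade-off is cost: every scanner runs even after one has already failed, which may matter for model-backed scanners.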