Implementation:Protectai Llm guard Input Scanner Factory

From Leeroopedia
Knowledge Sources
Domains: Factory_Pattern, Configuration
Last Updated: 2026-02-14 12:00 GMT

Overview

The get_scanner_by_name factory function instantiates an input scanner dynamically from its name and an optional configuration dictionary.

Description

get_scanner_by_name is a factory function in the llm_guard.input_scanners.util module that instantiates any supported input scanner from its class name. It accepts a scanner_name string and an optional scanner_config dictionary, which is unpacked as keyword arguments to the scanner's constructor. This enables configuration-driven instantiation, which is useful when building scanning pipelines from configuration files, environment variables, or API parameters. The function checks the name against the known set of scanners and raises a ValueError if the name is unknown.
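
The pattern described above (look up a class by name, then unpack a dictionary into its constructor) can be sketched in a few lines. This is a minimal illustration, not the llm_guard source: the Scanner base class, the two toy scanners, the _REGISTRY dictionary, and make_scanner are all hypothetical stand-ins that only mirror the documented behaviour.

```python
from typing import Any, Optional


class Scanner:
    """Hypothetical stand-in interface: scan() returns (text, is_valid, risk_score)."""

    def scan(self, prompt: str) -> tuple[str, bool, float]:
        raise NotImplementedError


class BanSubstrings(Scanner):
    """Toy scanner (not the llm_guard class): flags prompts containing a listed substring."""

    def __init__(self, substrings: list[str]) -> None:
        self._substrings = [s.lower() for s in substrings]

    def scan(self, prompt: str) -> tuple[str, bool, float]:
        hit = any(s in prompt.lower() for s in self._substrings)
        return prompt, not hit, 1.0 if hit else 0.0


class TokenLimit(Scanner):
    """Toy scanner: counts whitespace-separated words instead of real tokens."""

    def __init__(self, limit: int = 4096) -> None:
        self._limit = limit

    def scan(self, prompt: str) -> tuple[str, bool, float]:
        too_long = len(prompt.split()) > self._limit
        return prompt, not too_long, 1.0 if too_long else 0.0


# Registry mapping each public name to the class to construct (assumed layout).
_REGISTRY: dict[str, type[Scanner]] = {
    "BanSubstrings": BanSubstrings,
    "TokenLimit": TokenLimit,
}


def make_scanner(scanner_name: str, scanner_config: Optional[dict[str, Any]] = None) -> Scanner:
    """Build a scanner by name, unpacking the config as keyword arguments."""
    if scanner_name not in _REGISTRY:
        raise ValueError(f"Unknown scanner name: {scanner_name}")
    # A missing config means "use the constructor defaults".
    return _REGISTRY[scanner_name](**(scanner_config or {}))


scanner = make_scanner("BanSubstrings", {"substrings": ["ignore previous instructions"]})
result = scanner.scan("Please ignore previous instructions.")
print(result)
```

Because the configuration is unpacked with `**`, its keys must match the constructor's parameter names exactly in this sketch.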

Usage

Use the get_scanner_by_name factory function when scanners must be chosen at runtime from configuration rather than hard-coded as class references. Typical cases are configurable scanning pipelines, scanner settings loaded from YAML or JSON files, and API endpoints that accept scanner names as parameters.
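
As a sketch of the file-based case, the snippet below parses a JSON document into scanner instances. The JSON layout (a list of objects with "name" and an optional "config" key), the build_scanners helper, and fake_factory are assumptions made for illustration; the factory is passed in as a parameter so the example runs without loading any models.

```python
import json
from typing import Any, Callable

# Hypothetical config layout: a list, so that scanner order is explicit.
CONFIG_TEXT = """
[
  {"name": "InvisibleText"},
  {"name": "Toxicity", "config": {"threshold": 0.7}}
]
"""


def build_scanners(config_text: str, factory: Callable[[str, dict[str, Any]], Any]) -> list[Any]:
    """Parse a JSON list of {"name", "config"} entries and build one scanner per entry."""
    entries = json.loads(config_text)
    if not isinstance(entries, list):
        raise ValueError("scanner configuration must be a JSON list")
    scanners = []
    for entry in entries:
        # "config" is optional; an absent key means constructor defaults.
        scanners.append(factory(entry["name"], entry.get("config", {})))
    return scanners


def fake_factory(name: str, config: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Stand-in factory for illustration: records the call instead of building a scanner."""
    return (name, config)


built = build_scanners(CONFIG_TEXT, fake_factory)
print(built)
```

With llm_guard installed, get_scanner_by_name could presumably be passed as the factory argument, since its documented signature takes the name and the configuration dictionary in that order.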

Code Reference

Source Location

Signature

def get_scanner_by_name(
    scanner_name: str,
    scanner_config: dict | None = None,
) -> Scanner: ...

Import

from llm_guard.input_scanners.util import get_scanner_by_name

I/O Contract

Inputs

Name | Type | Required | Description
scanner_name | str | Yes | The class name of the scanner to instantiate (e.g., "BanCode", "Secrets", "Toxicity").
scanner_config | dict or None | No | Optional dictionary of keyword arguments to pass to the scanner constructor. Defaults to None.

Outputs

Name | Type | Description
scanner | Scanner | An instantiated scanner object of the specified type, configured with the provided parameters.

Exceptions

Exception | Condition
ValueError | Raised when scanner_name does not match any known input scanner.

Supported Scanners

The factory supports the following input scanners:

Scanner Name | Description
Anonymize | Detects and anonymizes PII (personally identifiable information).
BanCode | Detects code snippets using CodeNLBERT model.
BanCompetitors | Detects and redacts competitor names using NER.
BanSubstrings | Blocks prompts containing specified substrings.
BanTopics | Blocks prompts about banned topics via zero-shot classification.
Code | Detects specific programming languages.
Gibberish | Detects nonsensical or gibberish text.
InvisibleText | Detects invisible Unicode characters.
Language | Validates prompts are in allowed languages.
PromptInjection | Detects prompt injection attempts.
Regex | Pattern matching with user-defined regular expressions.
Secrets | Detects API keys, tokens, and credentials.
Sentiment | Detects negative sentiment using VADER.
TokenLimit | Enforces token count limits on prompts.
Toxicity | Detects toxic or harmful language.

Usage Examples

Basic Usage

from llm_guard.input_scanners.util import get_scanner_by_name

# Instantiate a BanTopics scanner with configuration
scanner = get_scanner_by_name(
    "BanTopics",
    scanner_config={
        "topics": ["violence", "politics"],
        "threshold": 0.6,
    },
)
prompt = "Tell me about the latest election"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)
print(is_valid)

Configuration-Driven Pipeline

from llm_guard.input_scanners.util import get_scanner_by_name

# Define scanners in a configuration dictionary
scanner_configs = {
    "InvisibleText": {},
    "BanSubstrings": {
        "substrings": ["ignore previous instructions"],
        "match_type": "str",
    },
    "Secrets": {
        "redact_mode": "all",
    },
    "Toxicity": {
        "threshold": 0.7,
    },
}

# Build the scanning pipeline dynamically
scanners = []
for name, config in scanner_configs.items():
    scanner = get_scanner_by_name(name, scanner_config=config)
    scanners.append(scanner)

# Run all scanners on a prompt
prompt = "Hello, can you help me with a question?"
for scanner in scanners:
    prompt, is_valid, risk_score = scanner.scan(prompt)
    if not is_valid:
        print(f"Scanner {scanner.__class__.__name__} flagged the prompt")
        break
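
The loop above can be wrapped in a reusable helper that also records each scanner's verdict. The following is a sketch under stated assumptions: run_pipeline, StripTabs, and RejectShouting are hypothetical names, and the toy scanners only imitate the (sanitized_prompt, is_valid, risk_score) return shape used in the examples on this page.

```python
from typing import Any, Iterable


def run_pipeline(scanners: Iterable[Any], prompt: str, fail_fast: bool = True):
    """Run scanners in order, feeding each sanitized prompt into the next scanner."""
    # Keyed by class name, so two scanners of the same class would overwrite each other.
    results: dict[str, tuple[bool, float]] = {}
    for scanner in scanners:
        prompt, is_valid, risk_score = scanner.scan(prompt)
        results[type(scanner).__name__] = (is_valid, risk_score)
        if fail_fast and not is_valid:
            break  # stop at the first scanner that rejects the prompt
    return prompt, results


class StripTabs:
    """Toy scanner for illustration: replaces tabs with spaces and always passes."""

    def scan(self, prompt: str) -> tuple[str, bool, float]:
        return prompt.replace("\t", " "), True, 0.0


class RejectShouting:
    """Toy scanner for illustration: flags prompts written entirely in upper case."""

    def scan(self, prompt: str) -> tuple[str, bool, float]:
        bad = prompt.isupper()
        return prompt, not bad, 1.0 if bad else 0.0


final_prompt, results = run_pipeline([StripTabs(), RejectShouting()], "HELLO\tWORLD")
print(final_prompt, results)
```

Returning the per-scanner results alongside the final prompt lets a caller log which scanner rejected the input instead of only printing it.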

Error Handling

from llm_guard.input_scanners.util import get_scanner_by_name

try:
    scanner = get_scanner_by_name("NonExistentScanner")
except ValueError as e:
    print(f"Error: {e}")  # Unknown scanner name
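
When a whole configuration is built at once, it can be more convenient to collect every failure rather than stop at the first one. The sketch below assumes a hypothetical build_or_report helper and a toy factory. It catches ValueError for unknown names, as documented above, and also TypeError, which is what Python raises when unpacked keyword arguments do not match a constructor's parameters. Whether a given real scanner raises TypeError for a bad key depends on its constructor and is an assumption here.

```python
from typing import Any, Callable


def build_or_report(
    configs: dict[str, dict[str, Any]],
    factory: Callable[[str, dict[str, Any]], Any],
) -> tuple[list[Any], dict[str, str]]:
    """Build every scanner that can be built; map each failing name to its error text."""
    scanners: list[Any] = []
    errors: dict[str, str] = {}
    for name, config in configs.items():
        try:
            scanners.append(factory(name, config))
        except (ValueError, TypeError) as exc:
            errors[name] = f"{type(exc).__name__}: {exc}"
    return scanners, errors


class Echo:
    """Toy class for illustration: accepts a single optional keyword argument."""

    def __init__(self, prefix: str = "") -> None:
        self.prefix = prefix


class Upper:
    """Toy class for illustration: accepts no constructor arguments."""


_TOY_REGISTRY = {"Echo": Echo, "Upper": Upper}


def toy_factory(name: str, config: dict[str, Any]) -> Any:
    """Stand-in factory mirroring the documented contract (ValueError on unknown names)."""
    if name not in _TOY_REGISTRY:
        raise ValueError(f"Unknown scanner name: {name}")
    return _TOY_REGISTRY[name](**config)


scanners, errors = build_or_report(
    {"Echo": {"prefix": ">"}, "Upper": {"bogus": 1}, "Missing": {}},
    toy_factory,
)
print(len(scanners), sorted(errors))
```

Note that this approach instantiates each scanner in order to validate it, which may be costly for scanners that load models at construction time.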
