Overview
Language is an output scanner that verifies LLM responses are written in one of the specified allowed languages by delegating to the input-side InputLanguage scanner.
Description
The Language output scanner is a thin wrapper around the corresponding input scanner, InputLanguage. It runs a language detection model on the LLM output and checks whether the detected language appears in valid_languages. The threshold parameter sets the minimum confidence required for a language classification. The match_type parameter controls whether the output is evaluated as a whole (FULL) or split into sentences that are each checked separately (SENTENCE).
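The difference between FULL and SENTENCE matching can be illustrated with a small self-contained sketch. It does not use llm_guard or a real detection model: `toy_detect` is a hypothetical keyword-based stand-in for the classifier, the regex sentence splitter is a simplification, and `check_language` is an illustrative name rather than the library's implementation. How the real scanner applies the threshold and combines per-sentence results is assumed here, not confirmed by the source.

```python
import re

# Hypothetical stand-in for a language detection model. It returns
# {language_code: confidence}; a real scanner would call a trained classifier.
FRENCH_WORDS = {"bonjour", "comment", "allez", "vous", "merci", "je", "suis"}
ENGLISH_WORDS = {"hello", "how", "are", "you", "thanks", "i", "am", "the"}


def toy_detect(text: str) -> dict[str, float]:
    words = re.findall(r"[a-zà-ÿ]+", text.lower())
    if not words:
        return {}
    fr = sum(w in FRENCH_WORDS for w in words) / len(words)
    en = sum(w in ENGLISH_WORDS for w in words) / len(words)
    return {"fr": fr, "en": en}


def check_language(
    text: str,
    valid_languages: list[str],
    threshold: float = 0.7,
    match_type: str = "full",
) -> bool:
    """Return True if every evaluated chunk is confidently in an allowed language."""
    if match_type == "sentence":
        # Assumed splitting rule: break after ., ! or ? followed by whitespace.
        chunks = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    else:
        chunks = [text]
    for chunk in chunks:
        scores = toy_detect(chunk)
        # A chunk passes if any allowed language reaches the threshold.
        if not any(scores.get(lang, 0.0) >= threshold for lang in valid_languages):
            return False
    return True
```

With this toy detector, a mixed French and English reply can pass a lenient whole-text check while failing the per-sentence check, which is the kind of case SENTENCE matching is meant to catch.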
Usage
Use this scanner when your application must respond only in specific languages. This matters for localized applications, compliance with language regulations, and a consistent user experience. For example, a French-language customer service bot should not produce responses in other languages.
Code Reference
Source Location
Signature
class Language(Scanner):
    def __init__(
        self,
        valid_languages: list[str],
        *,
        model: Model | None = None,
        threshold: float = 0.7,
        match_type: MatchType | str = MatchType.FULL,
        use_onnx: bool = False,
    ) -> None: ...

    def scan(self, prompt: str, output: str) -> tuple[str, bool, float]: ...
Import
from llm_guard.output_scanners import Language
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| prompt | str | Yes | The input prompt |
| output | str | Yes | The LLM output to check for language compliance |
Constructor Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| valid_languages | list[str] | Yes | N/A | List of allowed language codes (e.g., ["en", "fr", "de"]) |
| model | Model \| None | No | None | Custom language detection model |
| threshold | float | No | 0.7 | Minimum confidence for language classification |
| match_type | MatchType \| str | No | MatchType.FULL | Matching strategy: FULL (entire text) or SENTENCE (per-sentence) |
| use_onnx | bool | No | False | Whether to use ONNX runtime for inference |
Outputs
| Name | Type | Description |
|---|---|---|
| sanitized_output | str | The output (potentially modified) |
| is_valid | bool | Whether the output is in one of the valid languages |
| risk_score | float | Risk score (-1.0 to 1.0) |
Usage Examples
Basic Usage
from llm_guard.output_scanners import Language

scanner = Language(
    valid_languages=["en", "fr"],
    threshold=0.7,
)

prompt = "Translate this to French"
output = "Bonjour, comment allez-vous?"

# scan() returns the output text, a validity flag, and a risk score
sanitized_output, is_valid, risk_score = scanner.scan(prompt, output)

if is_valid:
    print("Output is in an allowed language")
else:
    print(f"Output language not allowed (risk: {risk_score})")