Implementation:Protectai Llm guard Input EmotionDetection

From Leeroopedia
Knowledge Sources
Domains: Emotion_Detection, NLP
Last Updated: 2026-02-14 12:00 GMT

Overview

The EmotionDetection scanner detects emotions in text using a RoBERTa model trained on GoEmotions and blocks prompts that express any emotion on a configurable block list (11 negative emotions by default).

Description

EmotionDetection is an input scanner that analyzes the emotional content of prompts using the SamLowe/roberta-base-go_emotions model, which is trained on Google's GoEmotions dataset. The model classifies text across 28 emotion labels: admiration, amusement, anger, annoyance, approval, caring, confusion, curiosity, desire, disappointment, disapproval, disgust, embarrassment, excitement, fear, gratitude, grief, joy, love, nervousness, optimism, pride, realization, relief, remorse, sadness, surprise, and neutral. By default, the scanner blocks 11 negative emotions: anger, annoyance, disappointment, disapproval, disgust, embarrassment, fear, grief, nervousness, remorse, and sadness. This list is configurable through the blocked_emotions parameter. The match_type parameter selects between scoring the full text (MatchType.FULL, the default) and sentence-level analysis. With return_full_output enabled, the scan_with_full_output method also returns detailed emotion scores.
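The blocking rule described above can be sketched in plain Python without loading the model. This is a minimal illustration under stated assumptions, not the scanner's actual implementation: the helper name `check_blocked` and the hand-written score dictionary are hypothetical, an emotion is assumed to count as detected when its score is strictly above the threshold, and a risk score of 0.0 for valid prompts is an assumption as well.

```python
from typing import Dict, Iterable, Tuple

# The 11 default blocked emotions listed in the description above.
DEFAULT_BLOCKED = (
    "anger", "annoyance", "disappointment", "disapproval", "disgust",
    "embarrassment", "fear", "grief", "nervousness", "remorse", "sadness",
)


def check_blocked(
    scores: Dict[str, float],
    blocked: Iterable[str] = DEFAULT_BLOCKED,
    threshold: float = 0.5,
) -> Tuple[bool, float]:
    """Hypothetical helper: return (is_valid, risk_score) for a score dict."""
    # Assumption: a score strictly above the threshold counts as "detected".
    hits = [scores.get(label, 0.0) for label in blocked]
    hits = [score for score in hits if score > threshold]
    if hits:
        # Invalid: report the highest score among the detected blocked emotions.
        return False, max(hits)
    # Assumption: 0.0 is used here as the risk score of a valid prompt.
    return True, 0.0


# Hand-written scores for illustration only; real values come from the model.
example_scores = {"anger": 0.82, "annoyance": 0.41, "neutral": 0.05}
result_default = check_blocked(example_scores)
result_custom = check_blocked(example_scores, blocked=["fear"], threshold=0.6)
```

With the default block list the example is rejected because anger exceeds 0.5; with only fear blocked, the same scores pass.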

Usage

Use the EmotionDetection scanner when you need to filter prompts based on emotional tone. This is useful for customer-facing chatbots where you want to detect and handle negative emotions proactively, for mental health applications requiring emotional awareness, or for content moderation to prevent hostile or distressed interactions.

Code Reference

Source Location

Signature

class EmotionDetection(Scanner):
    def __init__(
        self,
        *,
        model: Model | None = None,              # default: SamLowe/roberta-base-go_emotions
        threshold: float = 0.5,
        blocked_emotions: List[str] | None = None,  # default: 11 negative emotions
        match_type: MatchType | str = MatchType.FULL,
        use_onnx: bool = False,
        return_full_output: bool = False,
    ) -> None: ...

    def scan(self, prompt: str) -> tuple[str, bool, float]: ...

    def get_emotion_analysis(self, prompt: str) -> Dict[str, float]: ...

    def scan_with_full_output(self, prompt: str) -> tuple[str, bool, float, Dict[str, float]]: ...

Import

from llm_guard.input_scanners.emotion_detection import EmotionDetection

I/O Contract

Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | Model or None | No | The emotion classification model. Defaults to SamLowe/roberta-base-go_emotions. |
| threshold | float | No | Minimum confidence score for an emotion to be considered detected. Defaults to 0.5. |
| blocked_emotions | List[str] or None | No | List of emotion labels to block. Defaults to 11 negative emotions: anger, annoyance, disappointment, disapproval, disgust, embarrassment, fear, grief, nervousness, remorse, sadness. |
| match_type | MatchType or str | No | Whether to analyze the full text or individual sentences. Defaults to MatchType.FULL. |
| use_onnx | bool | No | Whether to use ONNX runtime for inference. Defaults to False. |
| return_full_output | bool | No | Whether scan_with_full_output returns detailed emotion scores. Defaults to False. |
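To make the difference between full-text and sentence-level analysis concrete, the sketch below scores the same text both ways. Everything in it is an assumption for illustration: `split_sentences`, `toy_classifier`, and `max_scores` are hypothetical names, the keyword-counting classifier stands in for the real model, and taking the per-label maximum across sentences is one plausible aggregation rather than the scanner's confirmed behavior.

```python
import re
from typing import Callable, Dict, List

Classifier = Callable[[str], Dict[str, float]]


def split_sentences(text: str) -> List[str]:
    """Naive splitter on ., ! and ?; a real scanner may use a proper tokenizer."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def toy_classifier(chunk: str) -> Dict[str, float]:
    """Stand-in for the model: share of words that are anger keywords."""
    words = re.findall(r"[a-z']+", chunk.lower())
    angry = sum(1 for word in words if word in {"furious", "angry"})
    return {"anger": angry / len(words) if words else 0.0}


def max_scores(text: str, classify: Classifier, sentence_level: bool) -> Dict[str, float]:
    """Score the whole text, or each sentence, keeping the max per label."""
    chunks = split_sentences(text) if sentence_level else [text]
    merged: Dict[str, float] = {}
    for chunk in chunks:
        for label, score in classify(chunk).items():
            merged[label] = max(merged.get(label, 0.0), score)
    return merged


text = "Thanks for the quick reply and the detailed notes. I am furious."
full_scores = max_scores(text, toy_classifier, sentence_level=False)
sentence_scores = max_scores(text, toy_classifier, sentence_level=True)
```

With this toy classifier the anger signal is diluted when the whole text is scored at once and stronger when the second sentence is scored alone. Whether the real model behaves the same way depends on the input.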

scan() Inputs

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| prompt | str | Yes | The input text to analyze for emotional content. |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| prompt | str | The original prompt (unchanged). |
| is_valid | bool | True if no blocked emotions were detected above the threshold; False otherwise. |
| risk_score | float | The highest confidence score among detected blocked emotions. |
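A caller typically unpacks the three-element result and decides what to do next. The sketch below shows one way to do that, assuming a hypothetical `route` function, hypothetical action names, and an illustrative `escalate_at` cutoff that is not a library parameter.

```python
from typing import Tuple

# Shape of the tuple returned by scan(): (prompt, is_valid, risk_score).
ScanResult = Tuple[str, bool, float]


def route(result: ScanResult, escalate_at: float = 0.9) -> str:
    """Hypothetical policy: map a scan() result tuple to a next step."""
    _prompt, is_valid, risk_score = result
    if is_valid:
        return "forward_to_model"
    # escalate_at is an illustrative cutoff chosen for this sketch.
    if risk_score >= escalate_at:
        return "escalate_to_human"
    return "respond_with_care"
```

For example, `route(("hello", True, 0.0))` forwards the prompt, while a blocked prompt with a high risk score is escalated.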

scan_with_full_output() Additional Output

| Name | Type | Description |
| --- | --- | --- |
| emotion_scores | Dict[str, float] | Dictionary mapping all 28 emotion labels to their confidence scores. |

Emotion Labels

The model classifies text across 28 emotions:

| Group | Emotions |
| --- | --- |
| Positive | admiration, amusement, approval, caring, curiosity, desire, excitement, gratitude, joy, love, optimism, pride, relief |
| Negative | anger, annoyance, disappointment, disapproval, disgust, embarrassment, fear, grief, nervousness, remorse, sadness |
| Neutral/Ambiguous | confusion, realization, surprise, neutral |
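The grouping above can be captured as constants, which is handy when building a custom blocked_emotions list. Note that the three groups are this page's categorization; the constant names and the `group_of` helper are hypothetical and are not assumed to be exposed by the library or the model.

```python
from typing import List

POSITIVE: List[str] = [
    "admiration", "amusement", "approval", "caring", "curiosity", "desire",
    "excitement", "gratitude", "joy", "love", "optimism", "pride", "relief",
]
NEGATIVE: List[str] = [
    "anger", "annoyance", "disappointment", "disapproval", "disgust",
    "embarrassment", "fear", "grief", "nervousness", "remorse", "sadness",
]
NEUTRAL_OR_AMBIGUOUS: List[str] = ["confusion", "realization", "surprise", "neutral"]

# 13 positive + 11 negative + 4 neutral/ambiguous = 28 labels in total.
ALL_LABELS: List[str] = POSITIVE + NEGATIVE + NEUTRAL_OR_AMBIGUOUS


def group_of(label: str) -> str:
    """Hypothetical helper: return the group a label belongs to."""
    if label in POSITIVE:
        return "positive"
    if label in NEGATIVE:
        return "negative"
    if label in NEUTRAL_OR_AMBIGUOUS:
        return "neutral/ambiguous"
    raise ValueError(f"unknown emotion label: {label}")
```

A stricter scanner could then be configured with, for example, `blocked_emotions=NEGATIVE + ["confusion"]`.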

Usage Examples

Basic Usage

from llm_guard.input_scanners.emotion_detection import EmotionDetection

scanner = EmotionDetection()
prompt = "I am absolutely furious about this terrible service!"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

print(is_valid)    # Expected: False if anger scores above the 0.5 threshold
print(risk_score)  # Highest confidence score among the detected blocked emotions

Custom Blocked Emotions

from llm_guard.input_scanners.emotion_detection import EmotionDetection

# Only block specific emotions
scanner = EmotionDetection(
    blocked_emotions=["anger", "disgust", "fear"],
    threshold=0.6,
)
prompt = "I'm a bit disappointed but overall okay"
sanitized_prompt, is_valid, risk_score = scanner.scan(prompt)

print(is_valid)    # Expected: True, since disappointment is not in the blocked list

Full Emotion Analysis

from llm_guard.input_scanners.emotion_detection import EmotionDetection

scanner = EmotionDetection(return_full_output=True)
prompt = "This is amazing and I love it!"

# Get detailed emotion scores
emotion_scores = scanner.get_emotion_analysis(prompt)
for emotion, score in sorted(emotion_scores.items(), key=lambda x: x[1], reverse=True):
    if score > 0.1:
        print(f"{emotion}: {score:.3f}")

# Or use scan_with_full_output
sanitized_prompt, is_valid, risk_score, scores = scanner.scan_with_full_output(prompt)
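Once a score dictionary is available, a small helper can reduce it to the strongest emotions. The sketch below is an illustration only: `top_emotions` is a hypothetical name, and the sample scores are hand-written rather than real model output.

```python
from typing import Dict, List, Tuple


def top_emotions(
    scores: Dict[str, float], k: int = 3, min_score: float = 0.1
) -> List[Tuple[str, float]]:
    """Hypothetical helper: the k highest-scoring labels above min_score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(label, score) for label, score in ranked if score > min_score][:k]


# Hand-written scores for illustration; real values come from the scanner.
sample_scores = {
    "joy": 0.7, "love": 0.6, "admiration": 0.3, "excitement": 0.2, "neutral": 0.05,
}
strongest = top_emotions(sample_scores)
```

The same helper works on the dictionary returned by get_emotion_analysis or on the fourth element returned by scan_with_full_output.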
