Principle:Protectai Llm guard Secret Detection

From Leeroopedia
Knowledge Sources
Domains: Security, Secret_Detection
Last Updated: 2026-02-14 12:00 GMT

Overview

Detecting API keys, tokens, passwords, and other secrets in text by combining pattern matching with entropy analysis.

Description

Secret Detection is a security principle concerned with identifying sensitive credentials embedded within text before they are processed by or returned from a language model. Secrets such as API keys, authentication tokens, passwords, and private keys can inadvertently appear in prompts or model outputs, creating serious security vulnerabilities if exposed.

This principle combines two complementary detection strategies. First, regex-based pattern matching compares text against a library of known secret formats covering 80+ providers including cloud platforms, version control systems, messaging services, and payment processors. Each provider has specific token formats (prefixes, lengths, character sets) that can be reliably matched. Second, entropy-based detection identifies high-randomness strings that may represent secrets even when they do not match any known pattern. Strings with high Shannon entropy are statistically unlikely to be natural language and are flagged as potential secrets.

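The two approaches complement each other: pattern matching gives precise, provider-labelled hits on known formats, while entropy analysis extends coverage to unknown but suspicious strings. Together they form a more robust detection layer than either provides alone.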

Usage

Use this principle when building systems that handle user-generated prompts or model-generated responses where sensitive credentials could appear. It is particularly important in enterprise contexts where API keys or database credentials might be inadvertently pasted into chat interfaces, or where model outputs might hallucinate or regurgitate secrets from training data. Apply this principle to both input scanning (preventing secrets from reaching the model) and output scanning (preventing secrets from being returned to users).

Theoretical Basis

The detection algorithm operates in two phases:

Phase 1: Pattern Matching

  • Maintain a registry of regex patterns for known secret formats (e.g., ghp_[a-zA-Z0-9]{36} for GitHub personal access tokens)
  • Scan the input text against each pattern in the registry
  • Each match is tagged with the provider name and confidence level
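
The Phase 1 steps can be sketched in Python as follows. This is a minimal illustration, not LLM Guard's actual implementation: the registry layout, the `Match` type, the `find_known_secrets` function, the confidence labels, and the `example_vendor` pattern are all hypothetical. Only the GitHub pattern is taken from the text above.

```python
import re
from dataclasses import dataclass
from typing import List

# Hypothetical registry: provider name -> (compiled pattern, confidence label).
# The GitHub pattern is the one quoted above; "example_vendor" is made up
# purely to show that the registry holds many entries.
PATTERNS = {
    "github_pat": (re.compile(r"ghp_[a-zA-Z0-9]{36}"), "high"),
    "example_vendor": (re.compile(r"xmpl_[0-9a-f]{32}"), "medium"),
}


@dataclass(frozen=True)
class Match:
    provider: str
    confidence: str
    start: int  # offset of the first matched character
    end: int    # offset one past the last matched character


def find_known_secrets(text: str) -> List[Match]:
    """Scan text against every registered pattern and tag each hit."""
    matches = []
    for provider, (pattern, confidence) in PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append(Match(provider, confidence, m.start(), m.end()))
    # Return hits in document order so later stages can walk them linearly.
    return sorted(matches, key=lambda hit: hit.start)
```

Recording character offsets rather than only the matched string lets a later stage skip or redact exactly the matched span.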

Phase 2: Entropy Analysis

  • For text segments not matched by known patterns, compute Shannon entropy
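  • Shannon entropy is H = -sum(p(x) * log2(p(x))), summed over each distinct character x in the segment, where p(x) is the fraction of the segment's characters equal to x
  • Strings exceeding a configurable entropy threshold (typically about 4.5 bits per character) are flagged as potential secrets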
  • This catches custom or unknown secret formats that exhibit high randomness
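
A minimal Python sketch of the entropy phase follows. The `CANDIDATE` regex used to split text into token-like segments, the function names, and the 24-character minimum are assumptions made for illustration. For simplicity the sketch scans the whole text rather than only the segments left unmatched by Phase 1.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

# Assumed segmentation: runs of base64/URL-safe characters, 24 or longer.
CANDIDATE = re.compile(r"[A-Za-z0-9+/=_\-]{24,}")


def shannon_entropy(s: str) -> float:
    """Entropy in bits per character, from character frequencies in s."""
    if not s:
        return 0.0
    n = len(s)
    counts = Counter(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def find_high_entropy(text: str, threshold: float = 4.5) -> List[Tuple[str, float]]:
    """Return (segment, entropy) for candidate segments above the threshold."""
    hits = []
    for m in CANDIDATE.finditer(text):
        h = shannon_entropy(m.group())
        if h > threshold:
            hits.append((m.group(), h))
    return hits
```

As a worked check, `shannon_entropy("abcd")` is 2.0 bits per character (four equally likely symbols) and a string of one repeated character scores 0. Because the entropy of a string with n characters can never exceed log2(n), a segment needs at least 23 distinct characters to score above 4.5, which is why the sketch ignores segments shorter than 24 characters.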

Decision Logic:

  • If any pattern match or high-entropy segment is found, the text is flagged
  • Matched segments can optionally be redacted by replacement with placeholder tokens
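
The decision logic can be sketched end to end as below. This is an illustrative combination of the two phases under stated assumptions, not the library's API: the `scan` function, its return shape, the `PLACEHOLDER` string, and the `CANDIDATE` segmentation regex are hypothetical, and a single known pattern stands in for the full registry.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

GITHUB_PAT = re.compile(r"ghp_[a-zA-Z0-9]{36}")     # pattern quoted in the text
CANDIDATE = re.compile(r"[A-Za-z0-9+/=_\-]{24,}")   # assumed segmentation rule
PLACEHOLDER = "[REDACTED]"                          # hypothetical placeholder


def shannon_entropy(s: str) -> float:
    """Entropy in bits per character, from character frequencies in s."""
    n = len(s)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(s).values())


def scan(text: str, threshold: float = 4.5, redact: bool = True) -> Tuple[str, bool]:
    """Return (possibly redacted text, flagged)."""
    # Phase 1: spans matched by a known pattern.
    spans: List[Tuple[int, int]] = [m.span() for m in GITHUB_PAT.finditer(text)]
    # Phase 2: high-entropy segments that do not overlap a Phase 1 span.
    for m in CANDIDATE.finditer(text):
        overlaps = any(m.start() < end and start < m.end() for start, end in spans)
        if not overlaps and shannon_entropy(m.group()) > threshold:
            spans.append(m.span())
    flagged = bool(spans)
    if redact:
        # Replace from right to left so earlier offsets stay valid.
        for start, end in sorted(spans, reverse=True):
            text = text[:start] + PLACEHOLDER + text[end:]
    return text, flagged
```

Calling `scan(prompt)` on input before it reaches the model, and again on the model's output, corresponds to the input and output scanning described under Usage. Passing `redact=False` only reports whether the text was flagged and leaves it unchanged.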
