Heuristic:Avdvg InjectGuard Sim K Threshold Tuning

From Leeroopedia
Knowledge Sources
Domains Security, Optimization, Anomaly_Detection
Last Updated 2026-02-14 16:00 GMT

Overview

Threshold tuning guidance for the sim_k parameter, which controls the precision-recall tradeoff in vector-similarity-based prompt injection detection. Recommended default: 0.98.

Description

The sim_k parameter is the L2 distance threshold used by the sim_search function to classify an input as malicious or benign. When the distance from the input to its nearest neighbor among the known malicious prompts is strictly below sim_k, the input is flagged as a prompt injection. Because the embeddings are L2-normalized, the L2 distance is bounded between 0 (identical direction) and 2 (opposite direction), which makes the threshold directly interpretable.
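The decision rule described above can be sketched in a few lines. This is a minimal illustration and not the repository's sim_search: the helper names (l2_normalize, l2_distance, flag_input) and the toy 2-D vectors are assumptions made for the example, and a real deployment would use embedding vectors and a vector index.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length so L2 distances fall in [0, 2]."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def l2_distance(a, b):
    """Plain (not squared) Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def flag_input(query_vec, malicious_vecs, sim_k=0.98):
    """Hypothetical stand-in for the threshold check in sim_search.

    Returns (detection, nearest_distance), where detection is 1 when the
    nearest known-malicious vector is strictly closer than sim_k.
    """
    q = l2_normalize(query_vec)
    nearest = min(l2_distance(q, l2_normalize(m)) for m in malicious_vecs)
    detection = 1 if nearest < sim_k else 0
    return detection, nearest

# Toy "embeddings": one known attack direction and two queries.
known_attacks = [[1.0, 0.1]]
print(flag_input([1.0, 0.0], known_attacks))  # close to the attack vector
print(flag_input([0.0, 1.0], known_attacks))  # nearly orthogonal to it
```

With unit-length vectors, an orthogonal pair sits at distance sqrt(2) ≈ 1.41 and an opposite pair at distance 2, so any sim_k below 1.41 only flags inputs with positive cosine similarity to a known attack.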

Tuning this single scalar is the primary mechanism for adjusting detection sensitivity in the InjectGuard system.

Usage

Use this heuristic when deploying or evaluating the InjectGuard detection system. The default value of 0.98 is the author's recommended starting point, but operators should adjust it based on their tolerance for false positives (benign inputs blocked) versus false negatives (attacks missed). Run the evaluation harness (main()) with several sim_k values and compare precision and recall to find the operating point that suits your deployment context.
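A threshold sweep of this kind might look like the sketch below. It does not call the repository's main(); instead it assumes you have already collected, for each evaluation input, the nearest-neighbor distance and a ground-truth label. The function name precision_recall and the distance/label values are made up purely for illustration.

```python
def precision_recall(distances, labels, sim_k):
    """Compute precision and recall for the rule: flag when distance < sim_k.

    distances: nearest-neighbor L2 distance per evaluation input
    labels:    1 for a real attack, 0 for a benign input
    """
    tp = fp = fn = 0
    for d, y in zip(distances, labels):
        pred = 1 if d < sim_k else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    # Convention: precision is 1.0 when nothing is flagged.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative data only: four attacks followed by three benign inputs.
distances = [0.40, 0.85, 0.95, 1.05, 0.97, 1.10, 1.30]
labels = [1, 1, 1, 1, 0, 0, 0]

for sim_k in (0.80, 0.90, 0.98, 1.20):
    p, r = precision_recall(distances, labels, sim_k)
    print(f"sim_k={sim_k:.2f}  precision={p:.2f}  recall={r:.2f}")
```

Recall can only stay flat or rise as sim_k grows, because every input flagged at a smaller threshold is still flagged at a larger one. Precision usually falls, but on a small evaluation set it can fluctuate, so it is worth sweeping a fine grid of values.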

The Insight (Rule of Thumb)

  • Action: Set sim_k in the config dict passed to main() or directly to sim_search().
  • Recommended Value: 0.98 (author-recommended default).
  • Trade-off:
    • Larger sim_k (e.g., 1.0-1.2): Higher recall (catches more attacks), lower precision (more false positives). The detection net is wider.
    • Smaller sim_k (e.g., 0.8-0.9): Higher precision (fewer false positives), lower recall (misses more attacks). Only inputs that lie closer to a known attack are flagged (cosine similarity above roughly 0.6-0.7).
  • Interpretation: With normalized embeddings, sim_k=0.98 corresponds to a cosine similarity of approximately 1 - (0.98^2)/2 ≈ 0.52, meaning inputs must be moderately similar to a known attack to be flagged.
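The conversion used in the interpretation bullet follows from the identity ||a - b||^2 = 2 - 2·cos(a, b), which holds for unit-length vectors. The short sketch below applies it in both directions; the function names are hypothetical, and it assumes sim_k is a plain (not squared) L2 distance, as described on this page.

```python
import math

def l2_to_cosine(distance):
    """Cosine similarity implied by an L2 distance between unit vectors."""
    return 1.0 - (distance ** 2) / 2.0

def cosine_to_l2(cosine):
    """L2 distance between unit vectors with the given cosine similarity."""
    return math.sqrt(2.0 - 2.0 * cosine)

# Translate a few candidate thresholds into cosine-similarity terms.
for sim_k in (0.80, 0.90, 0.98, 1.00, 1.20):
    print(f"sim_k={sim_k:.2f} -> cosine similarity {l2_to_cosine(sim_k):.3f}")
```

Going the other way is handy when a team reasons in cosine similarity: for example, cosine_to_l2(0.5) gives a distance threshold of 1.0.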

Reasoning

The author explicitly documents this tradeoff in a code comment: "The larger sim_k, the higher the recall rate and the lower the precision rate." This is consistent with the threshold-based nearest-neighbor classification approach: widening the detection radius around each known malicious prompt catches more true attacks but also catches more benign inputs that happen to be somewhat similar to attack patterns.

The recommended value of 0.98 was chosen by the author as a balanced default. In practice, the optimal threshold depends on:

  • Corpus coverage: A comprehensive malicious prompt dataset allows tighter thresholds (lower sim_k).
  • Application domain: Security-critical applications may prefer higher sim_k (favor recall); user-facing applications may prefer lower sim_k (favor precision to avoid blocking legitimate queries).

Code evidence from vertor_similarity_detection.py:106:

config = {"sim_k":0.98}  # The larger sim_k, the higher the recall rate and the lower the precision rate. The recommended sim_k is 0.98

Detection logic from vertor_similarity_detection.py:66:

detection = 1 if sim_score < sim_k else 0
