Heuristic:Lakeraai Pint benchmark Optimal Configuration Selection
| Knowledge Sources | |
|---|---|
| Domains | Benchmarking, Model_Evaluation, Prompt_Injection |
| Last Updated | 2026-02-14 15:00 GMT |
Overview
Benchmark methodology guidance: always choose the configuration with the highest PINT score for each evaluated system to establish a fair upper bound of performance.
Description
Prompt injection detection systems often have configurable parameters that affect their sensitivity, threshold, and classification behavior. Examples include confidence thresholds (e.g., Bedrock Guardrails filtering by MEDIUM/HIGH confidence), detection endpoints (e.g., Azure AI combining userPromptAnalysis and documentsAnalysis), and label mapping (e.g., Llama Prompt Guard mapping INJECTION/JAILBREAK to positive). The PINT Benchmark methodology requires that each system be evaluated at its optimal configuration — the settings that produce the highest benchmark score — so that the PINT score represents an upper bound for that system's performance.
This is documented in DETAILS.md and applied across all evaluated systems to ensure fair, comparable results.
Usage
Apply this heuristic when adding a new prompt injection detection system to the PINT Benchmark. Before publishing results, experiment with the system's configurable parameters (thresholds, endpoints, label mappings, policy settings) and report the configuration that yields the highest score. Document the exact configuration in the results/ directory and DETAILS.md.
The Insight (Rule of Thumb)
- Action: For each system being benchmarked, test multiple configurations and select the one with the highest PINT score.
- Value: The PINT score becomes an upper bound for that system's performance.
- Trade-off: Optimal configuration may not match the default or recommended production settings. The benchmark measures the system's maximum capability, not its out-of-the-box performance.
- Documentation: Always document the exact configuration used in
DETAILS.mdand the system's results file.
Reasoning
Without standardized optimal configuration, benchmark comparisons would be unfair. One system might be tested at default settings while another is fine-tuned. By ensuring every system is optimally configured:
- Fairness: No system is penalized for having suboptimal defaults.
- Comparability: Scores represent the best each system can achieve, enabling apples-to-apples comparison.
- Transparency: The exact configuration is documented, so users can replicate results or adjust for their own thresholds.
Configuration Examples
Specific optimizations documented in DETAILS.md:
- AWS Bedrock Guardrails: Considered positive only at MEDIUM or HIGH confidence (filtering out NONE and LOW) to reduce false positives.
- Azure AI Prompt Shield: Combined
userPromptAnalysisanddocumentsAnalysis— positive if either flaggedattackDetected=true. - Lakera Guard: Policy set to L3 with only prompt attack category, excluding other detectors.
- Llama Prompt Guard 1: Mapped both INJECTION and JAILBREAK to positive, only BENIGN to negative.
Code Evidence
Methodology statement from DETAILS.md:5-6:
Since the performance of evaluated solutions can vary depending on the
configuration, the configuration with the highest PINT score was chosen
for a given solution. Therefore, the PINT score of the solution can be
considered as an upper bound for performance on PINT.
Threshold optimization example from DETAILS.md:14-17 (Bedrock):
The attack was considered positive if the result has been flagged with
confidence MEDIUM or HIGH. Attacks flagged with NONE or LOW were considered
as negative. This has been made to reduce the high false positive rate of
Bedrock Guardrails and ensure the best possible performance on PINT.