Principle:Norrrrrrr lyn WAInjectBench Validation Checkpoint Selection
| Knowledge Sources | |
|---|---|
| Domains | Evaluation, Model_Selection |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
A model selection strategy that evaluates validation performance after each training epoch and selects the checkpoint with the highest True Positive Rate.
Description
After each training epoch, the model is switched to evaluation mode and run through the validation dataset to compute confusion matrix metrics (TP, TN, FP, FN). From these, TPR and FPR are calculated. The checkpoint with the highest TPR across all epochs is selected as the best model.
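The metric computation described above can be sketched in plain Python. This is a minimal illustration rather than the project's actual evaluation code: the function name `confusion_metrics`, the label convention (1 = malicious, 0 = benign), the zero-division fallback, and the toy data are all assumptions made for the example.

```python
from typing import Dict, Sequence


def confusion_metrics(labels: Sequence[int], preds: Sequence[int]) -> Dict[str, float]:
    """Count TP/TN/FP/FN for binary labels and derive TPR and FPR.

    Assumed convention (not taken from the source): 1 = malicious, 0 = benign.
    """
    if len(labels) != len(preds):
        raise ValueError("labels and preds must have the same length")

    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)

    # Fall back to 0.0 when a class is absent so the ratios stay defined.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn, "TPR": tpr, "FPR": fpr}


# Toy validation set with made-up values: 4 malicious and 4 benign samples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = confusion_metrics(labels, preds)
print(metrics)
```

In a real training loop, `preds` would come from running the model in evaluation mode over the validation set. The lists here only stand in for that step.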
This TPR-maximizing strategy reflects the security-oriented nature of prompt injection detection, where missing an attack (false negative) is generally more costly than raising a false alarm (false positive).
Usage
Use this at the end of each training epoch to evaluate model quality and track the best-performing checkpoint for later deployment.
Theoretical Basis
Confusion matrix metrics:
$$\mathrm{TPR} = \frac{TP}{TP + FN}$$
$$\mathrm{FPR} = \frac{FP}{FP + TN}$$
Checkpoint selection criterion:
# Select by maximum TPR
if current_tpr > best_tpr:
    best_tpr = current_tpr
    save_checkpoint(model, epoch, best_tpr)
This is equivalent to maximizing recall for the positive (malicious) class, which is the appropriate objective when the cost of missed detections exceeds the cost of false alarms.
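To show how the selection criterion behaves across several epochs, here is a small self-contained sketch. The `BestCheckpoint` class, the `select_best_epoch` helper, and the per-epoch TPR values are hypothetical and exist only for illustration. The actual saving step is left as a comment because the source does not specify how `save_checkpoint` is implemented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class BestCheckpoint:
    """Tracks the epoch with the highest validation TPR seen so far."""

    best_tpr: float = -1.0  # below any valid TPR, so the first epoch always wins
    best_epoch: Optional[int] = None

    def update(self, epoch: int, current_tpr: float) -> bool:
        """Return True when this epoch beats the previous best."""
        # Strict '>' means the earliest epoch is kept when TPR values tie.
        if current_tpr > self.best_tpr:
            self.best_tpr = current_tpr
            self.best_epoch = epoch
            return True
        return False


def select_best_epoch(val_tprs: Sequence[float]) -> BestCheckpoint:
    """Replay per-epoch validation TPRs and report which epoch would be kept."""
    tracker = BestCheckpoint()
    for epoch, tpr in enumerate(val_tprs, start=1):
        if tracker.update(epoch, tpr):
            # A real training loop would persist the model weights here.
            pass
    return tracker


# Made-up validation TPRs for four epochs; epochs 2 and 3 tie.
val_tprs = [0.80, 0.92, 0.92, 0.88]
tracker = select_best_epoch(val_tprs)
print(tracker.best_epoch, tracker.best_tpr)
```

With the strict comparison, a later epoch that only matches the best TPR does not overwrite the saved checkpoint. Because FPR is not part of the criterion, two epochs with equal TPR but different FPR are not distinguished by this rule.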