Principle:Norrrrrrr lyn WAInjectBench Validation Checkpoint Selection

Knowledge Sources	PyTorch Model Saving
Domains	Evaluation, Model_Selection
Last Updated	2026-02-14 16:00 GMT

Overview

A model selection strategy that evaluates validation performance after each training epoch and selects the checkpoint with the highest True Positive Rate.

Description

After each training epoch, the model is switched to evaluation mode and run on the validation dataset to compute the confusion matrix counts: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). From these counts, the True Positive Rate (TPR) and False Positive Rate (FPR) are calculated. The checkpoint with the highest TPR across all epochs is selected as the best model.

This TPR-maximizing strategy reflects the security-oriented nature of prompt injection detection, where missing an attack (false negative) is generally more costly than raising a false alarm (false positive).
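
The per-epoch evaluation described above can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names `confusion_counts` and `rates`, the convention that label 1 marks the malicious class, and the choice to report a rate as 0.0 when its denominator is empty are all assumptions made for the example. In a PyTorch training loop, the predictions would typically be collected after calling `model.eval()` and inside a `torch.no_grad()` block.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN), treating label 1 as the positive (malicious) class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            raise ValueError("labels must be 0 or 1")
    return tp, tn, fp, fn


def rates(tp: int, tn: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (TPR, FPR); a rate with an empty denominator is reported as 0.0."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical validation labels and predictions for one epoch.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
tpr, fpr = rates(tp, tn, fp, fn)
print(tp, tn, fp, fn)  # 3 3 1 1
print(tpr, fpr)        # 0.75 0.25
```

Here one of the four malicious samples is missed, so TPR is 3/4, and one of the four benign samples is flagged, so FPR is 1/4.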

Usage

Use this at the end of each training epoch to evaluate model quality and track the best-performing checkpoint for later deployment.

Theoretical Basis

Confusion matrix metrics: $\mathrm{TPR} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \quad \mathrm{FPR} = \frac{\mathrm{FP}}{\mathrm{FP} + \mathrm{TN}}$

Checkpoint selection criterion:

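# Select by maximum TPR.
# best_tpr is assumed to start below any valid rate (e.g. -1.0).
if current_tpr > best_tpr:  # strict comparison: on a tie, the earlier epoch is kept
    best_tpr = current_tpr
    save_checkpoint(model, epoch, best_tpr)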

This is equivalent to maximizing recall for the positive (malicious) class, which is the appropriate objective when the cost of missed detections exceeds the cost of false alarms.
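
The selection criterion can be run end to end as a small loop over per-epoch results. The sketch below is illustrative only: `select_best_checkpoint`, the `(epoch, tpr, state)` tuple layout, and the toy state dictionaries are hypothetical stand-ins, and the in-memory `copy.deepcopy` snapshot takes the place of writing a checkpoint to disk (in PyTorch this would commonly be `torch.save(model.state_dict(), path)`).

```python
import copy
from typing import Any, Dict, Iterable, Optional, Tuple


def select_best_checkpoint(
    epoch_results: Iterable[Tuple[int, float, Dict[str, Any]]],
) -> Tuple[Optional[int], float, Optional[Dict[str, Any]]]:
    """Keep the state from the epoch with the highest validation TPR.

    epoch_results yields (epoch, validation_tpr, model_state) tuples.
    """
    best_epoch: Optional[int] = None
    best_tpr = -1.0  # below any achievable TPR, so the first epoch always counts
    best_state: Optional[Dict[str, Any]] = None
    for epoch, tpr, state in epoch_results:
        if tpr > best_tpr:  # strict comparison: ties keep the earlier epoch
            best_epoch, best_tpr = epoch, tpr
            # Snapshot the state so later training steps cannot mutate it.
            best_state = copy.deepcopy(state)
    return best_epoch, best_tpr, best_state


# Hypothetical per-epoch validation results with toy "weights".
history = [
    (1, 0.80, {"w": 0.1}),
    (2, 0.92, {"w": 0.2}),
    (3, 0.92, {"w": 0.3}),  # ties epoch 2, so it does not replace it
    (4, 0.88, {"w": 0.4}),
]

best_epoch, best_tpr, best_state = select_best_checkpoint(history)
print(best_epoch, best_tpr, best_state)  # 2 0.92 {'w': 0.2}
```

Because the comparison is strict, an epoch that only matches the best TPR so far does not overwrite the saved checkpoint.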

Related Pages

Implemented By

Implementation:Norrrrrrr_lyn_WAInjectBench_Validation_TPR_Selection
