Principle:Norrrrrrr lyn WAInjectBench Binary Classifier Training
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Classification |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
A supervised learning step that trains a logistic regression classifier on embedding features to distinguish between benign and malicious samples.
Description
Binary Classifier Training fits a logistic regression model to the embedding feature matrix with binary labels (0=benign, 1=malicious). Logistic regression is chosen for its simplicity, interpretability, and effectiveness on well-separated embedding spaces. The WAInjectBench project uses two configurations:
- Text:
LogisticRegression(max_iter=1000)— standard settings - Image:
LogisticRegression(max_iter=2000, class_weight="balanced", n_jobs=-1)— more iterations, balanced class weights to handle potential class imbalance, and parallel fitting
After fitting, both variants print a classification_report on the training data for immediate quality inspection.
Usage
Use this after feature extraction to train a binary classifier. The fitted model is then serialized for use in the detection pipeline.
Theoretical Basis
Logistic regression models the probability of the positive class:
Where is the weight vector, is the bias, and is the sigmoid function. The model is trained by minimizing the regularized cross-entropy loss.
With class_weight="balanced", the loss for each class is inversely weighted by its frequency, preventing the classifier from being biased toward the majority class.