Implementation:Norrrrrrr lyn WAInjectBench LogisticRegression Fit
| Knowledge Sources | |
|---|---|
| Domains | Machine_Learning, Classification |
| Last Updated | 2026-02-14 16:00 GMT |
Overview
Concrete tool for training binary logistic regression classifiers on embedding features. It is provided by scikit-learn and used in the WAInjectBench embedding trainers.
Description
Both the text and image embedding trainers fit a binary classifier with sklearn.linear_model.LogisticRegression. The text variant sets max_iter=1000; the image variant sets max_iter=2000, class_weight="balanced", and n_jobs=-1. After fitting, both variants predict on the same embeddings used for training and print a classification_report, so the reported metrics describe training fit rather than held-out performance.
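To make the effect of class_weight="balanced" concrete, the sketch below computes the per-class weights scikit-learn derives for that setting, using the documented formula n_samples / (n_classes * np.bincount(y)). The 90/10 label split is a made-up example, not the class ratio of the WAInjectBench data.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced label vector: 90 benign (0), 10 malicious (1).
labels = np.array([0] * 90 + [1] * 10)

# "balanced" weights follow n_samples / (n_classes * np.bincount(labels)).
manual = len(labels) / (2 * np.bincount(labels))

# The same weights, computed by scikit-learn's own utility.
weights = compute_class_weight(
    class_weight="balanced", classes=np.array([0, 1]), y=labels
)

print(dict(zip([0, 1], weights)))
```

The rarer class receives the larger weight, so its errors count for more in the loss during fitting.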
Usage
Called after embedding extraction to fit a classifier on the feature matrix and label vector.
Code Reference
Source Location
- Repository: WAInjectBench
- File: train/embedding-t.py (L30-31), train/embedding-i.py (L48-53)
Signature
# Text variant (train/embedding-t.py:L30-31)
clf = LogisticRegression(max_iter=1000)
clf.fit(embeddings, labels)
# Image variant (train/embedding-i.py:L48-53)
clf = LogisticRegression(
    max_iter=2000,
    class_weight="balanced",
    n_jobs=-1
)
clf.fit(embeddings, labels)
# Both variants follow with:
preds = clf.predict(embeddings)
print(classification_report(labels, preds))
Import
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| embeddings | np.ndarray | Yes | Feature matrix of shape (N, embedding_dim) |
| labels | List[int] | Yes | Binary labels (0=benign, 1=malicious) |
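The input contract can be checked before fitting. The sketch below is one possible way to do so; check_inputs is a hypothetical helper written for illustration and is not part of WAInjectBench or scikit-learn.

```python
from typing import List, Tuple

import numpy as np


def check_inputs(embeddings: np.ndarray, labels: List[int]) -> Tuple[np.ndarray, np.ndarray]:
    """Hypothetical helper: validate the (N, embedding_dim) / binary-label contract."""
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    if X.ndim != 2:
        raise ValueError(f"embeddings must be 2-D (N, embedding_dim), got shape {X.shape}")
    if y.shape != (X.shape[0],):
        raise ValueError(f"expected {X.shape[0]} labels, got shape {y.shape}")
    if not set(np.unique(y).tolist()) <= {0, 1}:
        raise ValueError("labels must be 0 (benign) or 1 (malicious)")
    if len(np.unique(y)) < 2:
        raise ValueError("both classes must be present to fit a binary classifier")
    return X, y


# Example call with placeholder data: 4 samples, 8-dimensional embeddings.
X, y = check_inputs(np.zeros((4, 8)), [0, 1, 0, 1])
print(X.shape, y.shape)
```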
Outputs
| Name | Type | Description |
|---|---|---|
| clf | LogisticRegression | Fitted classifier object ready for predict() calls |
| classification_report | str (printed) | Precision/recall/F1 on training data |
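Because the printed report is computed on the training data, it can overstate how well the classifier generalizes. The following is a minimal sketch of a held-out evaluation, assuming a stratified 75/25 split and synthetic stand-in embeddings; neither the split nor the data comes from the WAInjectBench trainers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic stand-in for real embeddings (assumption: 200 samples, 16 dims).
rng = np.random.default_rng(0)
labels = np.array([0] * 100 + [1] * 100)
embeddings = rng.normal(size=(200, 16)) + labels[:, None] * 1.5  # shift class 1

# Hold out 25% of the rows, keeping the class ratio the same in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    embeddings, labels, test_size=0.25, stratify=labels, random_state=0
)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

# Report on rows the classifier did not see during fitting.
test_preds = clf.predict(X_test)
print(classification_report(y_test, test_preds))

# predict_proba returns one column per class, ordered as clf.classes_.
malicious_prob = clf.predict_proba(X_test)[:, 1]
```

The fitted clf is used the same way on new embeddings: predict() returns hard 0/1 labels, and predict_proba() returns per-class probabilities.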
Usage Examples
Training a Text Classifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
import numpy as np
# Assumes embeddings is an np.ndarray of shape (N, 384) and labels is a List[int] of 0/1 values
clf = LogisticRegression(max_iter=1000)
clf.fit(embeddings, labels)
preds = clf.predict(embeddings)
print(classification_report(labels, preds))
Training an Image Classifier with Balanced Weights
# Reuses the imports from the text example; embeddings here holds image features
clf = LogisticRegression(max_iter=2000, class_weight="balanced", n_jobs=-1)
clf.fit(embeddings, labels)
preds = clf.predict(embeddings)
print(classification_report(labels, preds))
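Checking Whether max_iter Was Sufficient

The two variants differ in max_iter, so it can be useful to confirm that the solver stopped before reaching the limit. The sketch below inspects the fitted n_iter_ attribute and watches for scikit-learn's ConvergenceWarning. The data is synthetic, and this check is an illustration rather than something the WAInjectBench trainers are known to do.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (assumption: 100 samples, 8 dims).
rng = np.random.default_rng(0)
labels = np.array([0] * 50 + [1] * 50)
embeddings = rng.normal(size=(100, 8)) + labels[:, None]

max_iter = 1000
clf = LogisticRegression(max_iter=max_iter)

# Record warnings raised during fitting so they can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clf.fit(embeddings, labels)

# scikit-learn emits ConvergenceWarning when the solver stops at max_iter.
hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)
print("iterations used:", int(clf.n_iter_[0]), "hit max_iter:", hit_limit)
```

If hit_limit is true on real embeddings, raising max_iter or standardizing the features are the usual remedies.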