Principle:Norrrrrrr lyn WAInjectBench Model Serialization

From Leeroopedia
Knowledge Sources
Domains: Machine_Learning, Model_Management
Last Updated: 2026-02-14 16:00 GMT

Overview

A model persistence step that serializes trained scikit-learn classifiers to disk for later use in the detection pipeline.

Description

After training, the fitted LogisticRegression classifier must be saved to disk so the detection modules can load it at inference time. The WAInjectBench project uses joblib.dump for serialization, which handles objects holding large numpy arrays more efficiently than Python's standard pickle module. The output filename is built from the stem of the training JSONL filename (the name without its extension) plus a _logreg.pkl suffix.
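The naming rule above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the helper name logreg_model_path and the example paths are hypothetical, and it assumes the stem is simply the JSONL filename without its extension.

```python
from pathlib import Path


def logreg_model_path(train_jsonl: str, output_dir: str) -> Path:
    """Build '<output_dir>/<stem>_logreg.pkl' from a training JSONL path.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Path.stem drops the directory and the final extension,
    # e.g. 'data/train_text.jsonl' -> 'train_text'.
    stem = Path(train_jsonl).stem
    return Path(output_dir) / f"{stem}_logreg.pkl"


# Example with made-up paths.
model_path = logreg_model_path("data/train_text.jsonl", "models")
print(model_path.name)
```

Keeping the dataset stem in the filename makes it easy to tell which training file produced which classifier when several models sit in the same output directory.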

Usage

Use this as the final step of the embedding classifier training pipeline. The saved model file is consumed by the corresponding detector modules (detector_text/embedding-t.py and detector_image/embedding-i.py) at inference time.
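On the consuming side, loading could look roughly like the sketch below. It is not taken from the detector modules: the function names load_classifier and injection_scores are hypothetical, and it assumes that label 1 marks the positive (injection) class and that inputs are already embedding vectors.

```python
from pathlib import Path

import joblib
import numpy as np


def load_classifier(model_path):
    """Load a joblib-serialized classifier. Only load files you trust."""
    path = Path(model_path)
    if not path.is_file():
        raise FileNotFoundError(f"No saved classifier at {path}")
    clf = joblib.load(path)
    # Fail early if the file holds something other than a probabilistic classifier.
    if not hasattr(clf, "predict_proba"):
        raise TypeError(f"{path} does not contain a probabilistic classifier")
    return clf


def injection_scores(clf, embeddings):
    """Return the probability of class 1 for each embedding row.

    Assumption: label 1 is the positive class in the training data.
    """
    X = np.asarray(embeddings, dtype=float)
    positive_col = list(clf.classes_).index(1)
    return clf.predict_proba(X)[:, positive_col]
```

Checking for predict_proba right after loading surfaces a wrong or stale file at startup rather than in the middle of a detection run.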

Theoretical Basis

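import os
import joblib

# Model serialization pattern: output_dir, dataset_stem, and fitted_classifier come from the training step
save_path = os.path.join(output_dir, f"{dataset_stem}_logreg.pkl")
joblib.dump(fitted_classifier, save_path)
# Later, at inference time: clf = joblib.load(save_path)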

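Joblib is optimized for objects that contain large numpy arrays (such as sklearn model weights) and is generally faster than the standard pickle module for them. Compression is optional: joblib.dump writes uncompressed output by default and compresses only when its compress argument is set. Like pickle, joblib.load can execute arbitrary code while deserializing, so model files should only be loaded from trusted sources.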
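The effect of the compress argument can be checked with a small round trip. The sketch below uses synthetic data as a stand-in for real embeddings, and the file names are made up. How much space compression saves depends on the model, so no particular size ratio is claimed.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for embedding features: 200 samples, 64 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
y = (X[:, 0] > 0).astype(int)
clf = LogisticRegression(max_iter=1000).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "demo_logreg.pkl")
    packed_path = os.path.join(tmp_dir, "demo_logreg_compressed.pkl")

    joblib.dump(clf, plain_path)               # default compress=0: no compression
    joblib.dump(clf, packed_path, compress=3)  # zlib compression at level 3

    plain_size = os.path.getsize(plain_path)
    packed_size = os.path.getsize(packed_path)
    restored = joblib.load(packed_path)

print(f"uncompressed: {plain_size} bytes, compress=3: {packed_size} bytes")

# The reloaded model should carry identical weights and give identical predictions.
same_weights = np.array_equal(restored.coef_, clf.coef_)
same_predictions = np.array_equal(restored.predict(X), clf.predict(X))
print(same_weights, same_predictions)
```

For a single logistic regression the weight matrix is small, so the default uncompressed dump is usually adequate. Compression matters more when many models or much larger arrays are stored.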

Related Pages

Implemented By
