Principle:Trailofbits Fickling Pickle Scanner Benchmarking
| Knowledge Sources | |
|---|---|
| Domains | Security, Benchmarking, Pickle_Safety |
| Last Updated | 2026-02-14 14:00 GMT |
Overview
Methodology for quantitatively evaluating the detection accuracy of pickle security scanning tools against datasets of clean and malicious files using confusion matrix metrics.
Description
Pickle Scanner Benchmarking is the systematic evaluation of tools designed to detect malicious content in Python pickle files. Because deserializing an untrusted pickle can execute arbitrary code, multiple scanning tools have been developed (Fickling, Modelscan, Picklescan, Model Unpickler). This principle establishes a framework for comparing their effectiveness using the four outcomes of a binary confusion matrix: true positives (malicious files correctly flagged), true negatives (clean files correctly passed), false positives (clean files incorrectly flagged), and false negatives (malicious files missed). Detection rate and specificity are then derived from these counts. The benchmark samples from clean and malicious datasets at a configurable ratio to simulate realistic scanning conditions.
Usage
Apply this principle when evaluating or comparing pickle scanning tools for deployment in ML model supply chain security. It is relevant when building a quantitative case for tool selection, identifying detection gaps in specific tools, or validating that a scanner correctly handles diverse payload types without excessive false positives.
Theoretical Basis
The core methodology is based on binary classification evaluation:
Confusion Matrix:
# Abstract evaluation algorithm
for each file in sampled_files:
    expected = "safe" if file in clean_set else "malicious"
    result = scanner.scan(file)
    if expected == "safe" and result == "safe": record TN
    if expected == "safe" and result == "malicious": record FP
    if expected == "malicious" and result == "safe": record FN
    if expected == "malicious" and result == "malicious": record TP
# Derived metrics
detection_rate = TP / (TP + FN)  # Sensitivity / Recall
specificity = TN / (TN + FP)  # True negative rate
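The abstract algorithm above can be made concrete in Python. The following is a minimal sketch, not the benchmark's actual implementation: `evaluate`, `derived_metrics`, and the `scan` callable are hypothetical names introduced for illustration, and `scan` is assumed to return `True` when a scanner flags a file as malicious. Real scanners expose different interfaces and would need a thin adapter.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# Each sample is (path, is_malicious); the label is the ground truth.
Sample = Tuple[str, bool]


def evaluate(samples: Iterable[Sample], scan: Callable[[str], bool]) -> Dict[str, int]:
    """Tally confusion-matrix counts for one scanner.

    `scan` is a hypothetical adapter: it takes a file path and returns
    True if the scanner flags the file as malicious.
    """
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for path, is_malicious in samples:
        flagged = scan(path)
        if is_malicious:
            counts["TP" if flagged else "FN"] += 1
        else:
            counts["FP" if flagged else "TN"] += 1
    return counts


def _rate(numerator: int, denominator: int) -> Optional[float]:
    # Return None instead of dividing by zero when a class is absent.
    return numerator / denominator if denominator else None


def derived_metrics(counts: Dict[str, int]) -> Dict[str, Optional[float]]:
    """Compute detection rate (recall) and specificity from the counts."""
    return {
        "detection_rate": _rate(counts["TP"], counts["TP"] + counts["FN"]),
        "specificity": _rate(counts["TN"], counts["TN"] + counts["FP"]),
    }
```

Returning `None` for an undefined rate is a design choice of this sketch: it keeps a sample that happens to contain no malicious (or no clean) files from raising `ZeroDivisionError`, a case the formulas above leave unspecified.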
The sampling strategy uses a configurable clean-to-malicious ratio to reflect real-world conditions where most files are benign.
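One way to realize such a sampling step is sketched below. It is an illustration under stated assumptions rather than the benchmark's own code: the function name `sample_benchmark_set` and its parameters are hypothetical, and the ratio is expressed as `clean_ratio`, the fraction of the sample drawn from the clean dataset (for example, `0.8` for a 4:1 clean-to-malicious mix). A fixed seed keeps runs reproducible so that different scanners are compared on the same files.

```python
import random
from typing import List, Sequence, Tuple


def sample_benchmark_set(
    clean_files: Sequence[str],
    malicious_files: Sequence[str],
    total: int,
    clean_ratio: float,
    seed: int = 0,
) -> List[Tuple[str, bool]]:
    """Draw `total` labelled files, with `clean_ratio` of them clean.

    Returns (path, is_malicious) pairs in shuffled order.
    All names here are hypothetical and chosen for illustration.
    """
    if not 0.0 <= clean_ratio <= 1.0:
        raise ValueError("clean_ratio must be between 0 and 1")

    n_clean = round(total * clean_ratio)
    n_malicious = total - n_clean
    if n_clean > len(clean_files) or n_malicious > len(malicious_files):
        raise ValueError("not enough files to satisfy the requested sample")

    rng = random.Random(seed)  # private RNG so the global state is untouched
    labelled = [(path, False) for path in rng.sample(list(clean_files), n_clean)]
    labelled += [(path, True) for path in rng.sample(list(malicious_files), n_malicious)]
    rng.shuffle(labelled)  # avoid scanning all clean files before malicious ones
    return labelled
```

Sampling without replacement, as `random.sample` does, ensures no file is counted twice in the confusion matrix.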