Heuristic:Trailofbits Fickling Race Condition Prevention
| Knowledge Sources | |
|---|---|
| Domains | Security, Defensive_Programming |
| Last Updated | 2026-02-14 13:00 GMT |
Overview
Anti-TOCTOU (Time-of-Check-Time-of-Use) pattern used in fickling's safe loader to prevent race conditions between safety analysis and actual deserialization.
Description
Fickling's `load()` function performs safety analysis on pickle data and then deserializes it. A naive implementation would analyze the file, then re-read it for deserialization, creating a window where the file could be swapped by an attacker (a TOCTOU vulnerability). Fickling prevents this by parsing the pickle data into an internal representation, analyzing that representation, and then serializing it back to bytes for `pickle.loads()` — never re-reading the original file.
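The window described above can be reproduced with the standard library alone. The following is a minimal sketch, not fickling code: `naive_check`, `naive_load`, and `demo_swap` are hypothetical names, and the "attacker" is simulated by overwriting the file between the two reads with a second harmless payload.

```python
import os
import pickle
import tempfile


def naive_check(path):
    """Hypothetical checker: opens the path and returns the bytes it inspected."""
    with open(path, "rb") as f:
        return f.read()


def naive_load(path):
    """Re-opens the same path, so it may see different bytes than naive_check."""
    with open(path, "rb") as f:
        return pickle.load(f)


def demo_swap():
    """Return (object that was checked, object that was actually loaded)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "model.pkl")
        with open(path, "wb") as f:
            pickle.dump({"weights": [1, 2, 3]}, f)

        checked_bytes = naive_check(path)  # time of check

        # Simulated attacker: replace the file after the check has finished.
        with open(path, "wb") as f:
            pickle.dump({"weights": "swapped"}, f)

        loaded = naive_load(path)  # time of use: second read of the path

    return pickle.loads(checked_bytes), loaded


checked_obj, loaded_obj = demo_swap()
print("checked:", checked_obj)
print("loaded: ", loaded_obj)
```

The two printed objects differ, which is the mismatch that a single read followed by in-memory analysis is meant to rule out.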
Usage
Use this heuristic when implementing any check-then-use pattern for pickle data. The same principle applies to any security tool that validates data before processing it: always operate on the same copy of the data for both validation and use.
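The same idea outside of pickle might look like the sketch below, which validates and parses a JSON payload from one in-memory buffer. The size limit, the `validate` rules, and the function names are illustrative assumptions, not part of fickling.

```python
import io
import json

MAX_BYTES = 1_000_000  # hypothetical size limit chosen for this sketch


def validate(data: bytes) -> None:
    """Hypothetical validator: enforce a size limit and require a JSON object."""
    if len(data) > MAX_BYTES:
        raise ValueError("input too large")
    if not isinstance(json.loads(data), dict):
        raise ValueError("expected a JSON object")


def validate_then_use(stream):
    """Read the source once, then validate and parse the same bytes."""
    data = stream.read()     # the only read of the external source
    validate(data)           # check the in-memory copy
    return json.loads(data)  # use the copy that was just checked


print(validate_then_use(io.BytesIO(b'{"name": "example"}')))
```

Because `validate` and `json.loads` both receive `data`, nothing that happens to the underlying file or stream after the single `read()` can affect what is used.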
The Insight (Rule of Thumb)
- Action: Parse pickle data into an internal representation first, analyze that representation, then call `pickle.loads(pickled.dumps())` on the serialized internal copy.
- Value: Eliminates the TOCTOU window between analysis and loading: the bytes passed to `pickle.loads()` are generated from the same in-memory representation that was analyzed, not re-read from the file.
- Trade-off: Higher memory usage and some re-serialization work, because the pickle data is held in memory as both parsed opcodes and re-serialized bytes. The overhead grows with file size but is usually acceptable.
- Anti-pattern: Never do `check_safety(file); pickle.load(file)` — the file could be replaced between the two calls.
Reasoning
The comment in the source code explicitly documents this design decision:
Code evidence from `fickling/loader.py:18-24`:
    pickled_data = Pickled.load(file, fail_on_decode_error=False)
    result = check_safety(pickled=pickled_data, json_output_path=json_output_path)
    if result.severity <= max_acceptable_severity and not pickled_data.has_invalid_opcode:
        # We don't do pickle.load(file) because it could allow for a race
        # condition where the file we check is not the same that gets
        # loaded after the analysis.
        return pickle.loads(pickled_data.dumps(), *args, **kwargs)
The flow is:
- `Pickled.load(file)` — parse the file into fickling's opcode representation
- `check_safety(pickled=pickled_data)` — analyze the parsed representation
- `pickle.loads(pickled_data.dumps())` — serialize the analyzed representation back to bytes and deserialize
At no point is the original file re-read. This guarantees that what was analyzed is what gets executed.
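The three steps can be imitated with the standard library to make the flow concrete. This is a rough stand-in under stated assumptions, not fickling's implementation: `safe_load`, `UnsafePickleError`, and the `BLOCKED_OPCODES` set are hypothetical, the opcode blocklist is far cruder than fickling's analysis, and the sketch loads the buffered bytes directly instead of re-serializing a parsed representation.

```python
import io
import pickle
import pickletools

# Opcodes that can import or call arbitrary objects. Rejecting all of them is
# a deliberately crude policy chosen for this sketch.
BLOCKED_OPCODES = {
    "GLOBAL", "STACK_GLOBAL", "REDUCE", "INST",
    "OBJ", "NEWOBJ", "NEWOBJ_EX", "BUILD",
}


class UnsafePickleError(Exception):
    """Hypothetical exception raised when the analysis rejects the data."""


def safe_load(file):
    data = file.read()  # single read of the source

    # Step 1: parse the buffered bytes into a list of opcode names.
    opcodes = [op.name for op, _arg, _pos in pickletools.genops(data)]

    # Step 2: analyze the parsed representation.
    blocked = sorted(BLOCKED_OPCODES.intersection(opcodes))
    if blocked:
        raise UnsafePickleError("blocked opcodes: " + ", ".join(blocked))

    # Step 3: deserialize the same in-memory bytes; the file is not touched again.
    return pickle.loads(data)


print(safe_load(io.BytesIO(pickle.dumps({"weights": [1, 2, 3]}))))
```

A pickle that references a global, such as `pickle.dumps(print)`, is rejected in step 2 and never reaches `pickle.loads()`.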