Workflow:Trailofbits Fickling Pickle Decompilation and Tracing
| Knowledge Sources | |
|---|---|
| Domains | Reverse_Engineering, Malware_Analysis, Forensics |
| Last Updated | 2026-02-14 13:00 GMT |
Overview
End-to-end process for decompiling Python pickle bytecode into human-readable Python source code and tracing its virtual machine execution for forensic analysis.
Description
This workflow enables security researchers and incident responders to understand the behavior of pickle files without executing them. Fickling's symbolic interpreter converts raw pickle bytecode into a Python AST, which can be rendered as readable Python source code. The tracing mode provides a step-by-step log of every pickle virtual machine operation: each opcode executed, values pushed to and popped from the stack, memo table mutations, and variable assignments.
This capability is essential for malware analysis of pickle-based exploits, where payloads are often obfuscated through nested imports, getattr chains, and multi-stage construction patterns. The decompiled output reveals the true intent of the serialized object.
Usage
Execute this workflow when investigating a suspicious pickle file, performing incident response on a compromised ML pipeline, analyzing a known malicious sample for threat intelligence, or educating teams about pickle deserialization attacks. It is also useful for debugging pickle serialization issues.
Execution Steps
Step 1: Obtain the Pickle File
Acquire the target pickle file for analysis. This may be a standalone .pkl file, a pickle embedded within a PyTorch .pt/.pth ZIP archive, a NumPy .npy file with object arrays, or raw pickle bytes from a network capture.
Key considerations:
- Do not use pickle.load() on untrusted files; use Fickling's safe loader or operate directly on bytes
- For PyTorch files, extract data.pkl from the ZIP archive first, or use PyTorchModelWrapper
- Pickle bytes can be loaded from any file-like object or BytesIO wrapper
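The PyTorch extraction step can be sketched with the standard library alone. This is a minimal illustration, not Fickling's own extraction logic: the helper name `extract_pickle_members` is hypothetical, and it assumes the archive stores its pickle streams as entries ending in `.pkl` (such as `data.pkl`). Nothing is unpickled; the function only copies raw bytes out of the ZIP container.

```python
import io
import zipfile


def extract_pickle_members(archive_bytes: bytes) -> dict:
    """Return {member name: raw bytes} for every .pkl entry in a ZIP archive.

    Hypothetical helper: it reads bytes only and never unpickles them.
    """
    members = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith(".pkl"):
                members[name] = zf.read(name)
    return members
```

The returned bytes can then be handed to a parser in the next step without ever touching `pickle.load()`.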
Step 2: Parse into Fickling Representation
Load the raw bytes into Fickling's Pickled object, which parses the opcode stream without executing anything. For files containing multiple stacked pickle streams, use StackedPickle.load() to parse all of them.
Key considerations:
- The parser handles all pickle protocol versions (0 through 5)
- Invalid opcodes are flagged but do not halt parsing
- The Pickled object provides access to the raw opcode list, properties, and AST
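To illustrate what parsing without execution looks like, the sketch below uses the standard library's `pickletools.genops` as a stand-in for Fickling's parser. It is not Fickling's implementation, and the helper name `parse_stacked` is hypothetical. Unlike the behavior described above, `genops` raises `ValueError` on an invalid opcode instead of flagging it and continuing.

```python
import io
import pickle
import pickletools


def parse_stacked(data: bytes) -> list:
    """Return one opcode list per pickle stream in `data`, executing nothing.

    genops stops after each STOP opcode and leaves the file position just
    past it, so calling it repeatedly walks through stacked streams.
    """
    stream = io.BytesIO(data)
    streams = []
    while stream.tell() < len(data):
        ops = [(op.name, arg) for op, arg, _pos in pickletools.genops(stream)]
        streams.append(ops)
    return streams


# Two pickles concatenated back to back form a simple stacked pickle.
stacked = pickle.dumps([1, 2]) + pickle.dumps("x")
for index, ops in enumerate(parse_stacked(stacked)):
    print(index, [name for name, _arg in ops])
```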
Step 3: Decompile to Python Source
Create an Interpreter instance from the Pickled object and generate a Python AST. Unparse the AST to produce readable Python source code that represents what the pickle would execute during deserialization.
Key considerations:
- The output is equivalent Python code, not the exact original source
- Variable names are auto-generated (_var0, _var1, etc.) since pickle bytecode has no variable names
- The result variable holds the final deserialized object
- For stacked pickles, each sub-pickle gets its own result variable (result0, result1, etc.)
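A minimal sketch of the decompilation step follows. It assumes the `fickling` package is installed and exposes `fickling.fickle.Pickled` with a `load()` method and an `ast` property, as shown in the project's README; the `ast` property is assumed to run the interpreter internally. The wrapper name `decompile` is hypothetical, and the import is guarded so the snippet degrades to returning `None` when Fickling is not available.

```python
import ast
import pickle

try:
    # Assumption: Fickling is installed and matches the API in its README.
    from fickling.fickle import Pickled
except ImportError:
    Pickled = None


def decompile(data: bytes):
    """Return Python source equivalent to the pickle, or None without Fickling."""
    if Pickled is None:
        return None
    pickled = Pickled.load(data)     # parses the opcode stream only
    return ast.unparse(pickled.ast)  # ast.unparse requires Python 3.9+


source = decompile(pickle.dumps([1, 2, 3]))
if source is not None:
    print(source)
```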
Step 4: Trace Execution (Optional)
For deeper analysis, create a Trace object wrapping the Interpreter to get a step-by-step execution log. The trace shows every opcode, stack push/pop, memo read/write, and statement generated.
Key considerations:
- The trace output interleaves opcode names with indented stack operations
- This is especially useful for understanding multi-stage exploit chains
- The trace runs the interpreter to completion and returns the final AST
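The layout of a trace (opcode names followed by indented stack operations) can be approximated statically with the standard library. The sketch below is not Fickling's tracer: it prints the declared stack effect that `pickletools` records for each opcode, not the concrete values a real trace would show. The helper name `stack_effects` is hypothetical.

```python
import pickletools


def stack_effects(data: bytes) -> list:
    """List each opcode with the stack object kinds it pops and pushes.

    The effects come from pickletools opcode metadata, so they describe
    kinds of objects ("any", "mark", ...) and not actual runtime values.
    """
    lines = []
    for op, arg, pos in pickletools.genops(data):
        header = op.name if arg is None else f"{op.name} {arg!r}"
        lines.append(f"{pos:4d}: {header}")
        for obj in op.stack_before:
            lines.append(f"        pops   {obj.name}")
        for obj in op.stack_after:
            lines.append(f"        pushes {obj.name}")
    return lines


# Classic protocol 0 payload that would call os.system on load.
# It is only parsed here and never executed.
payload = b"cos\nsystem\n(S'echo hi'\ntR."
print("\n".join(stack_effects(payload)))
```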
Step 5: Analyze the Decompiled Output
Review the decompiled Python source or trace output to understand the pickle's behavior. Look for import chains targeting dangerous modules; calls to eval, exec, or os.system; getattr chains that construct callable references; and unused variable assignments, which indicate that a call was made only for its side effects.
Key considerations:
- Common malicious patterns include __import__('os').system(...) and getattr(__import__('builtins'), 'eval')(...)
- Unused variables assigned to function call results are a strong indicator of malicious side effects
- The decompiled output can be shared in security advisories and threat reports
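As a rough complement to manual review, the import check can be sketched with the standard library. This is a heuristic illustration, not Fickling's analysis: the names `find_imports`, `flag_dangerous`, and the `DANGEROUS_MODULES` set are hypothetical, the module list is illustrative and incomplete, and the `STACK_GLOBAL` handling simply pairs the two most recent string pushes, which obfuscated payloads can defeat (for example by fetching names from the memo table).

```python
import pickle
import pickletools

# Illustrative, incomplete list of modules worth a closer look.
DANGEROUS_MODULES = {"os", "posix", "nt", "subprocess", "builtins", "sys", "socket"}

STRING_OPCODES = {
    "STRING", "BINSTRING", "SHORT_BINSTRING",
    "UNICODE", "BINUNICODE", "SHORT_BINUNICODE", "BINUNICODE8",
}


def find_imports(data: bytes) -> list:
    """Return (module, name) pairs referenced by import opcodes, without executing."""
    imports = []
    recent_strings = []
    for op, arg, _pos in pickletools.genops(data):
        if op.name in ("GLOBAL", "INST"):
            # pickletools reports the argument as "module name".
            module, _, name = arg.partition(" ")
            imports.append((module, name))
        elif op.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: module and name are usually the last two strings pushed.
            imports.append((recent_strings[-2], recent_strings[-1]))
        elif op.name in STRING_OPCODES:
            recent_strings.append(arg)
    return imports


def flag_dangerous(data: bytes) -> list:
    """Return the imports whose top-level module is in DANGEROUS_MODULES."""
    return [(m, n) for m, n in find_imports(data)
            if m.split(".")[0] in DANGEROUS_MODULES]


# Parsed only, never loaded.
print(flag_dangerous(b"cos\nsystem\n(S'echo hi'\ntR."))
```

An empty result from a check like this does not show that a file is safe; it only narrows where to look in the decompiled source.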