Heuristic:LaurentMazare Tch rs Safetensors Format Preference
| Knowledge Sources | |
|---|---|
| Domains | Optimization, Deep_Learning |
| Last Updated | 2026-02-08 13:00 GMT |
Overview
Prefer the safetensors format over pickle-based `.pt`/`.bin` files for model weight serialization, gaining zero-copy loading, reduced memory usage, and elimination of arbitrary code execution risks.
Description
tch-rs supports three weight file formats: safetensors (`.safetensors`), pickle (`.bin`/`.pt`), and the native libtorch C++ module format. The safetensors format, developed by Hugging Face, supports zero-copy, memory-mapped loading: tensor data can be used directly from the mapped file instead of being copied into a second buffer. It also avoids Python's pickle deserialization, which can execute arbitrary code. `VarStore::load()` detects the format automatically from the file extension.
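To make the zero-copy idea concrete, here is a minimal std-only sketch of the safetensors layout as described in the format's public documentation: an 8-byte little-endian header length, a JSON header, then the raw tensor bytes. It is not the tch-rs or `safetensors` crate implementation. The function names, the tensor name `w`, and the use of an in-memory `Vec<u8>` in place of a memory-mapped file are all assumptions made for illustration.

```rust
/// Splits a safetensors byte buffer into its JSON header and raw data region.
/// Both returned slices borrow from `buf`; no tensor bytes are copied.
/// (Hypothetical helper, not part of tch-rs.)
fn split_safetensors(buf: &[u8]) -> Option<(&str, &[u8])> {
    // First 8 bytes: header length as a little-endian u64.
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(buf.get(..8)?);
    let header_len = u64::from_le_bytes(len_bytes) as usize;

    // Next `header_len` bytes: UTF-8 JSON describing every tensor.
    let header_end = 8usize.checked_add(header_len)?;
    let header = std::str::from_utf8(buf.get(8..header_end)?).ok()?;

    // Everything after the header is the tensor data region.
    Some((header, &buf[header_end..]))
}

/// Builds a tiny in-memory file holding one F32 tensor named "w"
/// (the name and values are made up for this example).
fn build_example() -> Vec<u8> {
    let header = r#"{"w":{"dtype":"F32","shape":[2],"data_offsets":[0,8]}}"#;
    let mut buf = Vec::new();
    buf.extend_from_slice(&(header.len() as u64).to_le_bytes());
    buf.extend_from_slice(header.as_bytes());
    buf.extend_from_slice(&1.0f32.to_le_bytes());
    buf.extend_from_slice(&2.0f32.to_le_bytes());
    buf
}

fn main() {
    let file = build_example();
    let (header, data) = split_safetensors(&file).expect("well-formed buffer");

    // Offsets [0, 8] are taken from the header built above; a real reader
    // would parse the JSON instead of hard-coding them.
    let w = &data[0..8];

    println!("header: {header}");
    println!("tensor w: {} bytes borrowed from the original buffer", w.len());
}
```

Because `w` is only a sub-slice of `file`, the same pattern applied to a memory-mapped region would expose tensor bytes without a second allocation, which is the property the description refers to.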
Usage
Use the safetensors format whenever saving or distributing model weights, especially for LLM workflows. The LLaMA example requires safetensors (via `convert_checkpoint.py`). When exporting weights from PyTorch in Python, use `safetensors.torch.save_file()` instead of `torch.save()`. Note that safetensors cannot save sparse or non-contiguous tensors.
The Insight (Rule of Thumb)
- Action: Save weights as `.safetensors` instead of `.pt` or `.bin`. In Python: `safetensors.torch.save_file(model.state_dict(), 'model.safetensors')`.
- Value: N/A (format choice, not a numeric parameter).
- Trade-off: Sparse and non-contiguous tensors cannot be serialized; tensors must be made contiguous (for example with `.contiguous()`) before saving.
- Compatibility: Requires `safetensors` >= 0.3.0 (Rust dependency in tch-rs). Python side requires `pip install safetensors`.
Reasoning
With zero-copy deserialization, the memory-mapped file serves directly as the tensor data, with no intermediate buffers. This matters most for large language models (7B+ parameters), where duplicating the weights during loading could exceed available RAM. The pickle format requires full deserialization into Python objects followed by conversion, which can roughly double peak memory usage. In addition, unpickling can execute arbitrary Python code, so pickle files from untrusted sources are a security risk.
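The memory argument can be checked with back-of-the-envelope arithmetic. The sketch below assumes a model with exactly 7 billion parameters stored as 16-bit values, and treats "two full copies" as the worst case for a load path that stages the weights before converting them. These figures are illustrative assumptions, not measurements of tch-rs.

```rust
/// Bytes needed to hold `params` weights at `bytes_per_param` bytes each.
fn weight_bytes(params: u64, bytes_per_param: u64) -> u64 {
    params * bytes_per_param
}

/// Converts a byte count to GiB for display.
fn to_gib(bytes: u64) -> f64 {
    bytes as f64 / (1u64 << 30) as f64
}

fn main() {
    let params: u64 = 7_000_000_000; // assumed: 7B-parameter model
    let bytes_per_param: u64 = 2; // assumed: 16-bit weights

    let single = weight_bytes(params, bytes_per_param);
    let doubled = 2 * single; // assumed worst case: staging copy plus final tensors

    println!("one copy of the weights:  {:.1} GiB", to_gib(single));
    println!("two copies while loading: {:.1} GiB", to_gib(doubled));
}
```

Under these assumptions a single copy is about 13 GiB and a duplicated load is about 26 GiB, which is the gap that decides whether the model fits on a machine with, say, 16 GiB of RAM.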
Code Evidence
Format auto-detection from `src/nn/var_store.rs:231-234`:
/// The format of the file is deduced from the file extension:
/// - `.safetensors`: The file is assumed to be in safetensors format.
/// - `.bin` or `.pt`: The file is assumed to be in pickle format.
/// - Otherwise, the file is assumed to be in libtorch C++ module format.
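The dispatch rule in this doc comment can be sketched with the standard library alone. The `WeightFormat` enum and `detect_format` function below are hypothetical names introduced for illustration; they are not part of the tch-rs API, and the case-sensitive extension match is an assumption rather than confirmed behavior.

```rust
use std::path::Path;

/// Hypothetical enum for illustration; not a type exported by tch-rs.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum WeightFormat {
    Safetensors,
    Pickle,
    LibtorchModule,
}

/// Maps a path to a weight format using only its file extension,
/// following the three cases listed in the doc comment.
fn detect_format<P: AsRef<Path>>(path: P) -> WeightFormat {
    match path.as_ref().extension().and_then(|ext| ext.to_str()) {
        Some("safetensors") => WeightFormat::Safetensors,
        Some("bin") | Some("pt") => WeightFormat::Pickle,
        // Any other extension, or none at all, falls through to libtorch.
        _ => WeightFormat::LibtorchModule,
    }
}

fn main() {
    for name in ["model.safetensors", "pytorch_model.bin", "weights.pt", "model.ot"] {
        println!("{name} -> {:?}", detect_format(name));
    }
}
```

One practical consequence of extension-based detection is that renaming a file changes how it is parsed, so a safetensors file saved with a `.bin` suffix would be treated as pickle.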
Safetensors validation constraints from `src/tensor/safetensors.rs:72-78`:
if tensor.is_sparse() {
    return Err(TchError::Convert("Cannot save sparse tensors".to_string()));
}
if !tensor.is_contiguous() {
    return Err(TchError::Convert("Cannot save non contiguous tensors".to_string()));
}