Principle: LaurentMazare/tch-rs Weight Loading
| Knowledge Sources | |
|---|---|
| Domains | Deep_Learning, Model_Serialization |
| Last Updated | 2026-02-08 14:00 GMT |
Overview
Mechanism for restoring pretrained or saved model parameters from disk by matching named tensors to VarStore variables.
Description
Weight loading deserializes tensor data from a file and copies it into the correspondingly named variables in a VarStore. Tensors are matched by name, and each copy runs inside a no_grad context so that the operation is not tracked in the computation graph. Three serialization formats are supported: the libtorch C++ format (.ot), safetensors (.safetensors), and Python pickle (.bin, .pt); the format is detected automatically from the file extension. Missing or extra keys are silently ignored, which enables partial-loading scenarios such as transfer learning.
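The extension-based format detection described above can be sketched with a small standalone function. The `WeightFormat` enum and `detect_format` function are illustrative stand-ins, not tch-rs API:

```rust
use std::path::Path;

// The three on-disk formats the card lists; names are illustrative.
#[derive(Debug, PartialEq)]
enum WeightFormat {
    LibTorch,    // .ot (libtorch C++ archive)
    SafeTensors, // .safetensors
    Pickle,      // .bin / .pt (Python pickle)
}

// Map a file extension to a format; unknown extensions yield None.
// (A real implementation might also normalize case.)
fn detect_format(path: &Path) -> Option<WeightFormat> {
    match path.extension()?.to_str()? {
        "ot" => Some(WeightFormat::LibTorch),
        "safetensors" => Some(WeightFormat::SafeTensors),
        "bin" | "pt" => Some(WeightFormat::Pickle),
        _ => None,
    }
}

fn main() {
    assert_eq!(detect_format(Path::new("model.ot")), Some(WeightFormat::LibTorch));
    assert_eq!(detect_format(Path::new("model.safetensors")), Some(WeightFormat::SafeTensors));
    assert_eq!(detect_format(Path::new("weights.bin")), Some(WeightFormat::Pickle));
    assert_eq!(detect_format(Path::new("notes.txt")), None);
    println!("ok");
}
```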
Usage
Use after model instantiation to load pretrained weights for inference or fine-tuning. The VarStore must already contain the model's parameter structure (created by layer constructors).
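A minimal usage sketch with the tch crate, assuming a libtorch installation; the two-layer network and the file name model.ot are placeholders. The layer constructors register variables in the VarStore first, then load copies the saved tensors over them:

```rust
use tch::{nn, Device};

fn main() -> Result<(), tch::TchError> {
    // Build the model first so the VarStore contains the parameter structure.
    let mut vs = nn::VarStore::new(Device::Cpu);
    let root = vs.root();
    let _net = nn::seq()
        .add(nn::linear(&root / "fc1", 784, 128, Default::default()))
        .add_fn(|xs| xs.relu())
        .add(nn::linear(&root / "fc2", 128, 10, Default::default()));
    // Restore pretrained weights; the format is picked from the extension.
    vs.load("model.ot")?;
    Ok(())
}
```

Loading after construction (rather than during it) is what makes the name-matching scheme work: each constructor decides its variables' names via the path, and the file's tensors are matched against exactly those names.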
Theoretical Basis
Weight Loading Process:
1. Open file, detect format by extension
2. Deserialize named tensors from file
3. For each (name, tensor) in file:
a. Look up name in VarStore's named_variables
b. If found: no_grad { variable.copy_(tensor) }
c. If not found: skip silently
4. Missing VarStore variables keep their random initialization
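The loop above can be simulated with std-only code, modeling the VarStore as a map from names to tensors (here `Vec<f32>`). All types and names are illustrative stand-ins, not tch-rs API:

```rust
use std::collections::HashMap;

// Copy file tensors into store entries with matching names.
// Returns (copied, skipped) counts.
fn load_by_name(
    store: &mut HashMap<String, Vec<f32>>,
    file_tensors: &HashMap<String, Vec<f32>>,
) -> (usize, usize) {
    let (mut copied, mut skipped) = (0, 0);
    for (name, tensor) in file_tensors {
        match store.get_mut(name) {
            // Found: overwrite in place (stands in for the no_grad copy_).
            Some(var) => {
                var.clone_from(tensor);
                copied += 1;
            }
            // Not found in the store: skip silently.
            None => skipped += 1,
        }
    }
    // Store variables absent from the file keep their initial values.
    (copied, skipped)
}

fn main() {
    let mut store = HashMap::from([
        ("fc1.weight".to_string(), vec![0.0; 4]),
        ("fc1.bias".to_string(), vec![0.0; 2]),
        ("head.weight".to_string(), vec![0.0; 2]), // absent from file
    ]);
    let file = HashMap::from([
        ("fc1.weight".to_string(), vec![1.0; 4]),
        ("fc1.bias".to_string(), vec![1.0; 2]),
        ("old_head.weight".to_string(), vec![9.0; 2]), // absent from store
    ]);
    let (copied, skipped) = load_by_name(&mut store, &file);
    assert_eq!((copied, skipped), (2, 1));
    // The unmatched store variable keeps its initialization (step 4).
    assert_eq!(store["head.weight"], vec![0.0; 2]);
    println!("copied {copied}, skipped {skipped}");
}
```

The skip-silently branch is what enables transfer learning: a new head keeps its random initialization while the shared backbone is overwritten from the checkpoint.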