Implementation:Norrrrrrr lyn WAInjectBench torch save Checkpoint

Knowledge Sources	WAInjectBench PyTorch torch.save
Domains	Model_Management, Deep_Learning
Last Updated	2026-02-14 16:00 GMT

Overview

Concrete tool for saving LLaVA fine-tuning checkpoints together with training metadata, using PyTorch's torch.save as called in the WAInjectBench train/llava-ft.py script.

Description

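The checkpoint-saving code in train/llava-ft.py (L394-408) uses torch.save to write a dictionary containing the model state dict, the optimizer state dict, the epoch number, the best TPR so far, and the AMP configuration (an amp_enabled flag and the AMP dtype stored as a string, or "fp32" when no AMP dtype is set). The file is named best_epoch{N}_tpr{X.XXXX}.pt, with the TPR formatted to four decimal places, and is written under the --out_dir directory (default "runs/ft"). A checkpoint is saved only when the current epoch's TPR strictly exceeds the previous best, so epochs that tie or regress write nothing. Because the epoch and TPR are part of the filename, each improvement produces a new file rather than overwriting the previous one.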

Usage

No manual call is needed: the save runs automatically inside the validation loop whenever an epoch achieves a new best TPR.
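
The save-on-improvement rule can be sketched without a model at all. The following is a minimal illustration, not the repository's code: the helper name saved_checkpoints, the starting value of best_tpr, and the zero-based epoch numbering are all assumptions made for the example. It only reproduces the strict comparison and the filename pattern.

```python
import os


def saved_checkpoints(tprs, out_dir="runs/ft"):
    """Return the paths that would be written for a sequence of per-epoch TPRs.

    Hypothetical helper for illustration; it does not call torch.save.
    """
    best_tpr = float("-inf")  # assumption: the first TPR always counts as an improvement
    paths = []
    for epoch, tpr in enumerate(tprs):  # assumption: epochs are numbered from 0
        if tpr > best_tpr:  # strict comparison, so a tie does not trigger a save
            best_tpr = tpr
            paths.append(os.path.join(out_dir, f"best_epoch{epoch}_tpr{tpr:.4f}.pt"))
    return paths


# Illustrative TPR values: epoch 1 regresses and epoch 3 ties, so neither is saved.
for path in saved_checkpoints([0.90, 0.88, 0.9523, 0.9523]):
    print(path)
```

With these illustrative values only epochs 0 and 2 produce a file, which is why a run typically ends with fewer checkpoints than epochs.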

Code Reference

Source Location

Repository: WAInjectBench
File: train/llava-ft.py (L394-408)

Signature

if tpr > best_tpr:
    best_tpr = tpr
    best_path = os.path.join(args.out_dir, f"best_epoch{epoch}_tpr{tpr:.4f}.pt")
    torch.save(
        {
            "epoch": epoch,
            "model_state": model.state_dict(),
            "optimizer_state": optim.state_dict(),
            "best_tpr": best_tpr,
            "amp_enabled": state.use_amp,
            "amp_dtype": str(amp_dtype) if amp_dtype is not None else "fp32",
        },
        best_path
    )
    print(f"Saved: {best_path}")

Import

import torch
import os

I/O Contract

Inputs

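Name	Type	Required	Description
model	nn.Module	Yes	Trained model whose state_dict is saved
optim	torch.optim.AdamW	Yes	Optimizer whose state_dict is saved
epoch	int	Yes	Current epoch number
tpr	float	Yes	TPR of the current epoch, compared against best_tpr
best_tpr	float	Yes	Best TPR achieved so far; updated to tpr before saving
state	TrainState	Yes	Training state; its use_amp flag is stored as amp_enabled
amp_dtype	torch.dtype or None	Yes	AMP dtype; stored as a string, or "fp32" when None
out_dir	str	Yes	Output directory, taken from --out_dir (default "runs/ft")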

Outputs

Name	Type	Description
.pt file	File	Checkpoint at {out_dir}/best_epoch{N}_tpr{X.XXXX}.pt containing model_state, optimizer_state, epoch, best_tpr, amp_enabled, amp_dtype
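
Since every improvement writes a separate file, a run can leave several checkpoints in out_dir. One way to pick the best one afterwards is to parse the filenames, as in the sketch below. The function find_best_checkpoint is a hypothetical helper, not part of WAInjectBench, and it assumes the directory only needs to be matched against the best_epoch{N}_tpr{X.XXXX}.pt pattern described above.

```python
import re
from pathlib import Path
from typing import Optional

# Matches names such as best_epoch2_tpr0.9523.pt (TPR written with four decimals).
_CKPT_RE = re.compile(r"^best_epoch(\d+)_tpr(\d+\.\d{4})\.pt$")


def find_best_checkpoint(out_dir: str) -> Optional[Path]:
    """Return the checkpoint with the highest TPR in its filename, or None if there is none."""
    best_key = None
    best_path = None
    for path in Path(out_dir).glob("best_epoch*_tpr*.pt"):
        match = _CKPT_RE.match(path.name)
        if match is None:
            continue  # skip files that only loosely resemble the pattern
        # Compare by TPR first, then by epoch: two names can show the same
        # rounded TPR, and the later epoch is the one that actually improved.
        key = (float(match.group(2)), int(match.group(1)))
        if best_key is None or key > best_key:
            best_key, best_path = key, path
    return best_path
```

Calling find_best_checkpoint("runs/ft") returns a Path that can be passed to torch.load, or None when the directory holds no matching file.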

Usage Examples

Saving and Loading Checkpoints

import torch

# Save checkpoint (model and optim are the already-constructed model and optimizer)
torch.save({
    "epoch": 2,
    "model_state": model.state_dict(),
    "optimizer_state": optim.state_dict(),
    "best_tpr": 0.9523,
    "amp_enabled": True,
    "amp_dtype": "torch.bfloat16",
}, "runs/ft/best_epoch2_tpr0.9523.pt")

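# Load checkpoint for inference; map_location="cpu" avoids needing the original GPU
ckpt = torch.load("runs/ft/best_epoch2_tpr0.9523.pt", map_location="cpu")
model.load_state_dict(ckpt["model_state"])
model.eval()  # disable dropout and other training-only behaviour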
print(f"Loaded epoch {ckpt['epoch']} with TPR={ckpt['best_tpr']}")
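
The checkpoint also stores the optimizer state and the AMP settings, so it could be used to resume training as well as for inference. The round trip below is a self-contained sketch under stated assumptions: a tiny nn.Linear stands in for the LLaVA model, parse_amp_dtype is a hypothetical helper for turning the stored string back into a torch.dtype, and resuming at epoch + 1 is an assumed convention that the source does not confirm.

```python
import os
import tempfile

import torch
from torch import nn

# Tiny stand-ins for the real model and optimizer (illustrative only).
model = nn.Linear(4, 2)
optim = torch.optim.AdamW(model.parameters(), lr=1e-3)

# Take one optimizer step so there is optimizer state worth saving.
model(torch.randn(8, 4)).sum().backward()
optim.step()
optim.zero_grad()

ckpt_path = os.path.join(tempfile.mkdtemp(), "best_epoch2_tpr0.9523.pt")
torch.save(
    {
        "epoch": 2,
        "model_state": model.state_dict(),
        "optimizer_state": optim.state_dict(),
        "best_tpr": 0.9523,
        "amp_enabled": True,
        "amp_dtype": str(torch.bfloat16),  # stored as the string "torch.bfloat16"
    },
    ckpt_path,
)


def parse_amp_dtype(name):
    """Hypothetical helper: map the stored string to a torch.dtype, or None for "fp32"."""
    if name == "fp32":
        return None
    return getattr(torch, name.split(".")[-1])


# Restore into freshly built objects, as a resumed run would.
ckpt = torch.load(ckpt_path, map_location="cpu")
new_model = nn.Linear(4, 2)
new_optim = torch.optim.AdamW(new_model.parameters(), lr=1e-3)
new_model.load_state_dict(ckpt["model_state"])
new_optim.load_state_dict(ckpt["optimizer_state"])

start_epoch = ckpt["epoch"] + 1  # assumption: continue with the next epoch
best_tpr = ckpt["best_tpr"]      # later epochs must beat this to be saved
amp_dtype = parse_amp_dtype(ckpt["amp_dtype"])
```

Storing the dtype as a string keeps the checkpoint metadata plain, at the cost of a small conversion step like the one above when the value is needed again.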

Related Pages

Implements Principle

Principle:Norrrrrrr_lyn_WAInjectBench_Checkpoint_Export
