Implementation:VainF Torch Pruning Measure Latency
Metadata
| Field | Value |
|---|---|
| Source | Torch-Pruning |
| Domains | Deep_Learning, Benchmarking |
| Last Updated | 2026-02-08 00:00 GMT |
Overview
Concrete tool for measuring GPU inference latency provided by Torch-Pruning.
Description
measure_latency puts the model in eval mode, runs a number of warmup iterations, then times inference with torch.cuda.Event for GPU-accurate measurement. It returns the mean and standard deviation of the latency in milliseconds.
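The warmup-then-repeat scheme behind this measurement can be sketched in isolation. The helper below is a hypothetical CPU analogue using `time.perf_counter`; the real implementation times GPU kernels with `torch.cuda.Event` instead, but the structure (discard warmup runs, time `repeat` iterations, report mean and standard deviation in milliseconds) is the same:

```python
import time
import statistics

def sketch_measure_latency(fn, repeat=300, warmup=50):
    """Warm up, then time `fn` repeatedly; return (mean_ms, std_ms).

    Hypothetical sketch of the warmup/repeat pattern; Torch-Pruning's
    measure_latency uses torch.cuda.Event for device-side timing.
    """
    for _ in range(warmup):       # warmup: stabilize caches and clocks
        fn()
    samples = []
    for _ in range(repeat):       # timed iterations
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)  # ms
    return statistics.mean(samples), statistics.stdev(samples)

# Example with a trivial workload standing in for model inference
mean_ms, std_ms = sketch_measure_latency(
    lambda: sum(range(10_000)), repeat=50, warmup=5
)
print(f"{mean_ms:.3f} +/- {std_ms:.3f} ms")
```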
Code Reference
- Source: torch_pruning/utils/benchmark.py, Lines 6-43
- Signature:

```python
def measure_latency(model, example_inputs, repeat=300, warmup=50, run_fn=None):
    """Measure model inference latency.

    Returns:
        Tuple of (mean_latency_ms, std_latency_ms).
    """
```
- Import:

```python
import torch_pruning as tp
tp.utils.benchmark.measure_latency
```
I/O Contract
Inputs
| Parameter | Type | Required | Default |
|---|---|---|---|
| model | nn.Module | Yes | — |
| example_inputs | Tensor | Yes | — |
| repeat | int | No | 300 |
| warmup | int | No | 50 |
| run_fn | Callable | No | None |
Outputs
(mean_latency_ms: float, std_latency_ms: float)
Usage Examples
```python
import torch
import torch.nn as nn
from torch_pruning.utils.benchmark import measure_latency

# Build a simple model and move it to the GPU
# (measure_latency times with torch.cuda.Event, so a CUDA device is required)
model = nn.Sequential(
    nn.Conv2d(3, 64, 3, padding=1),
    nn.BatchNorm2d(64),
    nn.ReLU(),
    nn.Conv2d(64, 128, 3, padding=1),
).cuda().eval()
example_inputs = torch.randn(1, 3, 224, 224).cuda()

# Measure latency BEFORE pruning
mean_before, std_before = measure_latency(model, example_inputs)
print(f"Before pruning: {mean_before:.2f} +/- {std_before:.2f} ms")

# ... apply pruning ...

# Measure latency AFTER pruning
mean_after, std_after = measure_latency(model, example_inputs)
print(f"After pruning: {mean_after:.2f} +/- {std_after:.2f} ms")
print(f"Speedup: {mean_before / mean_after:.2f}x")
```