Implementation:Microsoft DeepSpeedExamples LossTracker AverageMeter Accuracy
| Knowledge Sources | |
|---|---|
| Domains | Deep Learning, Training Utilities |
| Last Updated | 2026-02-07 12:00 GMT |
Overview
Provides training utility classes including LossTracker, AverageMeter, ProgressMeter, and top-k accuracy computation for Vision Transformer fine-tuning experiments.
Description
This module contains several utility classes and functions used during training and evaluation of vision models. The LossTracker class aggregates loss, top-1 accuracy, and top-5 accuracy metrics through internal AverageMeter instances and formats progress output via a ProgressMeter. It provides a simple update() interface that computes accuracy from model outputs and targets, and a display() method that prints formatted progress at configurable intervals.
The AverageMeter class tracks running averages of metric values by maintaining a sum and count. It supports weighted updates via the n parameter and provides formatted string output. The ProgressMeter class formats and displays batch-level progress information with configurable prefix strings and batch formatting.
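The weighted running average described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's code: the class name `RunningAverage` and the exact output format are assumptions, and only the sum/count bookkeeping and the `n` weight mirror the description.

```python
class RunningAverage:
    """Hypothetical stand-in for an AverageMeter-style tracker."""

    def __init__(self, name, fmt=":f"):
        self.name = name
        self.fmt = fmt  # format spec applied to both the last value and the average
        self.reset()

    def reset(self):
        self.val = 0.0   # most recent value
        self.sum = 0.0   # weighted sum of all values seen so far
        self.count = 0   # total weight (e.g. number of samples)
        self.avg = 0.0   # running average = sum / count

    def update(self, val, n=1):
        # n weights the value, typically by the batch size it was computed over.
        self.val = val
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count

    def __str__(self):
        template = "{name} {val" + self.fmt + "} ({avg" + self.fmt + "})"
        return template.format(name=self.name, val=self.val, avg=self.avg)


meter = RunningAverage("Loss", ":.3f")
meter.update(0.50, n=32)  # batch of 32 samples with mean loss 0.50
meter.update(1.00, n=16)  # batch of 16 samples with mean loss 1.00
print(meter)              # prints: Loss 1.000 (0.667)
```

Weighting by `n` keeps the average correct when batches differ in size, for example when the last batch of an epoch is smaller than the rest.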
Additional utilities:
- get_model: creates a model and wraps it with DataParallel, with optional half-precision support.
- get_optimizer: creates an optimizer (SGD, Nesterov SGD, RMSprop, Adagrad, AdamW).
- get_scheduler: creates a learning rate scheduler (constant, step, exponential, cosine, multi-step).
- accuracy: computes top-k accuracy metrics.
- run_cmd: executes system commands as subprocesses.
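To make the top-k accuracy metric concrete, here is a dependency-free sketch that works on plain Python lists. It is an illustration of the metric only: the function name `topk_accuracy` is hypothetical, and the repository's `accuracy` function operates on `torch.Tensor` inputs rather than lists.

```python
def topk_accuracy(scores, targets, topk=(1,)):
    """Return one accuracy percentage per k in topk.

    scores:  list of per-sample score lists, shape (batch_size, num_classes)
    targets: list of integer class labels, length batch_size
    """
    batch_size = len(targets)
    results = []
    for k in topk:
        correct = 0
        for row, label in zip(scores, targets):
            # Indices of the k highest-scoring classes for this sample.
            ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
            if label in ranked:
                correct += 1
        results.append(100.0 * correct / batch_size)
    return results


scores = [
    [0.10, 0.70, 0.20],  # highest score: class 1
    [0.50, 0.30, 0.20],  # highest score: class 0, runner-up: class 1
    [0.20, 0.30, 0.50],  # highest score: class 2
    [0.40, 0.35, 0.25],  # highest score: class 0, runner-up: class 1
]
targets = [1, 1, 2, 2]

top1, top2 = topk_accuracy(scores, targets, topk=(1, 2))
print(top1, top2)  # prints: 50.0 75.0
```

A sample counts as correct at k if its true label appears anywhere among the k highest-scoring classes, so accuracy can only stay equal or increase as k grows.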
Usage
Use these utilities in ViT fine-tuning training loops to track and display training metrics, create models with appropriate parallelism wrappers, configure optimizers and schedulers, and compute classification accuracy.
Code Reference
Source Location
- Repository: Microsoft_DeepSpeedExamples
- File: training/data_efficiency/vit_finetuning/utils/utils.py
- Lines: 1-178
Signature
class LossTracker(object):
    def __init__(self, num, prefix='', print_freq=1):
        ...
    def update(self, loss, output, target):
        ...
    def display(self, step):
        ...

class AverageMeter(object):
    def __init__(self, name, fmt=':f'):
        ...
    def reset(self):
        ...
    def update(self, val, n=1):
        ...

class ProgressMeter(object):
    def __init__(self, num_batches, meters, prefix=""):
        ...
    def display(self, batch):
        ...

def accuracy(output, target, topk=(1,)):
    ...

def get_model(model_name, nchannels=3, imsize=32, nclasses=10, half=False):
    ...

def get_optimizer(optimizer_name, parameters, lr, momentum=0, weight_decay=0):
    ...

def get_scheduler(scheduler_name, optimizer, num_epochs, **kwargs):
    ...
Import
from utils.utils import LossTracker, AverageMeter, ProgressMeter, accuracy, get_model, get_optimizer, get_scheduler
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| num | int | Yes | Total number of batches (used for progress display formatting in LossTracker) |
| prefix | str | No | Prefix string for progress display (default: '') |
| print_freq | int | No | Display frequency in steps (default: 1) |
| loss | torch.Tensor | Yes | Scalar loss tensor (for LossTracker.update) |
| output | torch.Tensor | Yes | Model output logits of shape (batch_size, num_classes) |
| target | torch.Tensor | Yes | Ground truth labels of shape (batch_size,) |
| topk | tuple | No | Tuple of k values for top-k accuracy (default: (1,)) |
Outputs
| Name | Type | Description |
|---|---|---|
| tracker.losses.avg | float | Running average of the loss |
| tracker.top1.avg | float | Running average of top-1 accuracy |
| tracker.top5.avg | float | Running average of top-5 accuracy |
| accuracy result | list[torch.Tensor] | List of top-k accuracy percentages, one entry per k in topk |
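The sketch below illustrates how a tracker could expose the `.avg` outputs listed above while printing only at the configured interval. It is a simplified model under stated assumptions: `Meter` and `SimpleTracker` are hypothetical names, accuracy is passed in precomputed rather than derived from logits, and the rule "print when `step % print_freq == 0`" is an assumption about the gating, not something confirmed by this page.

```python
class Meter:
    """Tiny weighted running-average helper (hypothetical, illustration only)."""

    def __init__(self):
        self.sum, self.count, self.avg = 0.0, 0, 0.0

    def update(self, val, n=1):
        self.sum += val * n
        self.count += n
        self.avg = self.sum / self.count


class SimpleTracker:
    """Hypothetical LossTracker-like wrapper over precomputed metric values."""

    def __init__(self, num, prefix="", print_freq=1):
        self.num = num                # total number of batches, used in the progress line
        self.prefix = prefix
        self.print_freq = print_freq
        self.losses = Meter()
        self.top1 = Meter()

    def update(self, loss, acc1, batch_size):
        # Weight each metric by the number of samples in the batch.
        self.losses.update(loss, batch_size)
        self.top1.update(acc1, batch_size)

    def display(self, step):
        # Assumption: print only when step is a multiple of print_freq.
        if step % self.print_freq != 0:
            return None
        line = (f"{self.prefix}[{step}/{self.num}] "
                f"loss {self.losses.avg:.4f} acc@1 {self.top1.avg:.2f}")
        print(line)
        return line


tracker = SimpleTracker(num=4, prefix="Train: ", print_freq=2)
batches = [(1.0, 25.0), (0.8, 50.0), (0.6, 75.0), (0.4, 100.0)]  # (loss, top-1 %) per batch
shown = []
for step, (loss, acc1) in enumerate(batches):
    tracker.update(loss, acc1, batch_size=8)
    shown.append(tracker.display(step))  # a line at steps 0 and 2, None otherwise
```

Because the meters accumulate across the whole loop, the averages read after the last batch summarize the full epoch regardless of how often progress was printed.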
Usage Examples
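import torch.nn as nn

from utils.utils import LossTracker, get_model, get_optimizer

# Set up training components
model = get_model('resnet18', nchannels=3, imsize=32, nclasses=10)
optimizer = get_optimizer('adam', model.parameters(), lr=1e-3, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()  # assumed classification loss

# Create loss tracker for an epoch
# (train_loader is assumed to be an existing DataLoader defined elsewhere)
tracker = LossTracker(num=len(train_loader), prefix='Train: ', print_freq=10)

# Training loop
for step, (images, targets) in enumerate(train_loader):
    # Move the batch to the GPU once and reuse it below
    images, targets = images.cuda(), targets.cuda()
    outputs = model(images)
    loss = criterion(outputs, targets)

    # Clear stale gradients, backpropagate, then apply the update
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    # Record loss and top-k accuracy; print every print_freq steps
    tracker.update(loss, outputs, targets)
    tracker.display(step)

print(f"Epoch accuracy: {tracker.top1.avg:.2f}%")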