Principle:Axolotl ai cloud Axolotl Training Performance Profiling

Knowledge Sources	SwanLab Profiling
Domains	Profiling, Training, Performance
Last Updated	2026-02-07 00:00 GMT

Overview

Instrumentation strategy for measuring execution time of training pipeline components using decorators and context managers with configurable throttling and filtering.

Description

Training Performance Profiling addresses the challenge of identifying bottlenecks in ML training loops without introducing significant overhead. The principle defines three instrumentation patterns of increasing granularity:

(1) Decorator-based profiling for methods that should always be timed (training_step, compute_loss), at a cost of roughly 2-5 microseconds per call.
(2) Context manager profiling for timing specific code blocks within a method, for example the forward pass and the backward pass separately.
(3) Advanced filtered profiling, which adds a configurable minimum duration threshold and call frequency throttling for high-frequency operations.

All patterns are exception-safe: the duration is logged even when the timed code raises. All of them also read their settings from one centralized configuration. Profiling data is reported to the experiment tracker under a dedicated namespace (e.g., profiling/).
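The overhead and exception-safety claims above can be checked locally with a small sketch. The `timed` wrapper, the `noop` function, and `per_call_seconds` are hypothetical names introduced only for this illustration, and durations are appended to a list rather than sent to a tracker, so the measured number is best read as a rough lower bound that will vary by machine.

```python
import functools
import time

def timed(fn):
    """Hypothetical minimal timing wrapper; stores durations in a list."""
    durations_ms = []

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # Runs whether fn returned or raised.
            durations_ms.append((time.perf_counter() - start) * 1000)

    wrapper.durations_ms = durations_ms
    return wrapper

def noop():
    return None

def per_call_seconds(fn, calls=100_000):
    """Average wall-clock time of one call to fn, in seconds."""
    start = time.perf_counter()
    for _ in range(calls):
        fn()
    return (time.perf_counter() - start) / calls

wrapped_noop = timed(noop)
baseline = per_call_seconds(noop)
wrapped = per_call_seconds(wrapped_noop)
overhead_us = (wrapped - baseline) * 1e6
print(f"approx. wrapper overhead: {overhead_us:.2f} us per call")

@timed
def failing():
    raise ValueError("boom")

try:
    failing()
except ValueError:
    pass
# One duration was recorded even though the call raised.
print(len(failing.durations_ms))
```

Subtracting the baseline loop cost isolates the wrapper itself; a real tracker call inside the `finally` block would add its own cost on top.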

Usage

Apply this principle when custom trainers need performance instrumentation for optimization work. The decorator pattern suits critical-path methods called once per step. The throttled context manager pattern suits helper methods called hundreds of times per step where logging every call would create noise and overhead.

Theoretical Basis

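# Abstract profiling patterns
import functools
import time
from collections import defaultdict
from contextlib import contextmanager
from dataclasses import dataclass

@dataclass
class ProfilingConfig:
    enabled: bool = True
    min_duration_ms: float = 0.0  # skip operations faster than this
    log_interval: int = 1         # log every Nth call

_call_counts = defaultdict(int)  # per-metric call counters used for throttling

def log_metric(name, value):
    """Placeholder (assumption): forward the value to the experiment tracker."""
    print(f"{name}: {value:.3f} ms")

def profile_decorator(method):
    """Always-on profiling for critical path methods."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return method(self, *args, **kwargs)
        finally:
            # Runs on success and on exception, so the duration is always logged.
            duration_ms = (time.perf_counter() - start) * 1000
            log_metric(f"profiling/{self.__class__.__name__}.{method.__name__}", duration_ms)
    return wrapper

@contextmanager
def profile_context(obj, name, config=None):
    """Fine-grained profiling with optional filtering and throttling."""
    if config is not None and not config.enabled:
        yield
        return
    key = f"profiling/{obj.__class__.__name__}.{name}"
    _call_counts[key] += 1
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        if config is None or (
            duration_ms >= config.min_duration_ms
            and _call_counts[key] % config.log_interval == 0
        ):
            log_metric(key, duration_ms)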

Related Pages

Implementation:Axolotl_ai_cloud_Axolotl_SwanLab_Custom_Trainer_Profiling
