Principle:Axolotl ai cloud Axolotl Training Performance Profiling

From Leeroopedia


Knowledge Sources
Domains Profiling, Training, Performance
Last Updated 2026-02-07 00:00 GMT

Overview

An instrumentation strategy for measuring the execution time of training pipeline components using decorators and context managers, with configurable throttling and filtering.

Description

Training Performance Profiling addresses the challenge of identifying bottlenecks in ML training loops without introducing significant overhead. The principle defines three instrumentation patterns of increasing granularity: (1) decorator-based profiling for methods that should always be timed (training_step, compute_loss), with roughly 2-5 microseconds of overhead per call; (2) context manager profiling for timing specific code blocks within a method (for example, the forward pass and the backward pass separately); and (3) advanced filtered profiling with configurable minimum-duration thresholds and call-frequency throttling for high-frequency operations. All patterns are exception-safe (the duration is logged even when the profiled code raises) and support centralized configuration. Profiling data is reported to the experiment tracker under a dedicated namespace (e.g., profiling/).
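
The per-call overhead quoted above depends on the interpreter and hardware, so it is worth measuring locally. The following is a minimal sketch, not the project's implementation: the `timed` decorator, the `durations_ms` list, and `overhead_us` are hypothetical names used only for illustration. It compares a bare no-op call against the same call wrapped in a try/finally timer.

```python
import time
import timeit

durations_ms = []  # hypothetical in-memory sink standing in for an experiment tracker

def timed(fn):
    """Record the wall-clock duration of every call, even when fn raises."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            durations_ms.append((time.perf_counter() - start) * 1000)
    return wrapper

def noop():
    return None

timed_noop = timed(noop)

def overhead_us(calls=100_000):
    """Estimate the per-call cost of the wrapper in microseconds."""
    bare = timeit.timeit(noop, number=calls)
    wrapped = timeit.timeit(timed_noop, number=calls)
    durations_ms.clear()  # discard samples collected during the benchmark
    return (wrapped - bare) / calls * 1e6

if __name__ == "__main__":
    print(f"approximate overhead per call: {overhead_us():.2f} us")
```

The measured value includes the cost of appending to a list; a real tracker call would likely cost more, which is one reason the throttled pattern exists for hot paths.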

Usage

Apply this principle when custom trainers need performance instrumentation for optimization work. The decorator pattern suits critical-path methods called once per step. The throttled context manager pattern suits helper methods called hundreds of times per step where logging every call would create noise and overhead.
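
The two cases described above can be sketched on a toy trainer. Everything here is hypothetical (`ToyTrainer`, `records`, `profiled`, `throttled`); the sketch only illustrates decorating a once-per-step method and recording every Nth call of a hot helper.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

records = []               # hypothetical sink: (metric_name, duration_ms)
_calls = defaultdict(int)  # per-metric call counters used for throttling

def profiled(method):
    """Decorator for critical-path methods: time every call."""
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return method(self, *args, **kwargs)
        finally:
            name = f"profiling/{type(self).__name__}.{method.__name__}"
            records.append((name, (time.perf_counter() - start) * 1000))
    return wrapper

@contextmanager
def throttled(obj, name, log_interval=100):
    """Context manager for hot helpers: record only every Nth call."""
    key = f"profiling/{type(obj).__name__}.{name}"
    _calls[key] += 1
    start = time.perf_counter()
    try:
        yield
    finally:
        if _calls[key] % log_interval == 0:
            records.append((key, (time.perf_counter() - start) * 1000))

class ToyTrainer:
    @profiled
    def training_step(self, batch):
        # Called once per step, so every call is timed.
        return sum(self._scale(x) for x in batch)

    def _scale(self, x):
        # Called once per element, so only every 100th call is recorded.
        with throttled(self, "_scale", log_interval=100):
            return x * 2

trainer = ToyTrainer()
loss = trainer.training_step(range(300))
```

With 300 helper calls and `log_interval=100`, three helper samples are recorded alongside a single sample for the step itself.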

Theoretical Basis

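# Abstract profiling patterns
import functools
import time
from collections import defaultdict
from contextlib import contextmanager
from dataclasses import dataclass

@dataclass
class ProfilingConfig:
    enabled: bool = True
    min_duration_ms: float = 0.0  # skip operations faster than this
    log_interval: int = 1         # log every Nth call

_call_counts = defaultdict(int)   # per-metric call counters used for throttling

def log_metric(key, value):
    """Placeholder sink; a real setup would forward to the experiment tracker."""
    print(f"{key}: {value:.3f} ms")

def profile_decorator(method):
    """Always-on profiling for critical path methods."""
    @functools.wraps(method)  # preserve the wrapped method's name and docstring
    def wrapper(self, *args, **kwargs):
        start = time.perf_counter()
        try:
            return method(self, *args, **kwargs)
        finally:
            # Runs even if the method raises, so the duration is always logged.
            duration_ms = (time.perf_counter() - start) * 1000
            log_metric(f"profiling/{self.__class__.__name__}.{method.__name__}", duration_ms)
    return wrapper

@contextmanager
def profile_context(obj, name, config=None):
    """Fine-grained profiling with optional throttling."""
    key = f"profiling/{obj.__class__.__name__}.{name}"
    _call_counts[key] += 1
    start = time.perf_counter()
    try:
        yield
    finally:
        # Exception-safe: the duration is computed even if the block raises.
        duration_ms = (time.perf_counter() - start) * 1000
        if config is None or (
            config.enabled and
            duration_ms >= config.min_duration_ms and
            _call_counts[key] % config.log_interval == 0
        ):
            log_metric(key, duration_ms)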
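
The pattern leaves the behavior of `log_metric` open. One possible shape, sketched under the assumption that the experiment tracker accepts a flat dict of scalars per logging step, is to buffer durations per key and flush aggregates periodically. The `MetricBuffer` class and its `report` callback are hypothetical and not part of any specific tracker API.

```python
from collections import defaultdict
from statistics import mean

class MetricBuffer:
    """Hypothetical sink: collects durations and reports aggregates per key."""

    def __init__(self, report, namespace="profiling/"):
        self._report = report        # callable taking a dict of scalars
        self._namespace = namespace
        self._samples = defaultdict(list)

    def log_metric(self, key, duration_ms):
        # Keep every metric under the dedicated namespace.
        if not key.startswith(self._namespace):
            key = self._namespace + key
        self._samples[key].append(duration_ms)

    def flush(self):
        """Report mean and max per key, then reset the buffer."""
        payload = {}
        for key, values in self._samples.items():
            payload[f"{key}/mean_ms"] = mean(values)
            payload[f"{key}/max_ms"] = max(values)
        self._samples.clear()
        if payload:
            self._report(payload)
        return payload

sent = []  # stands in for the tracker; each flush appends one dict
buffer = MetricBuffer(report=sent.append)
buffer.log_metric("profiling/Trainer.compute_loss", 4.0)
buffer.log_metric("profiling/Trainer.compute_loss", 6.0)
buffer.log_metric("Trainer.training_step", 20.0)
summary = buffer.flush()
```

Buffering keeps the tracker call off the hot path: the timed code only pays for a list append, and the aggregation cost is paid once per flush.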
