Environment:Openai Whisper Numba

From Leeroopedia
Knowledge Sources
Domains Infrastructure, Optimization
Last Updated 2025-06-25 00:00 GMT

Overview

Numba JIT compiler environment required for the CPU-based Dynamic Time Warping (DTW) backend used in Whisper's word-level timestamp alignment.

Description

Whisper uses Numba's `@numba.jit(nopython=True)` decorator to compile the CPU DTW algorithm to machine code at runtime. The `dtw_cpu()` and `backtrace()` functions in `whisper/timing.py` are both JIT-compiled this way (`dtw_cpu()` additionally passes `parallel=True`), giving near-C performance for the O(N*M) dynamic programming computation over an N x M cost matrix. Numba is a hard dependency listed in `pyproject.toml` and is always imported at the top of `timing.py`.
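To illustrate the kind of computation involved, here is a minimal sketch of an O(N*M) DTW cost recurrence compiled with `numba.jit(nopython=True)`. It is not Whisper's `dtw_cpu()`: the function name `dtw_total_cost` and the toy cost matrix are assumptions made for this example, and the `try`/`except` fallback exists only so the sketch also runs where Numba is absent (Whisper itself requires Numba).

```python
import numpy as np

try:
    import numba
    jit = numba.jit(nopython=True)
except ImportError:
    # Assumption for this sketch only: run as plain Python without Numba.
    def jit(func):
        return func


@jit
def dtw_total_cost(x):
    """Return the cost of the cheapest monotonic path through cost matrix x."""
    n, m = x.shape
    # Pad with one extra row and column of +inf so borders need no special case.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for j in range(1, m + 1):
        for i in range(1, n + 1):
            # A cell is reachable from its diagonal, upper, or left neighbour.
            best = min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
            cost[i, j] = x[i - 1, j - 1] + best
    return cost[n, m]


costs = np.array([[1.0, 2.0, 3.0],
                  [4.0, 1.0, 2.0],
                  [7.0, 8.0, 1.0]])
print(dtw_total_cost(costs))  # 3.0 (the diagonal path: 1 + 1 + 1)
```

The two nested loops visit every cell once, which is where the O(N*M) cost comes from; loops of this shape are slow in interpreted Python and are the typical target for `nopython` compilation.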

Usage

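Use this environment when word-level timestamps are requested (i.e., `word_timestamps=True` in `transcribe()`) and the DTW computation runs on the CPU. This happens either because the input tensor is not on a CUDA device (for example, no CUDA GPU is available), or because launching the Triton GPU kernel raises `RuntimeError` or `subprocess.CalledProcessError`, in which case Whisper emits a warning and falls back to `dtw_cpu()`.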
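As a rough sketch of how this path is typically reached, the following requests word-level timestamps with the model loaded on the CPU. The helper names `iter_words` and `transcribe_words_on_cpu` are hypothetical, and the audio path and model name are placeholders supplied by the caller.

```python
def iter_words(result):
    """Yield (word, start, end) tuples from a transcribe() result dict."""
    for segment in result["segments"]:
        # Assumes "words" is attached to a segment only when
        # word_timestamps=True was requested.
        for word in segment.get("words", []):
            yield word["word"], word["start"], word["end"]


def transcribe_words_on_cpu(audio_path, model_name="base"):
    """Transcribe audio_path on the CPU and return its word timings."""
    import whisper  # imported lazily so this module loads without Whisper

    # device="cpu" keeps tensors off CUDA, so DTW runs through the Numba path.
    model = whisper.load_model(model_name, device="cpu")
    result = model.transcribe(audio_path, word_timestamps=True)
    return list(iter_words(result))
```

Calling `transcribe_words_on_cpu("audio.mp3")` with a real audio file would return a list of `(word, start, end)` tuples, with start and end times in seconds.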

System Requirements

| Category | Requirement | Notes |
| --- | --- | --- |
| OS | Linux, macOS, or Windows | Numba supports all major platforms |
| Hardware | CPU | Numba JIT compiles to native CPU code |
| Python | >= 3.8 | Must match Numba's supported Python versions |

Dependencies

Python Packages

  • `numba` (any recent version)
  • `numpy` (required by Numba)

Credentials

No credentials required.

Quick Install

pip install numba numpy
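After installing, a quick check like the one below can confirm that both packages are visible to the interpreter. It is a sketch using only the standard library; the function name `installed_versions` is an assumption for this example.

```python
from importlib import metadata


def installed_versions(packages=("numba", "numpy")):
    """Map each package name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Looking up distribution metadata avoids importing Numba itself, so the check stays fast and does not trigger any compilation.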

Code Evidence

Numba JIT-compiled DTW from `whisper/timing.py:57-58,82-83`:

@numba.jit(nopython=True)
def backtrace(trace: np.ndarray):
    ...

@numba.jit(nopython=True, parallel=True)
def dtw_cpu(x: np.ndarray):
    ...

CPU fallback path from `whisper/timing.py:141-151`:

def dtw(x: torch.Tensor) -> np.ndarray:
    if x.is_cuda:
        try:
            return dtw_cuda(x)
        except (RuntimeError, subprocess.CalledProcessError):
            warnings.warn(
                "Failed to launch Triton kernels, likely due to missing CUDA toolkit; "
                "falling back to a slower DTW implementation..."
            )
    return dtw_cpu(x.double().cpu().numpy())

Common Errors

| Error Message | Cause | Solution |
| --- | --- | --- |
| `ModuleNotFoundError: No module named 'numba'` | Numba not installed | `pip install numba` |
| Slow first call to `dtw_cpu()` | Numba JIT compilation on first invocation | Normal behavior; subsequent calls are fast |
| `NumbaDeprecationWarning` | Numba API deprecation warnings | Update Numba to the latest version |

Compatibility Notes

  • Always imported: Unlike Triton, Numba is imported unconditionally at the top of `whisper/timing.py`. It is a hard dependency.
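  • First-call overhead: The first call to any `@jit`-decorated function incurs a compilation delay (typically 1-3 seconds, depending on the machine). Subsequent calls in the same process reuse the compiled code. Because the decorators shown above do not pass `cache=True`, the compilation is repeated in each new Python process.
  • Parallel execution: `dtw_cpu()` is decorated with `parallel=True`, which lets Numba automatically parallelize the patterns it recognizes (for example array expressions and `prange` loops).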
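The first-call overhead can be observed with a small timing sketch such as the one below. The kernel `row_minima_sum` and the helper `timed_call` are hypothetical stand-ins rather than Whisper code, and the `try`/`except` fallback is only there so the sketch runs without Numba (in which case both timings are plain interpreter time).

```python
import time

import numpy as np

try:
    import numba
    jit = numba.jit(nopython=True)
except ImportError:
    # Assumption for this sketch only: run as plain Python without Numba.
    def jit(func):
        return func


@jit
def row_minima_sum(x):
    """Sum the smallest value of each row using explicit loops."""
    total = 0.0
    for i in range(x.shape[0]):
        smallest = x[i, 0]
        for j in range(1, x.shape[1]):
            if x[i, j] < smallest:
                smallest = x[i, j]
        total += smallest
    return total


def timed_call(func, arg):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    value = func(arg)
    return value, time.perf_counter() - start


data = np.arange(12.0).reshape(3, 4)
first_value, first_seconds = timed_call(row_minima_sum, data)    # includes JIT compilation
second_value, second_seconds = timed_call(row_minima_sum, data)  # reuses the compiled code
print(f"first call: {first_seconds:.4f}s, second call: {second_seconds:.6f}s")
```

In a long-running service, one option is to make a similar warm-up call on a tiny input at startup so that the compilation delay is not paid during the first real request.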
