Environment:Openai Whisper Numba
| Knowledge Sources | |
|---|---|
| Domains | Infrastructure, Optimization |
| Last Updated | 2025-06-25 00:00 GMT |
Overview
Numba JIT compiler environment required for the CPU-based Dynamic Time Warping (DTW) backend used in Whisper's word-level timestamp alignment.
Description
Whisper uses Numba's `@jit(nopython=True)` decorator to compile the CPU DTW algorithm to optimized machine code at runtime. The `dtw_cpu()` and `backtrace()` functions in `whisper/timing.py` are JIT-compiled with Numba, enabling near-C performance for the O(N*M) dynamic programming computation over an N-by-M input matrix. Numba is a hard dependency listed in `pyproject.toml` and is always imported at the top of `timing.py`.
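To illustrate the kind of loop nest that benefits from `nopython` compilation, here is a minimal sketch of a DTW cost accumulation. It is not Whisper's `dtw_cpu()`: the function name `dtw_total_cost` is hypothetical, it returns only the accumulated cost rather than an alignment path, and the `try`/`except` fallback to plain Python is an assumption added so the sketch runs even where Numba is absent.

```python
import numpy as np

try:
    import numba

    jit = numba.jit(nopython=True)
except ImportError:
    # Assumption for this sketch only: run uncompiled if Numba is missing.
    def jit(func):
        return func


@jit
def dtw_total_cost(x):
    """Accumulate the minimal DTW cost over an N-by-M float64 matrix (hypothetical helper)."""
    n, m = x.shape
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Cheapest predecessor: diagonal, up, or left.
            best = min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
            cost[i, j] = x[i - 1, j - 1] + best
    return cost[n, m]


example = np.array([[1.0, 2.0], [3.0, 4.0]])
print(dtw_total_cost(example))
```

Every cell is visited once, which is where the O(N*M) cost comes from. In pure Python each of those visits pays interpreter overhead, which is what compilation removes.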
Usage
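Numba must be installed for every Whisper installation, because `whisper/timing.py` imports it unconditionally. The JIT-compiled code actually runs when word-level timestamps are requested (i.e., `word_timestamps=True` in `transcribe()`) and the DTW computation runs on the CPU. This happens either when the input tensor is not on a CUDA device, or when the Triton GPU kernel fails to launch and `dtw()` falls back to `dtw_cpu()`.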
System Requirements
| Category | Requirement | Notes |
|---|---|---|
| OS | Linux, macOS, or Windows | Numba supports all major platforms |
| Hardware | CPU | Numba JIT compiles to native CPU code |
| Python | >= 3.8 | Must match Numba's supported Python versions |
Dependencies
Python Packages
- `numba` (any recent version)
- `numpy` (required by Numba)
Credentials
No credentials required.
Quick Install
pip install numba numpy
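After installing, a quick check that both packages can be located may save a confusing failure later. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be located (hypothetical helper)."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["numba", "numpy"])
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("numba and numpy are importable")
```

`importlib.util.find_spec` only locates a package without importing it, so this check does not trigger any Numba compilation.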
Code Evidence
Numba JIT-compiled DTW from `whisper/timing.py:57-58,82-83`:
@numba.jit(nopython=True)
def backtrace(trace: np.ndarray):
    ...

@numba.jit(nopython=True, parallel=True)
def dtw_cpu(x: np.ndarray):
    ...
CPU fallback path from `whisper/timing.py:141-151`:
def dtw(x: torch.Tensor) -> np.ndarray:
    if x.is_cuda:
        try:
            return dtw_cuda(x)
        except (RuntimeError, subprocess.CalledProcessError):
            warnings.warn(
                "Failed to launch Triton kernels, likely due to missing CUDA toolkit; "
                "falling back to a slower DTW implementation..."
            )

    return dtw_cpu(x.double().cpu().numpy())
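The try-then-fall-back pattern in `dtw()` can be exercised without a GPU. The sketch below uses stand-in functions instead of `dtw_cuda` and `dtw_cpu`; the names `run_with_fallback`, `broken_fast`, and `slow_sum` are hypothetical and only mimic the control flow.

```python
import subprocess
import warnings


def run_with_fallback(fast, slow, x):
    """Try the fast path; on a launch-style failure, warn and use the slow path."""
    try:
        return fast(x)
    except (RuntimeError, subprocess.CalledProcessError):
        warnings.warn("fast path failed; falling back to a slower implementation")
    return slow(x)


def broken_fast(x):
    # Stand-in for a GPU kernel that fails to launch.
    raise RuntimeError("simulated kernel launch failure")


def slow_sum(x):
    # Stand-in for the CPU implementation.
    return sum(x)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = run_with_fallback(broken_fast, slow_sum, [1, 2, 3])

print(result, len(caught))
```

Because the failure is reported with `warnings.warn` rather than raised, the caller still receives a result. The only visible signs of the fallback are the warning and the slower run time.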
Common Errors
| Error Message | Cause | Solution |
|---|---|---|
| `ModuleNotFoundError: No module named 'numba'` (a subclass of `ImportError`) | Numba not installed | `pip install numba` |
| Slow first call to `dtw_cpu()` | Numba JIT compilation on first invocation | Normal behavior; subsequent calls are fast |
| `NumbaDeprecationWarning` | The installed Numba version is flagging an API or default that it plans to change | Upgrade Numba: `pip install --upgrade numba` |
Compatibility Notes
- Always imported: Unlike Triton, Numba is imported unconditionally at the top of `whisper/timing.py`. It is a hard dependency.
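- First-call overhead: The first call to any `@jit`-decorated function incurs a compilation delay (typically 1-3 seconds). Subsequent calls with the same argument types in the same process reuse the compiled code. The decorators shown above do not set `cache=True`, so the compiled code is not persisted to disk and each new Python process compiles again.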
- Parallel execution: `dtw_cpu()` is compiled with `parallel=True`, which lets Numba attempt to parallelize supported array operations and explicit `prange` loops automatically.
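The first-call overhead can be observed by timing two consecutive calls of a small JIT-compiled function. This is a minimal sketch: `running_sum` and `timed_call` are hypothetical helpers, and the fallback to plain Python when Numba is missing is an assumption made so the snippet runs anywhere (in that case both timings will be similar).

```python
import time

import numpy as np

try:
    import numba

    jit = numba.jit(nopython=True)
except ImportError:
    # Assumption for this sketch only: run uncompiled if Numba is missing.
    def jit(func):
        return func


@jit
def running_sum(values):
    """Sum a 1-D float array with an explicit loop (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total


def timed_call(func, *args):
    """Return the function result and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


data = np.arange(1000, dtype=np.float64)
first_result, first_seconds = timed_call(running_sum, data)    # includes compilation
second_result, second_seconds = timed_call(running_sum, data)  # reuses compiled code
print(f"first call: {first_seconds:.4f}s, second call: {second_seconds:.6f}s")
```

With Numba installed, the first timing is expected to be dominated by compilation rather than by the loop itself. The exact figures depend on the machine and the Numba version.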