Implementation:Openai Whisper Load Model

From Leeroopedia

Overview

whisper.load_model() loads a pre-trained Whisper model from a checkpoint file, handling automatic downloading, integrity verification, architecture reconstruction, and device placement. This is the primary entry point for obtaining a usable Whisper model instance.

Source

Signature

def load_model(
    name: str,
    device: Optional[Union[str, torch.device]] = None,
    download_root: str = None,
    in_memory: bool = False,
) -> Whisper:

Import

import whisper
# or
from whisper import load_model

Parameters

Parameter | Type | Default | Description
name | str | (required) | Official model name (tiny, tiny.en, base, base.en, small, small.en, medium, medium.en, large-v1, large-v2, large-v3, large, large-v3-turbo, turbo) or path to a local checkpoint file
device | Optional[Union[str, torch.device]] | None | PyTorch device for model placement. If None, uses "cuda" when CUDA is available and "cpu" otherwise
download_root | str | None | Directory in which model checkpoints are downloaded and cached. If None, defaults to ~/.cache/whisper (or $XDG_CACHE_HOME/whisper when XDG_CACHE_HOME is set)
in_memory | bool | False | If True, reads the entire checkpoint file into host memory as bytes before deserialization
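
The way the download_root default is resolved can be sketched with the standard library alone. This is a minimal illustration, not the library's code: resolve_download_root is a hypothetical helper name, and the sketch assumes the ~/.cache prefix is replaced by XDG_CACHE_HOME when that variable is set.

```python
import os
from typing import Optional


def resolve_download_root(download_root: Optional[str] = None) -> str:
    """Return the directory where checkpoints would be cached.

    Hypothetical helper (not part of the whisper API). Assumes the
    documented default of ~/.cache/whisper, with XDG_CACHE_HOME taking
    the place of ~/.cache when it is set.
    """
    if download_root is not None:
        # An explicit directory always wins over the default.
        return download_root
    default_cache = os.path.join(os.path.expanduser("~"), ".cache")
    return os.path.join(os.getenv("XDG_CACHE_HOME", default_cache), "whisper")
```

Passing download_root explicitly, as in the "/data/models" example on this page, bypasses the environment lookup entirely.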

Inputs and Outputs

Inputs

  • A model name string identifying one of the official Whisper model variants, or a path (absolute or relative) to an existing custom checkpoint file
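
The distinction between an official name and a checkpoint path can be sketched as a small classifier. This is an illustrative sketch only: classify_model_name and OFFICIAL_NAMES are hypothetical names, and the set below simply copies the model names listed in the Parameters section rather than reading the library's internal _MODELS dictionary.

```python
import os

# Assumption: names copied from the Parameters section for illustration;
# the real lookup uses the library's internal _MODELS dictionary.
OFFICIAL_NAMES = {
    "tiny", "tiny.en", "base", "base.en", "small", "small.en",
    "medium", "medium.en", "large-v1", "large-v2", "large-v3", "turbo",
}


def classify_model_name(name: str) -> str:
    """Classify `name` as an official model or a local checkpoint file.

    Hypothetical helper. Official names are checked first, so a local
    file that happens to be called "base" would not shadow the official
    model in this sketch.
    """
    if name in OFFICIAL_NAMES:
        return "official"
    if os.path.isfile(name):
        return "local_file"
    # Neither a known name nor an existing file: nothing can be loaded.
    raise RuntimeError(f"Model {name} not found")
```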

Outputs

  • A Whisper model instance (whisper.model.Whisper) fully loaded on the specified device with alignment heads configured for official models

Behavior

  1. If name matches an official model name, resolves the download URL from the internal _MODELS dictionary
  2. Downloads the checkpoint file to download_root if not already cached, verifying the SHA256 hash against the expected value
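  3. If name is instead the path of an existing file, uses that checkpoint directly without downloading; if name is neither an official model name nor an existing file, raises a RuntimeError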
  4. Opens the checkpoint file (optionally loading into memory if in_memory is True)
  5. Deserializes using torch.load() with appropriate device mapping
  6. Extracts the dims dictionary from the checkpoint and constructs a ModelDimensions dataclass instance from it
  7. Creates a Whisper model instance from those dimensions
  8. Loads the model_state_dict weights into the model
  9. If the model is an official variant, calls set_alignment_heads() with pre-computed alignment head data from _ALIGNMENT_HEADS
  10. Moves the model to the target device and returns it
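
Steps 4 and 6 above can be sketched without PyTorch. The code below is a simplified illustration under stated assumptions: open_checkpoint and build_dims are hypothetical helpers, and Dims is a two-field stand-in for ModelDimensions (the real class carries more fields). The checkpoint is modeled as a plain dictionary with a "dims" entry, as described in step 6.

```python
import io
from dataclasses import dataclass
from typing import BinaryIO


@dataclass
class Dims:
    """Hypothetical, reduced stand-in for ModelDimensions."""
    n_mels: int
    n_vocab: int


def open_checkpoint(path: str, in_memory: bool = False) -> BinaryIO:
    """Return a binary file object for the checkpoint (step 4).

    With in_memory=True the whole file is read into host memory first and
    served from a BytesIO buffer; otherwise the file is read from disk.
    """
    if in_memory:
        with open(path, "rb") as f:
            return io.BytesIO(f.read())
    return open(path, "rb")


def build_dims(checkpoint: dict) -> Dims:
    """Turn the checkpoint's "dims" dictionary into a dataclass (step 6)."""
    return Dims(**checkpoint["dims"])
```

Either file object can be handed to a deserializer in the same way, which is why in_memory changes only where the bytes come from, not the later steps.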

Example

import whisper

# Load the base model (auto-detects device)
model = whisper.load_model("base")

# Load a large model explicitly on GPU
model = whisper.load_model("large-v3", device="cuda")

# Load a custom checkpoint from a local file
model = whisper.load_model("/path/to/custom_model.pt")

# Load with custom cache directory and in-memory mode
model = whisper.load_model("small", download_root="/data/models", in_memory=True)

Notes

  • The first call for a given model name triggers a download; subsequent calls use the cached checkpoint
  • load_model() takes no precision argument; the checkpoint weights are copied into the model's parameters whether the checkpoint stores them as FP16 or FP32
  • English-only models (e.g., tiny.en) have a smaller vocabulary and cannot perform multilingual tasks
  • The turbo model (an alias for large-v3-turbo) is an optimized version of large-v3 that offers faster inference at a small cost in accuracy
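
The cache-reuse behavior in the first note can be sketched as a digest check. This is a minimal sketch, assuming a cached file is reused only when its SHA256 digest matches the expected value; is_cached is a hypothetical helper name, and the expected digest would come from the library's own model table.

```python
import hashlib
import os


def is_cached(path: str, expected_sha256: str) -> bool:
    """Return True when `path` exists and its SHA256 digest matches.

    Hypothetical helper: a False result stands for "download (again)",
    either because the file is missing or because its contents do not
    match the expected checksum.
    """
    if not os.path.isfile(path):
        return False
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return digest == expected_sha256
```

Reading the whole file to hash it is simple but memory-hungry for multi-gigabyte checkpoints; hashing in fixed-size chunks would be a reasonable alternative.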

Metadata

Principle:Openai_Whisper_Model_Loading Environment:Openai_Whisper_PyTorch_CUDA 2025-06-25 00:00 GMT
