Implementation:Openai Whisper Load Model

Overview

whisper.load_model() loads a pre-trained Whisper model from a checkpoint file, handling automatic downloading, integrity verification, architecture reconstruction, and device placement. This is the primary entry point for obtaining a usable Whisper model instance.

Source

File: whisper/__init__.py:L103-161
Repository: https://github.com/openai/whisper

Signature

def load_model(
    name: str,
    device: Optional[Union[str, torch.device]] = None,
    download_root: str = None,
    in_memory: bool = False,
) -> Whisper:

Import

import whisper
# or
from whisper import load_model

Parameters

Parameter	Type	Default	Description
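name	str	(required)	Official model name (tiny, tiny.en, base, base.en, small, small.en, medium, medium.en, large-v1, large-v2, large-v3, large, large-v3-turbo, turbo; whisper.available_models() returns the list for the installed version) or path to a local checkpoint file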
device	Optional[Union[str, torch.device]]	None	PyTorch device for model placement. Auto-detects CUDA if None; falls back to CPU if CUDA is unavailable
download_root	str	None	Directory in which model checkpoints are downloaded and cached. If None, defaults to ~/.cache/whisper, or to $XDG_CACHE_HOME/whisper when XDG_CACHE_HOME is set
in_memory	bool	False	If True, preloads the entire checkpoint file into host memory before deserialization
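
The default cache location described for download_root can be sketched with the standard library alone. The helper name default_download_root is hypothetical, and the XDG_CACHE_HOME override reflects the upstream source as read at the time of writing, so treat it as an assumption and check the installed version before relying on it.

```python
import os
from typing import Optional


def default_download_root(download_root: Optional[str] = None) -> str:
    """Return the directory where checkpoints would be cached (hypothetical helper)."""
    if download_root is not None:
        # An explicit download_root always wins.
        return download_root
    fallback = os.path.join(os.path.expanduser("~"), ".cache")
    # Assumption: XDG_CACHE_HOME, when set, replaces ~/.cache as the base directory.
    return os.path.join(os.getenv("XDG_CACHE_HOME", fallback), "whisper")


print(default_download_root())
print(default_download_root("/data/models"))
```

Passing download_root explicitly is the simplest way to keep checkpoints on a specific disk, for example a shared volume in a container.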

Inputs and Outputs

Inputs

A model name string identifying one of the official Whisper model variants, or a path (absolute or relative) to an existing custom checkpoint file

Outputs

A Whisper model instance (whisper.model.Whisper) fully loaded on the specified device with alignment heads configured for official models

Behavior

If name matches an official model name, resolves the download URL from the internal _MODELS dictionary
Downloads the checkpoint file to download_root if not already cached, verifying the SHA256 hash against the expected value
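If name is instead a path to an existing file, uses that checkpoint directly without downloading; if it is neither an official name nor an existing file, raises RuntimeError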
Opens the checkpoint file (optionally loading into memory if in_memory is True)
Deserializes using torch.load() with appropriate device mapping
Extracts the dims dictionary from the checkpoint and constructs a ModelDimensions dataclass instance from it
Creates a Whisper model instance from those dimensions
Loads the model_state_dict weights into the model
If the model is an official variant, calls set_alignment_heads() with pre-computed alignment head data from _ALIGNMENT_HEADS
Moves the model to the target device and returns it
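
The name-resolution and integrity-check steps above can be sketched without downloading anything. This is a simplified illustration, not the library's code: resolve_checkpoint, sha256_matches, and the MODELS_SKETCH table with its placeholder URL are hypothetical names, and the idea that the expected SHA256 digest is the second-to-last path segment of the download URL is an assumption that should be checked against the installed version.

```python
import hashlib
import os
import tempfile
from typing import Tuple

# Fake checkpoint bytes and their digest, used only for this illustration.
payload = b"fake checkpoint bytes"
digest = hashlib.sha256(payload).hexdigest()

# Hypothetical stand-in for the internal name-to-URL table (placeholder URL).
MODELS_SKETCH = {"tiny": f"https://example.invalid/models/{digest}/tiny.pt"}


def sha256_matches(data: bytes, url: str) -> bool:
    """Compare the SHA256 of `data` with the digest embedded in `url` (assumed layout)."""
    expected = url.split("/")[-2]
    return hashlib.sha256(data).hexdigest() == expected


def resolve_checkpoint(name: str) -> Tuple[str, str]:
    """Classify `name` as an official model or a local file, else raise."""
    if name in MODELS_SKETCH:
        return ("official", MODELS_SKETCH[name])
    if os.path.isfile(name):
        return ("local", name)
    raise RuntimeError(
        f"Model {name} not found; available models = {sorted(MODELS_SKETCH)}"
    )


kind, url = resolve_checkpoint("tiny")
print(kind, sha256_matches(payload, url))    # intact download
print(sha256_matches(payload + b"x", url))   # corrupted download

# An existing file on disk is treated as a local checkpoint.
with tempfile.NamedTemporaryFile(suffix=".pt") as fp:
    print(resolve_checkpoint(fp.name)[0])
```

Checking the official names before the filesystem means a stray local file called tiny would not shadow the official tiny model in this sketch.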

Example

import whisper

# Load the base model (auto-detects device)
model = whisper.load_model("base")

# Load a large model explicitly on GPU
model = whisper.load_model("large-v3", device="cuda")

# Load a custom checkpoint from a local file
model = whisper.load_model("/path/to/custom_model.pt")

# Load with custom cache directory and in-memory mode
model = whisper.load_model("small", download_root="/data/models", in_memory=True)

Notes

The first call for a given model name triggers a download; subsequent calls use the cached checkpoint
Weights are copied into the model with load_state_dict(), so checkpoints stored in FP16 or FP32 both load without extra configuration
English-only models (e.g., tiny.en) have a smaller vocabulary and cannot perform multilingual tasks
The turbo model (an alias for large-v3-turbo) is an optimized version of large-v3 with far fewer decoder layers, trading a small amount of accuracy for faster inference
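
The difference between English-only and multilingual checkpoints shows up in the dims dictionary stored in each checkpoint. The sketch below uses a stand-in dataclass (DimsSketch, hypothetical) whose field names are assumed from the upstream ModelDimensions class; the dimension values are illustrative, and the vocabulary-size threshold of 51865 is an assumption drawn from the upstream is_multilingual property.

```python
from dataclasses import dataclass


@dataclass
class DimsSketch:
    """Hypothetical stand-in for whisper.model.ModelDimensions."""
    n_mels: int
    n_audio_ctx: int
    n_audio_state: int
    n_audio_head: int
    n_audio_layer: int
    n_vocab: int
    n_text_ctx: int
    n_text_state: int
    n_text_head: int
    n_text_layer: int

    @property
    def is_multilingual(self) -> bool:
        # Assumption: multilingual vocabularies have at least 51865 entries.
        return self.n_vocab >= 51865


# Illustrative checkpoint layout: a "dims" dict next to the weights.
checkpoint = {
    "dims": {
        "n_mels": 80, "n_audio_ctx": 1500, "n_audio_state": 384,
        "n_audio_head": 6, "n_audio_layer": 4, "n_vocab": 51865,
        "n_text_ctx": 448, "n_text_state": 384, "n_text_head": 6,
        "n_text_layer": 4,
    },
    "model_state_dict": {},  # weights omitted in this sketch
}

# Keyword expansion turns the stored dict into a typed dimensions object.
dims = DimsSketch(**checkpoint["dims"])
english_dims = DimsSketch(**{**checkpoint["dims"], "n_vocab": 51864})

print(dims.is_multilingual, english_dims.is_multilingual)
```

On a real model the same information is available from the loaded instance (model.dims and model.is_multilingual), so no checkpoint parsing is needed in practice.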

Metadata

Principle:Openai_Whisper_Model_Loading Environment:Openai_Whisper_PyTorch_CUDA 2025-06-25 00:00 GMT
