Heuristic:Openai Whisper Log Probability Threshold

From Leeroopedia
Knowledge Sources
Domains: Decoding, Quality_Control
Last Updated: 2025-06-25 00:00 GMT

Overview

Average log probability threshold of -1.0 used to detect low-confidence decoder output, triggering temperature fallback for re-decoding.

Description

Whisper computes the average log probability across all sampled tokens in a decoded segment. A value below -1.0 indicates that the model has low confidence in its output: the generated tokens are unlikely under the model's own distribution. This triggers the temperature fallback mechanism, which retries decoding at a higher sampling temperature.
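
As a minimal sketch of the quantity being thresholded, the snippet below averages per-token log probabilities and compares the result with -1.0. The function name `average_logprob` and the sample token probabilities are illustrative assumptions; Whisper's decoder computes `avg_logprob` internally, and its exact normalization may differ slightly from a plain mean.

```python
import math

def average_logprob(token_logprobs):
    """Plain mean of per-token natural-log probabilities (illustrative helper)."""
    if not token_logprobs:
        raise ValueError("need at least one token log probability")
    return sum(token_logprobs) / len(token_logprobs)

# Hypothetical per-token probabilities for two decoded segments.
confident = [math.log(p) for p in (0.9, 0.8, 0.95, 0.85)]
uncertain = [math.log(p) for p in (0.3, 0.2, 0.5, 0.25)]

LOGPROB_THRESHOLD = -1.0  # default value described on this page

for name, logprobs in (("confident", confident), ("uncertain", uncertain)):
    avg = average_logprob(logprobs)
    print(f"{name}: avg_logprob={avg:.3f}, below threshold: {avg < LOGPROB_THRESHOLD}")
```

With these made-up values, only the second segment falls below the threshold and would be a candidate for re-decoding.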

Usage

Use this heuristic to flag low-confidence transcriptions for re-decoding. The threshold is configurable via the `logprob_threshold` parameter in `transcribe()`. Set it to `None` to disable the check. More negative values are more permissive; less negative values are stricter.
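
A hedged usage sketch follows. The helper names `build_transcribe_kwargs` and `transcribe_file`, the `"base"` model size, and the audio path are assumptions chosen for illustration; only `logprob_threshold` comes from this page. Running the commented example requires the `openai-whisper` package, downloaded model weights, and a real audio file.

```python
from typing import Optional

def build_transcribe_kwargs(logprob_threshold: Optional[float] = -1.0) -> dict:
    """Keyword arguments for transcribe(); None disables the log-probability check."""
    if logprob_threshold is not None and logprob_threshold > 0:
        # Log probabilities are never positive, so a positive threshold
        # would flag every decode as low confidence.
        raise ValueError("logprob_threshold should be <= 0 or None")
    return {"logprob_threshold": logprob_threshold}

def transcribe_file(path: str, logprob_threshold: Optional[float] = -1.0) -> dict:
    """Hypothetical wrapper around Whisper's transcribe()."""
    import whisper  # imported lazily so the helper above works without the package

    model = whisper.load_model("base")  # model size is an arbitrary choice here
    return model.transcribe(path, **build_transcribe_kwargs(logprob_threshold))

# Example (not executed here):
# result = transcribe_file("audio.mp3", logprob_threshold=-0.8)  # stricter than default
# print(result["text"])
```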

The Insight (Rule of Thumb)

  • Action: Set `logprob_threshold=-1.0` (default) in `transcribe()`.
  • Value: -1.0. An average log probability below this is considered a failed decode.
  • Trade-off: A threshold that is too high (e.g., -0.5) flags many valid transcriptions as failures; one that is too low (e.g., -2.0) lets garbage through. The threshold also interacts with the no-speech detector.
  • Interaction: When `no_speech_prob > no_speech_threshold` AND `avg_logprob < logprob_threshold`, the segment is classified as silence rather than a failed decode, and no fallback is triggered.
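
The trade-off and the silence interaction in the list above can be sketched as a small pure function. This is an illustration, not Whisper's code: the name `needs_fallback`, the sample scores, and the `no_speech_threshold` default of 0.6 are assumptions, and the sketch ignores any other fallback criteria.

```python
from typing import Optional

def needs_fallback(
    avg_logprob: float,
    no_speech_prob: float,
    logprob_threshold: Optional[float] = -1.0,
    no_speech_threshold: Optional[float] = 0.6,  # assumed default, not stated on this page
) -> bool:
    """Return True when a decode should be retried at a higher temperature."""
    low_confidence = logprob_threshold is not None and avg_logprob < logprob_threshold
    likely_silence = no_speech_threshold is not None and no_speech_prob > no_speech_threshold
    if low_confidence and likely_silence:
        return False  # treated as silence, so no retry
    return low_confidence

# Hypothetical (avg_logprob, no_speech_prob) pairs for five segments.
segments = [(-0.3, 0.02), (-0.7, 0.05), (-1.2, 0.10), (-1.6, 0.90), (-2.4, 0.10)]

for threshold in (-0.5, -1.0, -2.0):
    retried = sum(needs_fallback(lp, ns, logprob_threshold=threshold) for lp, ns in segments)
    print(f"threshold {threshold}: {retried} of {len(segments)} segments retried")
```

Sweeping the threshold over the same sample segments shows the stricter setting retrying more of them, while the segment with a high no-speech probability is never retried.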

Reasoning

Average log probability reflects the model's confidence in its output. For clean speech that the model handles well, average log probabilities are typically above -0.5. For difficult audio or hallucinated output, the probabilities drop significantly. The -1.0 threshold catches clear model failures while tolerating moderate difficulty.
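
One way to read these numbers is to convert an average log probability back into a per-token probability: exponentiating the mean of the log probabilities gives the geometric mean of the token probabilities. The sketch below does this for the thresholds discussed on this page; the function name is an illustrative assumption, and natural logarithms are assumed.

```python
import math

def geometric_mean_probability(avg_logprob: float) -> float:
    """Geometric-mean token probability implied by an average natural-log probability."""
    return math.exp(avg_logprob)

for threshold in (-0.5, -1.0, -2.0):
    prob = geometric_mean_probability(threshold)
    print(f"avg_logprob {threshold:+.1f} -> typical token probability {prob:.3f}")
```

Under that reading, -0.5 corresponds to a typical token probability of about 0.61, -1.0 to about 0.37, and -2.0 to about 0.14.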

Code evidence from `whisper/transcribe.py:45,209-213`:

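# Parameter default in the transcribe() signature (line 45)
logprob_threshold: Optional[float] = -1.0,
# ...
# Fallback check after each decode attempt (lines 209-213)
if (
    logprob_threshold is not None
    and decode_result.avg_logprob < logprob_threshold
):
    needs_fallback = True  # average log probability is too low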

Silence detection interaction from `whisper/transcribe.py:214-220`:

if (
    no_speech_threshold is not None
    and decode_result.no_speech_prob > no_speech_threshold
    and logprob_threshold is not None
    and decode_result.avg_logprob < logprob_threshold
):
    needs_fallback = False  # silence
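
To show how the two checks above might sit inside a retry loop, here is a self-contained simulation. It is a sketch under stated assumptions, not the library's implementation: `FakeResult`, `fake_decode`, the temperature schedule, and the `no_speech_threshold` default of 0.6 are all hypothetical stand-ins, and only the two conditions quoted above are modeled.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

@dataclass
class FakeResult:
    """Stand-in for a decode result; carries only the fields used here."""
    avg_logprob: float
    no_speech_prob: float
    temperature: float

def decode_with_fallback(
    decode: Callable[[float], FakeResult],
    temperatures: Sequence[float] = (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),  # assumed schedule
    logprob_threshold: Optional[float] = -1.0,
    no_speech_threshold: Optional[float] = 0.6,  # assumed default
) -> Optional[FakeResult]:
    result = None
    for t in temperatures:
        result = decode(t)
        needs_fallback = False
        if logprob_threshold is not None and result.avg_logprob < logprob_threshold:
            needs_fallback = True  # average log probability is too low
        if (
            no_speech_threshold is not None
            and result.no_speech_prob > no_speech_threshold
            and logprob_threshold is not None
            and result.avg_logprob < logprob_threshold
        ):
            needs_fallback = False  # silence
        if not needs_fallback:
            break
    # If every temperature fails, the last attempt is returned as-is.
    return result

def fake_decode(t: float) -> FakeResult:
    # Hypothetical decoder whose confidence recovers once t reaches 0.4.
    return FakeResult(avg_logprob=-1.4 if t < 0.4 else -0.6, no_speech_prob=0.05, temperature=t)

chosen = decode_with_fallback(fake_decode)
print(f"accepted at temperature {chosen.temperature} with avg_logprob {chosen.avg_logprob}")
```

In this simulation the first two attempts fall below the threshold and the third is accepted. A decoder that reports a high no-speech probability would be accepted on the first attempt even with a low average log probability.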
