Implementation:Speechbrain Speechbrain Compute WER Tool

Knowledge Sources
Domains: ASR, Evaluation
Last Updated: 2026-02-09 00:00 GMT

Overview

A concrete command-line tool from the SpeechBrain library that computes Word Error Rate (WER) and Levenshtein alignments between reference and hypothesis transcriptions.

Description

This command-line script computes Word Error Rate and related metrics from a reference text file and a hypothesis text file. Its behavior closely matches that of Kaldi's compute-wer binary. The script reads Kaldi-style text files, in which each line starts with an utterance ID followed by space-separated tokens. It supports three modes for handling missing hypotheses: "strict" (raises an error), "all" (treats a missing hypothesis as empty), and "present" (scores only the hypotheses that were found). It can also print human-readable edit-distance alignments between references and hypotheses, list the utterances with the highest WER, and, when given a utt2spk mapping, list the speakers with the highest WER. The implementation delegates to speechbrain.utils.edit_distance for the Levenshtein computation and to speechbrain.dataio.wer for formatted output.
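The three missing-hypothesis modes can be illustrated with a small, self-contained sketch. This is not the SpeechBrain implementation: the function names `edit_distance` and `corpus_wer` and the sample data are hypothetical, and the only assumption carried over from the description is how each mode treats a reference with no matching hypothesis.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(refs, hyps, mode="strict"):
    """WER in percent over dicts mapping utterance ID to token list."""
    if mode not in ("strict", "all", "present"):
        raise ValueError(f"unknown mode: {mode}")
    errors = 0
    ref_words = 0
    for utt_id, ref_tokens in refs.items():
        if utt_id in hyps:
            hyp_tokens = hyps[utt_id]
        elif mode == "strict":
            raise KeyError(f"missing hypothesis for {utt_id}")
        elif mode == "present":
            continue  # leave this utterance out of the score
        else:
            hyp_tokens = []  # mode == "all": score against an empty hypothesis
        errors += edit_distance(ref_tokens, hyp_tokens)
        ref_words += len(ref_tokens)
    return 100.0 * errors / ref_words if ref_words else 0.0


# Hypothetical data: utt2 has no hypothesis.
refs = {"utt1": ["the", "cat", "sat"], "utt2": ["hello", "world"]}
hyps = {"utt1": ["the", "cat", "sit"]}

print(f"present: {corpus_wer(refs, hyps, 'present'):.2f}")
print(f"all:     {corpus_wer(refs, hyps, 'all'):.2f}")
```

In this sketch, "present" scores one substitution over three reference words, while "all" additionally counts both words of utt2 as deletions, so the same data gives a higher WER under "all".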

Usage

Run the script as a standalone CLI tool after ASR decoding to evaluate transcription quality. It accepts the same file format as Kaldi's text files.

Code Reference

Source Location

Signature

def _plain_text_reader(path):
    """Yields (key, token_list) from a Kaldi-style text file."""
    ...

def _plain_text_keydict(path):
    """Returns dict mapping utterance ID to token list."""
    ...

def _utt2spk_keydict(path):
    """Returns dict mapping utterance ID to speaker ID."""
    ...

class SmartFormatter(argparse.HelpFormatter):
    """Custom formatter for argparse help text."""
    ...

# Main: parses args, computes wer_details_by_utterance, wer_summary,
# optionally prints top WER utterances, speaker-level WER, and alignments
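As a rough illustration of what the reader helpers in the signature do, the sketch below parses Kaldi-style lines into the dictionaries described in their docstrings. The names `read_kaldi_text`, `to_keydict`, and `read_utt2spk` are hypothetical stand-ins, the functions take an iterable of lines rather than a path so the example runs without files, and skipping blank lines is an assumption rather than documented behavior.

```python
import io


def read_kaldi_text(lines):
    """Yield (utterance_id, token_list) pairs from Kaldi-style lines."""
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # assumption: blank lines are ignored
        yield parts[0], parts[1:]


def to_keydict(lines):
    """Return a dict mapping utterance ID to its token list."""
    return dict(read_kaldi_text(lines))


def read_utt2spk(lines):
    """Return a dict mapping utterance ID to speaker ID."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        utt_id, spk_id = parts  # exactly two columns expected
        mapping[utt_id] = spk_id
    return mapping


# Hypothetical file contents, held in memory for the example.
text = io.StringIO("utt1 the cat sat\nutt2\n")
utt2spk = io.StringIO("utt1 spkA\nutt2 spkB\n")

tokens_by_utt = to_keydict(text)
speaker_by_utt = read_utt2spk(utt2spk)
print(tokens_by_utt)
print(speaker_by_utt)
```

An utterance ID with nothing after it, such as utt2 here, maps to an empty token list, which is how an empty transcription would be represented in this sketch.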

Import

python compute_wer.py ref.txt hyp.txt [options]

I/O Contract

Inputs

Name | Type | Required | Description
ref | str | Yes | Path to reference text file (utterance-ID on first column)
hyp | str | Yes | Path to hypothesis text file (utterance-ID on first column)
--mode | str | No | How to treat missing hypotheses: "present", "all", or "strict" (default: "strict")
--print-top-wer | flag | No | Print a list of utterances with the highest WER
--print-alignments | flag | No | Print Levenshtein alignments between all refs and hyps
--align-separator | str | No | Token separator in alignment output (default: " ; ")
--align-empty | str | No | Symbol for empty alignment slots (default: "<eps>")
--utt2spk | str | No | Path to utterance-to-speaker mapping file
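The inputs listed above map naturally onto an argparse parser. The sketch below is reconstructed only from that table, so the help strings and the `build_parser` name are assumptions and the actual script may declare its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented inputs (illustrative only)."""
    parser = argparse.ArgumentParser(description="Compute WER (sketch).")
    parser.add_argument("ref", help="Path to reference text file")
    parser.add_argument("hyp", help="Path to hypothesis text file")
    parser.add_argument(
        "--mode",
        choices=["present", "all", "strict"],
        default="strict",
        help="How to treat missing hypotheses",
    )
    parser.add_argument("--print-top-wer", action="store_true")
    parser.add_argument("--print-alignments", action="store_true")
    parser.add_argument("--align-separator", default=" ; ")
    parser.add_argument("--align-empty", default="<eps>")
    parser.add_argument("--utt2spk", help="Path to utterance-to-speaker file")
    return parser


# Parse a sample command line instead of sys.argv.
args = build_parser().parse_args(["ref.txt", "hyp.txt", "--mode", "present"])
print(args.mode, repr(args.align_separator), args.align_empty)
```

Restricting `--mode` with `choices` means an unsupported value is rejected at parse time rather than partway through scoring.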

Outputs

Name | Type | Description
WER summary | stdout | Overall WER, number of insertions, deletions, substitutions, and total words
Top WER utterances | stdout | Utterances with highest error rates (with --print-top-wer)
Top WER speakers | stdout | Speakers with highest error rates (with --utt2spk)
Alignments | stdout | Human-readable edit distance alignments (with --print-alignments)
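To make the alignment output more concrete, here is a minimal sketch of a Levenshtein backtrace that pads unmatched positions with the empty symbol and joins tokens with the separator. The functions `align` and `format_alignment` are hypothetical, and the two-line layout is an assumption for illustration, not the exact format printed by SpeechBrain.

```python
def align(ref, hyp, empty="<eps>"):
    """Return (ref_token, hyp_token) pairs from a Levenshtein backtrace."""
    n, m = len(ref), len(hyp)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # match or substitution
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        diag_cost = 0 if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] else 1
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + diag_cost:
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            pairs.append((ref[i - 1], empty))  # word missing from hypothesis
            i -= 1
        else:
            pairs.append((empty, hyp[j - 1]))  # extra word in hypothesis
            j -= 1
    pairs.reverse()
    return pairs


def format_alignment(pairs, separator=" ; "):
    """Join the aligned tokens into one reference line and one hypothesis line."""
    ref_line = separator.join(r for r, _ in pairs)
    hyp_line = separator.join(h for _, h in pairs)
    return ref_line, hyp_line


pairs = align(["a", "b", "c"], ["a", "c"])
ref_line, hyp_line = format_alignment(pairs)
print(ref_line)
print(hyp_line)
```

When several alignments share the same minimum distance, this sketch prefers a diagonal step over a deletion, and a deletion over an insertion. Other tie-breaking rules are equally valid and would produce different but equally short alignments.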

Usage Examples

# Basic WER computation
python compute_wer.py ref.txt hyp.txt

# Show top error utterances and alignments
python compute_wer.py ref.txt hyp.txt --print-top-wer --print-alignments

# With speaker-level analysis
python compute_wer.py ref.txt hyp.txt --print-top-wer --utt2spk utt2spk.txt

# Only score hypotheses that are present (skip missing)
python compute_wer.py ref.txt hyp.txt --mode present

Related Pages
