Implementation:Speechbrain Speechbrain Compute WER Tool
| Knowledge Sources | |
|---|---|
| Domains | ASR, Evaluation |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
A concrete command-line tool, provided by the SpeechBrain library, for computing Word Error Rate (WER) and Levenshtein alignments between reference and hypothesis transcriptions.
Description
This command-line script computes Word Error Rate and related metrics from a reference text file and a hypothesis text file. It closely matches the behavior of Kaldi's compute-wer binary. Both inputs are Kaldi-style text files in which each line starts with an utterance ID followed by space-separated tokens.
The script supports three modes for handling missing hypotheses:
- "strict": raises an error
- "all": treats a missing hypothesis as empty
- "present": scores only the hypotheses that were found
Additional features include printing human-readable edit distance alignments between references and hypotheses, identifying the utterances with the highest WER, and (when given a utt2spk mapping) identifying the speakers with the highest WER. The implementation delegates to speechbrain.utils.edit_distance for the Levenshtein computation and to speechbrain.dataio.wer for formatted output.
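The file format and the three missing-hypothesis modes described above can be illustrated with a small standalone sketch. It does not reproduce the SpeechBrain source: the names parse_kaldi_text, edit_distance, and corpus_wer are hypothetical, and the distance is a plain unit-cost word-level Levenshtein written with the standard library only.

```python
from typing import Dict, Iterable, List


def parse_kaldi_text(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each utterance ID (first column) to its list of tokens."""
    table = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        table[parts[0]] = parts[1:]
    return table


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Word-level Levenshtein distance with unit costs (hypothetical helper)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion: reference word missing
                curr[j - 1] + 1,         # insertion: extra hypothesis word
                prev[j - 1] + (r != h),  # match (0) or substitution (1)
            ))
        prev = curr
    return prev[-1]


def corpus_wer(refs, hyps, mode="strict") -> float:
    """Pooled WER in percent; `mode` decides how missing hypotheses count."""
    if mode not in ("strict", "all", "present"):
        raise ValueError(f"unknown mode: {mode}")
    errors = 0
    words = 0
    for utt_id, ref in refs.items():
        if utt_id in hyps:
            hyp = hyps[utt_id]
        elif mode == "strict":
            raise KeyError(f"no hypothesis for {utt_id}")
        elif mode == "present":
            continue  # leave the utterance out of the score
        else:
            hyp = []  # "all": score against an empty hypothesis
        errors += edit_distance(ref, hyp)
        words += len(ref)
    if words == 0:
        raise ValueError("no reference words were scored")
    return 100.0 * errors / words


refs = parse_kaldi_text(["utt1 the cat sat", "utt2 hello world"])
hyps = parse_kaldi_text(["utt1 the cat sat down"])
print(corpus_wer(refs, hyps, mode="present"))
print(corpus_wer(refs, hyps, mode="all"))
```

In this toy input utt2 has no hypothesis, so "present" scores only utt1, "all" counts both words of utt2 as deletions, and "strict" would raise a KeyError.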
Usage
Run the script as a standalone CLI tool after ASR decoding to evaluate transcription quality. It accepts Kaldi-style text files as input.
Code Reference
Source Location
- Repository: SpeechBrain
- File: tools/compute_wer.py
Signature
import argparse
def _plain_text_reader(path):
    """Yields (key, token_list) from a Kaldi-style text file."""
    ...
def _plain_text_keydict(path):
    """Returns dict mapping utterance ID to token list."""
    ...
def _utt2spk_keydict(path):
    """Returns dict mapping utterance ID to speaker ID."""
    ...
class SmartFormatter(argparse.HelpFormatter):
    """Custom formatter for argparse help text."""
    ...
# Main: parses args, computes wer_details_by_utterance, wer_summary,
# optionally prints top WER utterances, speaker-level WER, and alignments
Import
python compute_wer.py ref.txt hyp.txt [options]
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| ref | str | Yes | Path to reference text file (utterance ID in the first column) |
| hyp | str | Yes | Path to hypothesis text file (utterance ID in the first column) |
| --mode | str | No | How to treat missing hypotheses: "present", "all", or "strict" (default: "strict") |
| --print-top-wer | flag | No | Print a list of utterances with the highest WER |
| --print-alignments | flag | No | Print Levenshtein alignments between all refs and hyps |
| --align-separator | str | No | Token separator in alignment output (default: " ; ") |
| --align-empty | str | No | Symbol for empty alignment slots (default: "<eps>") |
| --utt2spk | str | No | Path to utterance-to-speaker mapping file |
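The input table can be read as an argparse definition. The following is a minimal sketch that assumes only the option names and defaults listed in the table. The help strings and the build_parser function are hypothetical, and the actual script may declare its arguments differently (for example, using its SmartFormatter class).

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented inputs."""
    parser = argparse.ArgumentParser(
        description="Compute WER between two Kaldi-style text files."
    )
    parser.add_argument("ref", help="Path to the reference text file")
    parser.add_argument("hyp", help="Path to the hypothesis text file")
    parser.add_argument(
        "--mode",
        choices=["present", "all", "strict"],
        default="strict",
        help="How to treat missing hypotheses",
    )
    parser.add_argument("--print-top-wer", action="store_true")
    parser.add_argument("--print-alignments", action="store_true")
    parser.add_argument("--align-separator", default=" ; ")
    parser.add_argument("--align-empty", default="<eps>")
    parser.add_argument("--utt2spk", default=None)
    return parser


args = build_parser().parse_args(["ref.txt", "hyp.txt", "--mode", "present"])
print(args.mode, args.print_top_wer, repr(args.align_separator))
```

argparse converts the dashes in option names to underscores, so --print-top-wer is read back as args.print_top_wer.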
Outputs
| Name | Type | Description |
|---|---|---|
| WER summary | stdout | Overall WER, number of insertions, deletions, substitutions, and total words |
| Top WER utterances | stdout | Utterances with highest error rates (with --print-top-wer) |
| Top WER speakers | stdout | Speakers with highest error rates (with --utt2spk) |
| Alignments | stdout | Human-readable edit distance alignments (with --print-alignments) |
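To make the alignment output more concrete, here is a hedged sketch of how a Levenshtein alignment could be recovered and rendered with a separator and an empty-slot symbol. The functions align and format_alignment are hypothetical, and the three-row layout is an assumption for illustration. The exact text printed by speechbrain.dataio.wer may differ.

```python
from typing import List, Optional, Tuple

Op = Tuple[str, Optional[str], Optional[str]]  # (operation, ref word, hyp word)


def align(ref: List[str], hyp: List[str]) -> List[Op]:
    """Return one minimum-cost alignment as a list of =, S, D, I operations."""
    n, m = len(ref), len(hyp)
    # dist[i][j] is the edit distance between ref[:i] and hyp[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j] + 1,
                dist[i][j - 1] + 1,
                dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]),
            )
    # Walk back from the bottom-right corner to recover the operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1]):
            ops.append(("=" if ref[i - 1] == hyp[j - 1] else "S", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("D", ref[i - 1], None))  # reference word was dropped
            i -= 1
        else:
            ops.append(("I", None, hyp[j - 1]))  # hypothesis word was added
            j -= 1
    ops.reverse()
    return ops


def format_alignment(ops: List[Op], separator: str = " ; ", empty: str = "<eps>") -> str:
    """Render reference, operation, and hypothesis rows (assumed layout)."""
    ref_row = separator.join(empty if r is None else r for _, r, _ in ops)
    op_row = separator.join(op for op, _, _ in ops)
    hyp_row = separator.join(empty if h is None else h for _, _, h in ops)
    return "\n".join([ref_row, op_row, hyp_row])


print(format_alignment(align("the cat sat".split(), "the cat sat down".split())))
```

The separator and empty parameters play the role of --align-separator and --align-empty: an inserted or deleted word is paired with the empty symbol on the opposite row.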
Usage Examples
# Basic WER computation
python compute_wer.py ref.txt hyp.txt
# Show top error utterances and alignments
python compute_wer.py ref.txt hyp.txt --print-top-wer --print-alignments
# With speaker-level analysis
python compute_wer.py ref.txt hyp.txt --print-top-wer --utt2spk utt2spk.txt
# Only score hypotheses that are present (skip missing)
python compute_wer.py ref.txt hyp.txt --mode present
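The speaker-level analysis enabled by --utt2spk can be thought of as pooling per-utterance error counts by speaker. The sketch below assumes a Kaldi-style utt2spk file with one "utterance-ID speaker-ID" pair per line and assumes that speakers are ranked by pooled errors over pooled reference words. The names parse_utt2spk, wer_by_speaker, and top_speakers are hypothetical, and the ranking used by SpeechBrain itself may differ.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def parse_utt2spk(lines: Iterable[str]) -> Dict[str, str]:
    """Map each utterance ID to its speaker ID."""
    mapping = {}
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            mapping[parts[0]] = parts[1]
    return mapping


def wer_by_speaker(
    utt_stats: Dict[str, Tuple[int, int]], utt2spk: Dict[str, str]
) -> Dict[str, float]:
    """Pool (errors, reference words) per speaker and return WER in percent."""
    totals = defaultdict(lambda: [0, 0])
    for utt_id, (errors, words) in utt_stats.items():
        speaker = utt2spk[utt_id]
        totals[speaker][0] += errors
        totals[speaker][1] += words
    return {spk: 100.0 * e / w for spk, (e, w) in totals.items() if w > 0}


def top_speakers(spk_wer: Dict[str, float], n: int = 5) -> List[Tuple[str, float]]:
    """Return up to n speakers, highest WER first."""
    return sorted(spk_wer.items(), key=lambda item: item[1], reverse=True)[:n]


# Per-utterance (errors, reference words) counts, made up for illustration.
utt_stats = {"spk1-utt1": (1, 4), "spk1-utt2": (1, 4), "spk2-utt1": (0, 5)}
utt2spk = parse_utt2spk(["spk1-utt1 spk1", "spk1-utt2 spk1", "spk2-utt1 spk2"])
print(top_speakers(wer_by_speaker(utt_stats, utt2spk)))
```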