Implementation:Speechbrain Speechbrain G2P Tool
| Knowledge Sources | |
|---|---|
| Domains | Text_Processing, G2P |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Concrete tool for grapheme-to-phoneme conversion using a pretrained G2P model provided by the SpeechBrain library.
Description
This is a command-line convenience script that transcribes text into phoneme sequences using a pretrained Grapheme-to-Phoneme (G2P) model loaded via speechbrain.inference.text.GraphemeToPhoneme. It supports three operating modes:
- Interactive shell mode, for exploratory analysis where users enter text line by line.
- Single-text mode, for transcribing a single example from the command line.
- File mode, for batch-processing text files of arbitrary size, with results written to an output file. Inputs are processed in configurable batch sizes with progress tracking.
Phoneme output uses a space-separated format with <spc> tokens for word boundaries. The G2P models can be trained using the recipes located in recipes/LibriSpeech/G2P.
Usage
Use it as a CLI tool to convert orthographic text to phoneme sequences. It is particularly useful for preprocessing text datasets for TTS (Text-to-Speech) training, where phoneme inputs are preferred over raw graphemes.
Code Reference
Source Location
- Repository: SpeechBrain
- File: tools/g2p.py
Signature
def transcribe_text(g2p, text):
    """Transcribes a single line of text and outputs it."""
    ...

def transcribe_file(g2p, text_file_name, output_file_name=None, batch_size=64):
    """Transcribes a file with one example per line."""
    ...

def get_line_count(text_file_name):
    """Counts the lines in a file without loading it into memory."""
    ...

def transcribe_stream(g2p, text_file, output_file, batch_size=64, total=None):
    """Transcribes a file stream in batches."""
    ...

def chunked(iterable, batch_size):
    """Break iterable into lists of length batch_size."""
    ...
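The bodies of these functions are elided above. As a rough illustration of how the batching helpers could fit together, the following is a minimal sketch and not the script's actual code. It assumes that the model object can be called with a list of strings and returns one list of phoneme tokens per input; `fake_g2p` is a hypothetical stand-in so the sketch runs without a pretrained model, and the progress tracking implied by the `total` parameter is omitted.

```python
import io
import itertools


def chunked(iterable, batch_size):
    """Yield successive lists of at most batch_size items."""
    iterator = iter(iterable)
    chunk = list(itertools.islice(iterator, batch_size))
    while chunk:
        yield chunk
        chunk = list(itertools.islice(iterator, batch_size))


def get_line_count(text_file_name):
    """Count lines by streaming the file instead of reading it whole."""
    with open(text_file_name, encoding="utf-8") as text_file:
        return sum(1 for _ in text_file)


def transcribe_stream(g2p, text_file, output_file, batch_size=64):
    """Transcribe a stream in batches, writing one result per line.

    Assumption: g2p(list_of_str) returns a list of phoneme-token lists.
    """
    for batch in chunked(text_file, batch_size):
        lines = [line.strip() for line in batch]
        for phonemes in g2p(lines):
            print(" ".join(phonemes), file=output_file)


def fake_g2p(lines):
    """Hypothetical stand-in model: one uppercase 'phoneme' per character."""
    return [
        ["<spc>" if ch == " " else ch.upper() for ch in line]
        for line in lines
    ]


if __name__ == "__main__":
    out = io.StringIO()
    transcribe_stream(fake_g2p, io.StringIO("ab\nc d\n"), out, batch_size=1)
    print(out.getvalue(), end="")
```

Because `chunked` pulls from an iterator with `itertools.islice`, only one batch of lines is held in memory at a time, which is what makes processing files of arbitrary size practical.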
Import
# Interactive mode
python g2p.py --model /path/to/model --interactive
# Single text
python g2p.py --model /path/to/model --text "Hello world"
# File mode
python g2p.py --model /path/to/model --text-file input.txt --output-file phonemes.txt
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| --model | str | Yes | Path to the pretrained G2P model directory |
| --hparams | str | No | Name of the hyperparameter file (default: hyperparams.yaml) |
| --text | str | No | Single text string to transcribe |
| --text-file | str | No | Path to a text file with one sample per line |
| --output-file | str | No | Path for output file (stdout if omitted) |
| -i, --interactive | flag | No | Launch an interactive transcription shell |
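The options in this table could be declared with argparse roughly as follows. This is a sketch under assumptions: the help strings, the `build_parser` and `select_mode` helper names, and the precedence order between modes are illustrative and may differ from what tools/g2p.py actually does.

```python
import argparse


def build_parser():
    """Declare the options listed in the Inputs table."""
    parser = argparse.ArgumentParser(
        description="Command-line Grapheme-to-Phoneme conversion tool"
    )
    parser.add_argument("--model", required=True,
                        help="Path to the pretrained G2P model directory")
    parser.add_argument("--hparams", default="hyperparams.yaml",
                        help="Name of the hyperparameter file")
    parser.add_argument("--text", help="Single text string to transcribe")
    parser.add_argument("--text-file",
                        help="Text file with one sample per line")
    parser.add_argument("--output-file",
                        help="Output file (stdout if omitted)")
    parser.add_argument("-i", "--interactive", action="store_true",
                        help="Launch an interactive transcription shell")
    return parser


def select_mode(args):
    """Pick an operating mode; this precedence order is an assumption."""
    if args.interactive:
        return "interactive"
    if args.text:
        return "text"
    if args.text_file:
        return "file"
    return "none"


if __name__ == "__main__":
    demo = build_parser().parse_args(["--model", "m", "--text", "Hello world"])
    print(select_mode(demo), demo.hparams)
```

Note that argparse converts `--text-file` and `--output-file` into the attributes `text_file` and `output_file` on the parsed namespace.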
Outputs
| Name | Type | Description |
|---|---|---|
| phonemes | str | Space-separated phoneme sequence with <spc> as word boundary |
| output_file | file | File containing phoneme transcriptions, one per line |
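Downstream code, such as a TTS data loader, often needs the phonemes grouped by word. The sketch below shows one way to split an output line on the <spc> boundary token; `split_words` is a hypothetical helper for illustration and is not part of the script.

```python
def split_words(phoneme_line, boundary="<spc>"):
    """Group a space-separated phoneme line into per-word token lists."""
    words, current = [], []
    for token in phoneme_line.split():
        if token == boundary:
            # Close the current word; skip empty groups from repeated boundaries.
            if current:
                words.append(current)
            current = []
        else:
            current.append(token)
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    print(split_words("DH AH <spc> K AE T"))
```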
Usage Examples
# Start an interactive G2P shell
python g2p.py --model /path/to/g2p_model --interactive
# > Enter text: The cat sat on the mat
# DH AH <spc> K AE T <spc> S AE T <spc> AA N <spc> DH AH <spc> M AE T
# Transcribe a single sentence
python g2p.py --model /path/to/g2p_model --text "This is a line of text"
# Batch transcribe a file for TTS preprocessing
python g2p.py --model /path/to/g2p_model \
--text-file sentences.txt \
--output-file phonemes.txt